Script Summary: Stupidity filtering for YouTube comments using SpamBayes and Greasemonkey
I threw this together one Tuesday morning (11/13/07) using SpamBayes 1.0.4 for Windows and some training data from 3 years ago (my Eudora/SpamBayes days). I've since revised it to be easier to use and train. If you want something more sophisticated, you should know that these folks are working on something.
This was tested using Firefox 220.127.116.11 on a machine running the Windows version of SpamBayes 1.0.4.
1) Install the Windows client for SpamBayes. Choose the server/proxy client option when the installer asks. This script will not work with the Outlook plugin.
2) Install this userscript.
3) Browse comments on YouTube.com.
4) Train the filter using "Train as Ham" and "Train as Spam" links embedded in each comment. Spammy posts will only have "Train as Ham" links and vice versa. This is by design!
5) Continue training as necessary.
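The link behavior described in step 4 can be sketched roughly as follows. This is only an illustration, not the script's actual code: the helper name `trainLinksFor` and the three classification labels are hypothetical, and showing both links for an "unsure" comment is an assumption, since the description only covers the spam and ham cases.

```javascript
// Hypothetical helper (not taken from the userscript): pick which
// training links to attach to a comment based on how it was classified.
// `classification` is assumed to be "spam", "ham", or "unsure".
function trainLinksFor(classification) {
  switch (classification) {
    case "spam":
      // Already considered spam, so the only useful correction is "ham".
      return ["Train as Ham"];
    case "ham":
      // Already considered ham, so the only useful correction is "spam".
      return ["Train as Spam"];
    default:
      // Assumption: an unsure comment offers both choices.
      return ["Train as Ham", "Train as Spam"];
  }
}
```

Offering only the opposite label keeps each click a correction, which is why a spammy post never shows "Train as Spam".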
I have personally found the best method is to start clean, as opposed to importing ham/spam datasets from your email using the WebUI. Version 0.2 onward can train directly from the YouTube page (instead of having to use the WebUI), so you should be able to ramp up a useful database quickly. It is also best to keep the number of hams balanced with the number of spams. You can track your status in the SpamBayes WebUI (double-click the tray icon it installs).
version 0.2 (2007-11-14)
- filters all comments (not just the first page)
- ability to train filter directly from the YouTube comment
- color coding based on score (red/yellow/green, dictated by spamLimit and hamLimit)
- shows "Train as Ham" for known spam
- shows "Train as Spam" for known non-spam
- code cleanup
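As a rough illustration of the score-based color coding listed above, here is a minimal sketch. The names spamLimit and hamLimit come from the changelog, but the threshold values, the function name `colorForScore`, the 0-to-1 score range, and the mapping of red to spam and green to ham are all assumptions made for the example.

```javascript
// Assumed thresholds for illustration; the script's real values are not
// stated in this description.
const hamLimit = 0.2;
const spamLimit = 0.9;

// Hypothetical helper: map a spam score (assumed to lie in 0..1)
// to one of the three colors mentioned in the changelog.
function colorForScore(score, limits = { hamLimit, spamLimit }) {
  if (score >= limits.spamLimit) return "red";   // assumed: likely spam
  if (score <= limits.hamLimit) return "green";  // assumed: likely ham
  return "yellow";                               // in between: unsure
}
```

With these assumed limits, `colorForScore(0.95)` gives "red", `colorForScore(0.05)` gives "green", and anything in between gives "yellow".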
version 0.1 (2007-11-13)
- proof of concept
- Bug fixes
- Extend script to work on other sites