
SpamBayes for Youtube (DIY Stupidity Filter)

Stupidity filtering for YouTube comments using SpamBayes and Greasemonkey

About:
I threw this together one Tuesday morning (11/13/07) using SpamBayes 1.0.4 for Windows and some training data from 3 years ago (my Eudora/SpamBayes days). I've since revised it to be easier to use and train. If you want something more sophisticated, you should know that these folks are working on something.

youtubayes_0_20

This was tested using Firefox 2.0.0.9 on a machine running the Windows version of SpamBayes 1.0.4.

Simplified Instructions:
1) Install the Windows client for SpamBayes. Choose the server/proxy client option when the installer asks. This script will not work with the Outlook plugin.
2) Install this userscript
3) Browse comments on Youtube.com
4) Train the filter using "Train as Ham" and "Train as Spam" links embedded in each comment. Spammy posts will only have "Train as Ham" links and vice versa. This is by design!
5) Continue training as necessary.
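
As a rough illustration of step 4, the choice of training links could be sketched like this. The function name trainLinksFor, the 0-to-1 score scale, the default limit values, and the behavior for in-between scores are all assumptions made for the example, not taken from the script itself.

```javascript
// Hypothetical helper: decide which training links to show for one comment.
// Assumes score is a spam probability between 0 and 1.
// The default limits below are assumed values, not the script's own.
function trainLinksFor(score, hamLimit = 0.2, spamLimit = 0.9) {
  if (score >= spamLimit) {
    // Classified as spam: only offer the correction toward ham.
    return ["Train as Ham"];
  }
  if (score <= hamLimit) {
    // Classified as ham: only offer the correction toward spam.
    return ["Train as Spam"];
  }
  // Assumption: comments the filter is unsure about get both links.
  return ["Train as Ham", "Train as Spam"];
}

console.log(trainLinksFor(0.97)); // [ 'Train as Ham' ]
console.log(trainLinksFor(0.03)); // [ 'Train as Spam' ]
```

Showing only the opposite label keeps each click a correction, which matches the "by design" note in step 4.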

Notes:
I have personally found that the best method is to start clean, as opposed to importing ham/spam datasets from your email using the WebUI. Version 0.2 onward can train directly from the YouTube page (instead of having to use the WebUI), so you should be able to ramp up a useful database quickly. It is also best to keep the number of hams balanced with the number of spams. You can track your status in the SpamBayes WebUI (double-click the tray icon it installs).

Changelog:
version 0.2 (2007-11-14)
- filters all comments (not just the first page)
- ability to train filter directly from the YouTube comment
- color coding based on score (red/yellow/green, dictated by spamLimit and hamLimit)
- shows "Train as Ham" for known spam
- shows "Train as Spam" for known non-spam
- code cleanup
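
The color coding in 0.2 can be pictured with a minimal sketch like the one below. The names spamLimit and hamLimit come from the changelog entry above. The function name colorForScore, the 0-to-1 score scale, and the default values are assumptions for illustration only.

```javascript
// Hypothetical sketch of mapping a score to a color.
// Assumes score is a spam probability between 0 and 1.
// The default limits are assumed values, not the script's own settings.
function colorForScore(score, hamLimit = 0.2, spamLimit = 0.9) {
  if (score >= spamLimit) {
    return "red"; // at or above spamLimit: treated as spam
  }
  if (score <= hamLimit) {
    return "green"; // at or below hamLimit: treated as ham
  }
  return "yellow"; // in between: unsure
}

console.log(colorForScore(0.95)); // red
console.log(colorForScore(0.5));  // yellow
console.log(colorForScore(0.05)); // green
```

Raising hamLimit or lowering spamLimit narrows the yellow band, so fewer comments are left as unsure.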

version 0.1 (2007-11-13)
- proof of concept

Future Plans:
- Bug fixes
- Extend script to work on other sites




Nov 14, 2007
Gary Calpo Script's author

Very true. As mentioned, this was originally a simple proof of concept. A self-challenge of sorts. I have made it more user-friendly in the next revision. Thanks for your input!

 
Nov 13, 2007
akkartik User

Neat idea! But shouldn't you be making the higher scores *less* salient, not more?
