Checksum (MD5) hashes
Background: A cryptographic hash sum (checksum) is a small, fixed-size fingerprint of a piece of data, which itself can be anything from empty to extremely large. It is often used to verify the integrity of downloads, especially large downloads or downloads assembled from multiple sources (e.g. BitTorrent uses checksums on each of its pieces to ensure it reassembles its files correctly, software updaters use checksums as proof against corruption, and PGP uses checksums to create cryptographic signatures).
Like fingerprints, checksums are one-way: you can generate a checksum from a file or string and compare it to one given by a trusted source, but you can't regenerate the data from the checksum. This is the key to the security behind much of its use, like storing passwords as checksums so that intruders and even administrators can't see the users' actual passwords (yes, this is how most login systems store passwords).
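The comparison described above can be sketched in a few lines. This is a minimal illustration rather than anything from the scripts discussed here: it assumes a Node.js environment (for its built-in crypto module), and the helper names md5Hex and matchesChecksum are made up for the example.

```javascript
// Assumption: running under Node.js, which ships an MD5 implementation in "crypto".
const crypto = require("crypto");

// Hypothetical helper: MD5 fingerprint of a string as 32 hex characters.
function md5Hex(data) {
    return crypto.createHash("md5").update(data, "utf8").digest("hex");
}

// Hypothetical helper: does the data we received match the checksum
// published by a trusted source?
function matchesChecksum(data, trustedHex) {
    return md5Hex(data) === trustedHex.toLowerCase();
}

const received = "hello world";
const published = "5eb63bbbe01eeed093cb22bb8f5acdc3"; // MD5 of "hello world"

console.log(matchesChecksum(received, published));       // true
console.log(matchesChecksum(received + "!", published)); // false
```

Note that the check only ever goes in one direction: the data is hashed and the two fingerprints are compared. Nothing in the code can turn the published value back into the original string.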
With respect to security, a checksum algorithm is only as secure as its collisions (different inputs that generate the same checksum) are rare. The simplest checksum still in active use (with respect to security) is MD5, which has 2^128 possible outputs. Thanks to cryptanalysis, though, its collision attack complexity is only 2^32, which means that you could create two random sets of data with the same MD5 sum in a few hours on a good computer. It would still take a large cluster of computers over 50 years to generate a random piece of data that matches a real-world MD5 sum.
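To put those sizes in perspective, here is a small arithmetic sketch using JavaScript BigInt (an assumption: it needs a reasonably modern engine). The 2^64 line is the generic "birthday" bound that applies to any 128-bit hash; the 2^32 figure is the attack cost quoted in the text, not something this snippet derives.

```javascript
// Number of distinct values a 128-bit checksum such as MD5 can take.
const outputs = 2n ** 128n;

// Generic birthday bound: about the square root of the output space,
// i.e. how many random inputs you would expect to try before two collide.
const birthdayBound = 2n ** 64n;

// Collision attack cost quoted in the text for MD5 specifically.
const quotedAttack = 2n ** 32n;

console.log(outputs.toString());                        // 340282366920938463463374607431768211456
console.log(birthdayBound * birthdayBound === outputs); // true
console.log(quotedAttack.toString());                   // 4294967296
```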
With respect to the security of data stored with MD5 sums in Firefox, the complexity is large enough to deter attackers, and (IIRC) reflects the same level of complexity that protects your locally stored passwords. Your primary security comes from the fact that this data should not be accessible from the rest of the world, while the secondary protection is the cryptographic complexity.
This means that it should be safe to use MD5 to store hashes relating to users' private data in a script like my Generic Autologin (w/ Password Manager) BETA.
----
That said, I chose MD5 for its simplicity. Now comes the question of how to implement it in a Greasemonkey UserScript. Since this is not a native call within JavaScript, we need to use somebody else's code.
Looking around online, I found three md5sum scripts written in JavaScript:
| Name* | Version | Date | Author | License | Lines | Size | Speed** |
|---|---|---|---|---|---|---|---|
| Frez | 1.0 | Feb 2001 | Phil Fresle + Enrico Mosanghini | ~free(?) to use as library | **155 lines** | 6.2kB | 10,789ms |
| Paj | 2.1 | 1999-2002 | Paul Johnston | BSD | 256 lines | 8.7kB | 8,452ms |
| Izumo (→English) | 2.0.0 | May 13 2002 | Masanao Izumo | BSD-style | 199 lines | **5.9kB** | **5,865ms** |
** Speeds were measured with Measure Function Execution Times by JoeSimmons, with each MD5 library included via @require. (@require doesn't actually work on its own: you have to add a <Require filename="md5.js" /> to gm_scripts/config.xml and then restart Firefox.) The tests ran on that script's page with the following code. System specs*** don't matter, since all values are relative:
```javascript
var loops = 1000; // Don't go too high.

// Sample URLs to hash, in addition to the current page's address.
var sprintmap = "http://coverage.sprint.com/action/WebImageStream4?covType=sprint&mapcenterx=-12.3456789012345&mapcentery=12.34567890123456&geocenterx=-12.3&geocentery=12.3&endlinex=&endliney=&scale=250.0&width=420&height=315&showPinpoint=F&signalStrength=F&antiAlias=T&layers=TFFFTFTFFFTFFTFFTTTFFFFTFF";
var cdw = "http://www.cdw.com/shop/search/Results.aspx?filderedsortorder=priceasc&x=0&platform=all&y=0&FilteredSortOrder=PRICEASC&key=abcdefg";
var yah = "http://yp.yahoo.com/py/ypResults.py?stx=%s&stp=a&tab=B2C&addr=123+Fake+St&city=Springfield&state=UH&zip=00000-0000&uzip=00000&country=us&msa=1234&slt=12.345678&sln=-12.345678&cs=9&Submit=Search&fudge=1234";

// Frez library: MD5()
function frez() {
    MD5(location.href);
    MD5(sprintmap);
    MD5(cdw);
    MD5(yah);
    //MD5(document.body.innerHTML+document.body.innerHTML);
}

// Paj library: hex_md5()
function paj() {
    hex_md5(location.href);
    hex_md5(sprintmap);
    hex_md5(cdw);
    hex_md5(yah);
    //hex_md5(document.body.innerHTML+document.body.innerHTML);
}

// Izumo library: MD5_hexhash()
function izumo() {
    MD5_hexhash(location.href);
    MD5_hexhash(sprintmap);
    MD5_hexhash(cdw);
    MD5_hexhash(yah);
    //MD5_hexhash(document.body.innerHTML+document.body.innerHTML);
}

// DEFINE FUNCTION 1 HERE (alternated between frez and paj)
function function1() { paj(); }

// DEFINE FUNCTION 2 HERE
function function2() { izumo(); }
```
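The timing itself was done by the Measure Function Execution Times script, whose code is not reproduced here. As a rough idea of what such a comparison involves, here is a minimal sketch; timeLoops and compare are hypothetical helpers written for this illustration (not that script's API), and the two stand-in workloads merely take the place of function1 and function2.

```javascript
// Hypothetical helper: call fn `loops` times and return the elapsed milliseconds.
function timeLoops(fn, loops) {
    const start = Date.now();
    for (let i = 0; i < loops; i++) {
        fn();
    }
    return Date.now() - start;
}

// Hypothetical helper: time both functions and report how many times
// faster the second one is (NaN if the second run was too quick to measure).
function compare(fnA, fnB, loops) {
    const msA = timeLoops(fnA, loops);
    const msB = timeLoops(fnB, loops);
    return { msA: msA, msB: msB, ratio: msB > 0 ? msA / msB : NaN };
}

// Stand-in workloads; in the real test these would be function1 and function2.
function slowStandIn() {
    let sum = 0;
    for (let i = 0; i < 20000; i++) { sum += i % 7; }
    return sum;
}
function fastStandIn() {
    let sum = 0;
    for (let i = 0; i < 10000; i++) { sum += i % 7; }
    return sum;
}

const result = compare(slowStandIn, fastStandIn, 1000);
console.log(result.msA + "ms vs " + result.msB + "ms");
```

Date.now() only has millisecond resolution, which is one reason to run many loops and repeat the trial several times before trusting the ratio.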
I initially had the commented-out body-hash call in there, but it took too long to evaluate, so I ran it a few times with loops=1 and found the relative times to be the same as with the shorter strings. Without that last call, I ran four 1000-loop trials each of Paj vs Izumo and Frez vs Izumo. No function varied by more than half a second from one extreme to the other, and the reported difference multiplier varied by about 0.2 between extremes. The averages (sans body hash) are posted above. Since I was always comparing against Izumo, the time for Izumo is an average over its 8 trials.
Results: Izumo takes the speed crown at 1.44 times faster than Paj and 1.84 times faster than Frez, while second-place Paj is 1.27 times faster than Frez. Happily, Izumo is also largely unencumbered by its license (keep the copyright notice intact and you should be fine, though the license isn't terribly clear on that) and is the smallest by byte size. (Winners on each metric are marked in bold on the above table.)
Conclusion: You can use the Izumo "library" in your user script by adding this line to your metadata: @require http://www.onicos.com/staff/iz/amuse/javascript/expert/md5.txt
Actually getting it to work, however, requires either (a) downloading the file to your script's directory, editing config.xml by hand, and restarting Firefox, or (b) installing the script on top of itself. A pain, but it works.
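For reference, a user script using that line might look roughly like the sketch below. The @name, @namespace and @include values and the hashUrl wrapper are placeholders invented for this example; only the @require URL and the MD5_hexhash function name come from the text above.

```javascript
// Placeholder metadata: only the @require line is taken from this guide.
// ==UserScript==
// @name        MD5 Require Demo
// @namespace   http://example.com/md5-demo
// @include     *
// @require     http://www.onicos.com/staff/iz/amuse/javascript/expert/md5.txt
// ==/UserScript==

// Hypothetical wrapper: MD5_hexhash only exists once the @require'd file has
// really been loaded, so check for it instead of assuming it is there.
function hashUrl(url) {
    if (typeof MD5_hexhash !== "function") {
        return null; // library missing: see the install workaround above
    }
    return MD5_hexhash(url);
}

console.log(hashUrl("http://example.com/"));
```

If this logs null inside the browser, the library was not picked up, which is the symptom the workaround above addresses.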
*** System specs: Athlon X2 dual-core 4200+ (2.2GHz) with 2GB memory and a LOT of concurrently running apps with little activity (load average otherwise around 0.00) on Debian Squeeze ("Testing") with Firefox (Iceweasel) 3.0.14-1 with 35 other open tabs, with a very stale system half-recovered from some nasty network business a few months ago and an uptime of over half a year.
Scripts mentioned in guide
- Generic Autologin (w/ Password Manager) BETA by khopesh
- Measure Function Execution Times by JoeSimmons
