Checksum algorithms
You can verify the integrity of data in many ways, however one of the most used methods of doing this is via checksums. The Adler32 algorithm in zlib, CRC32 in RAR, bsd(16) on unix machines etc etc. All these algorithms are fast formulas that process huge chunks of data to sum up all the bytes (contents) of the data one way or another. The resulting value is a number which later can be used to check the integrity of the data.
So yeah, Checksums
As per usual, I shall cite Wikipedia here:
A checksum or hash sum is a small-size datum from an arbitrary block of digital data for the purpose of detecting errors which may have been introduced during its transmission or storage. It is usually applied to an installation file after it is received from the download server. By themselves checksums are often used to verify data integrity, but should not be relied upon to also verify data authenticity.
Hash vs Checksum
I would like to make clear that I am discussing checksums here, 8-bit to 32-bit numeric values that represent the integrity of data in order to detect erroneous data. In my book, a hash is often also a checksum, but more than often it is not. Hashes are used to create absolute unique fingerprints of data often used for authentication. (Message Digest, Secure Hashing Algorithm or Tiger)
Checksum calculation in JavaScript
It can be quite difficult to do integer bitwise math in a language where numbers are double floating point numbers by default. Your code has to make sure that a number stays an integer, perhaps even stays unsigned and within a specific bit-range (16-bit, 32-bit). These extra steps might complicate things.
Tricks
A few tricks to ensure that a number is an x-bit integer, is by using the AND operator with a bitmask that allows all bits for that range. e.g. ensuring a 16-bit number could be done by number &= 0xffff;
. Furthermore I use operations such as num | 0
or num >>> 0
to ensure it is a 32-bit integer, signed or unsigned. This is required to prevent negative results to be generated, which is especially weird looking when you display the checksum in hexadecimal.
Speed
Obviously I did not try this to make a quick checksum engine, I merely wanted to test the possibility of bitwise calculation more after my SmallPRNG and ISAAC CPRNG pens (post pending ;)). The algorithms will perform differently in all browsers and browser versions and might even be extremely slow in some. I do believe Chrome will handle data very quickly and checksums can be calculated with a reasonable amount of speed!
Base class
I'm going to implement multiple checksum algorithms, so I will create a base class which will construct a state (ctx) for the chosen algorithm. This state will keep track of the checksum and an instance of the algorithm class. This class will also handle strings properly, taking their encoding into account.
This class also contains a test for Uint8Array support. I'm not sure if this is the best way to test for support, but it did the trick.
var hasTyped = (function() {
if(!('Uint8Array' in window)) {
return false;
}
try {
var a = new window.Uint8Array(10);
a[0] = 100;
if(a[0] === 100) {
return true;
}
return false;
} catch (e) {
return false;
}
}());
Check out the source of the base.
Algorithm classes
Now we can just add algorithms with Checksum.registerChecksum. Each Algorithm class should implement a single and an array method for processing data and optionally a setup method which will be called when the Checksum object is constructed.
BSD16
Here's a very simple one, only a little bit of code is required for this algorithm. The BSD Checksum algorithm!
;(function() {
'use strict';
if(typeof(window.Checksum) === "undefined") {
throw "The Checksum class is required for this code.";
}
/**
* Initialize anything you want in the constructor, such as setting up a lookup
* table of bytes. Make sure to cache your calculations, so they only have to be
* done once.
*/
var BSD16 = function BSD16() {
this.init = 0;
};
/**
* bsd16 for arrays, each value must be numeric and will be bound to 8-bits (Int8Array or Uint8Array works best!)
* @param {Array} a input (8-bit array)
* @param {Number} p previous checksum state
* @returns {Number} new checksum state
*/
BSD16.prototype.array = function(a, p) {
var c = p || 0, i = 0, l = a.length;
for(; i < l; i++) c = (((((c >>> 1) + ((c & 1) << 15)) | 0) + (a[i] & 0xff)) & 0xffff) | 0;
return c;
};
/**
* bsd16 for a single value, update a checksum with a new byte
* @param {Number} b byte (0-255)
* @param {Number} p previous checksum state
* @returns {Number} new checksum state
*/
BSD16.prototype.single = function(b, p) {
var c = p || 0;
return (((((c >>> 1) + ((c & 1) << 15)) | 0) + (b & 0xff)) & 0xffff) | 0;
};
Checksum.registerChecksum("bsd16", BSD16);
}());
FNV32 (FNV-0 and FNV-1)
Another easy one, the FNV Hash algorithm - which generates 32-bit checksums!
;(function() {
'use strict';
if(typeof(window.Checksum) === "undefined") {
throw "The Checksum class is required for this code.";
}
var prime = 0x01000193;
/**
* Initialize anything you want in the constructor, such as setting up a lookup
* table of bytes. Make sure to cache your calculations, so they only have to be
* done once.
*/
var FNV32 = function FNV32() {
this.init = 2166136261; // fnv-1!
};
/**
* The setup method which will be called when new Checksum("fletcher", ...) is called.
* This method is supposed to initialize the checksum cipher and to recieve parameters
* from the constructor.
*
* @param {Number} mode the FNV32 mode (FNV-1 (defualt) or FNV-0)
*/
FNV32.prototype.setup = function(mode) {
if(mode === 0) {
this.init = 0; // fnv-0.
}
};
FNV32.prototype.array = function(a, p) {
var len = a.length,
fnv = p || this.init;
for(var i = 0; i < len; i++) {
fnv = (fnv + (((fnv << 1) + (fnv << 4) + (fnv << 7) + (fnv << 8) + (fnv << 24)) >>> 0)) ^ (a[i] & 0xff);
}
return fnv >>> 0;
};
FNV32.prototype.single = function(b, p) {
var fnv = p || this.init;
return ((fnv + (((fnv << 1) + (fnv << 4) + (fnv << 7) + (fnv << 8) + (fnv << 24)) >>> 0)) ^ (b & 0xff)) >>> 0;
};
Checksum.registerChecksum("fnv32", FNV32);
}());
Implementation
Now we've registered a few checksum algorithms to the Checksum class, we can initialize a CTX and start processing data. This is fairly easy, and only requires a few lines of code.
var ctx = new Checksum("fnv32", 0); // fnv32-0
ctx.updateStringly("Hello World!!");
alert(ctx.result.toString(16));
Or in a single line of ugly code:
alert(new Checksum("fnv32", 0).updateStringly("Hello World!!").result.toString(16));
Example implementation
Here's a very simple example implementation that calculates a checksum of the text you type, as you type it. I added a few more algorithms to Checksum in this example. You can download the full source code and check out all scripts yourself.
Check out this pen.
That's it again - thanks for reading and I hope it was interesting and useful!