dictFingerprint() fingerprinting made more robust.
The previous fingerprint used the trivial algorithm of XORing the
integers together. This is weak because different hash table states can
easily produce the same value: XOR does not depend on which field a
value comes from, so for instance a hash table at the start of the
rehashing process and the same table at the end will have the same
fingerprint.
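
The collision described above can be reproduced with a small sketch. The `mock_ht` struct and the sample values are hypothetical stand-ins for the three fields read from each table; they are not the real definitions from src/dict.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three per-table fields the fingerprint
 * reads; this is not the real hash table struct from dict.c. */
typedef struct {
    uint64_t table; /* table pointer, stored here as a plain integer */
    uint64_t size;
    uint64_t used;
} mock_ht;

/* The old scheme: XOR all six values together. */
uint64_t xor_fingerprint(const mock_ht ht[2]) {
    return ht[0].table ^ ht[0].size ^ ht[0].used ^
           ht[1].table ^ ht[1].size ^ ht[1].used;
}

/* Two clearly different states that XOR cannot tell apart. */
void demo_xor_collision(void) {
    /* Rehash just started: all 3 entries still live in ht[0]. */
    mock_ht start[2] = { {0x1000, 4, 3}, {0x2000, 8, 0} };
    /* Rehash almost finished: all 3 entries have moved to ht[1]. */
    mock_ht end[2]   = { {0x1000, 4, 0}, {0x2000, 8, 3} };

    /* The "used" counts swapped places, but XOR is commutative,
     * so both states yield the same fingerprint. */
    assert(xor_fingerprint(start) == xor_fingerprint(end));
}
```

Calling `demo_xor_collision()` from a `main()` passes silently, which is exactly the problem: the two states differ, yet the fingerprints agree.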

Now we hash the N integers in a smarter way: each integer is added to
the previous hash, and the sum is passed through an integer hash
function again (see the code for further details). This makes a
collision far less likely. Moreover, this scheme explicitly guards
against the same set of integers in a different order hashing to the
same number.
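
A minimal sketch of the chained scheme, for comparison with plain XOR. It is written on `uint64_t` rather than `long long` so that wrap-around is well defined, which is an assumption of this sketch and means the values will not necessarily match what dictFingerprint() returns. The names `mix64`, `chained_fingerprint`, `xor_all` and `demo_order_sensitivity` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One round of the 64 bit integer hash from the diff, on unsigned
 * integers (an assumption of this sketch, not the committed code). */
uint64_t mix64(uint64_t h) {
    h = (~h) + (h << 21);
    h = h ^ (h >> 24);
    h = (h + (h << 3)) + (h << 8);  /* h * 265 */
    h = h ^ (h >> 14);
    h = (h + (h << 2)) + (h << 4);  /* h * 21 */
    h = h ^ (h >> 28);
    h = h + (h << 31);
    return h;
}

/* Result = hash(hash(hash(v[0]) + v[1]) + v[2]) ... */
uint64_t chained_fingerprint(const uint64_t *v, size_t n) {
    uint64_t h = 0;
    for (size_t i = 0; i < n; i++) {
        h += v[i];
        h = mix64(h);
    }
    return h;
}

/* The old scheme, kept only for comparison. */
uint64_t xor_all(const uint64_t *v, size_t n) {
    uint64_t h = 0;
    for (size_t i = 0; i < n; i++) h ^= v[i];
    return h;
}

/* Same six integers, two of them swapped. */
void demo_order_sensitivity(void) {
    uint64_t a[6] = {0x1000, 4, 3, 0x2000, 8, 0};
    uint64_t b[6] = {0x1000, 4, 0, 0x2000, 8, 3};

    assert(xor_all(a, 6) == xor_all(b, 6));  /* XOR cannot see the swap */
    assert(chained_fingerprint(a, 6) != chained_fingerprint(b, 6));
}
```

Because every value passes through the mixing step before the next one is added, the position of each integer influences the result, which is what the "(likely)" in the commit's code comment refers to.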

This commit is related to issue redis#1240.
antirez committed Aug 19, 2013
1 parent 3039e80 commit 905d482
Showing 1 changed file with 29 additions and 9 deletions.
38 changes: 29 additions & 9 deletions src/dict.c
@@ -512,15 +512,35 @@ void *dictFetchValue(dict *d, const void *key) {
  * If the two fingerprints are different it means that the user of the iterator
  * performed forbidden operations against the dictionary while iterating. */
 long long dictFingerprint(dict *d) {
-    long long fingerprint = 0;
-
-    fingerprint ^= (long long) d->ht[0].table;
-    fingerprint ^= (long long) d->ht[0].size;
-    fingerprint ^= (long long) d->ht[0].used;
-    fingerprint ^= (long long) d->ht[1].table;
-    fingerprint ^= (long long) d->ht[1].size;
-    fingerprint ^= (long long) d->ht[1].used;
-    return fingerprint;
+    long long integers[6], hash = 0;
+    int j;
+
+    integers[0] = (long long) d->ht[0].table;
+    integers[1] = d->ht[0].size;
+    integers[2] = d->ht[0].used;
+    integers[3] = (long long) d->ht[1].table;
+    integers[4] = d->ht[1].size;
+    integers[5] = d->ht[1].used;
+
+    /* We hash N integers by summing every successive integer with the integer
+     * hashing of the previous sum. Basically:
+     *
+     * Result = hash(hash(hash(int1)+int2)+int3) ...
+     *
+     * This way the same set of integers in a different order will (likely) hash
+     * to a different number. */
+    for (j = 0; j < 6; j++) {
+        hash += integers[j];
+        /* For the hashing step we use Tomas Wang's 64 bit integer hash. */
+        hash = (~hash) + (hash << 21); // hash = (hash << 21) - hash - 1;
+        hash = hash ^ (hash >> 24);
+        hash = (hash + (hash << 3)) + (hash << 8); // hash * 265
+        hash = hash ^ (hash >> 14);
+        hash = (hash + (hash << 2)) + (hash << 4); // hash * 21
+        hash = hash ^ (hash >> 28);
+        hash = hash + (hash << 31);
+    }
+    return hash;
 }

 dictIterator *dictGetIterator(dict *d)
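
The comment above dictFingerprint() says that two differing fingerprints mean forbidden operations happened during iteration. A rough sketch of that check follows, using entirely hypothetical types and names (`mini_dict`, `mini_iter`, `mini_fingerprint` and friends); it does not reproduce the actual iterator code in src/dict.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature dictionary: only the state the fingerprint covers. */
typedef struct {
    uint64_t size;
    uint64_t used;
} mini_dict;

/* Hypothetical stand-in for dictFingerprint(); any function of the
 * covered state is enough for the purpose of this sketch. */
uint64_t mini_fingerprint(const mini_dict *d) {
    return d->size * 31u + d->used;
}

typedef struct {
    mini_dict *d;
    uint64_t fingerprint; /* recorded when iteration starts */
} mini_iter;

/* Remember the dictionary state at the moment iteration begins. */
mini_iter mini_iter_start(mini_dict *d) {
    mini_iter it = { d, mini_fingerprint(d) };
    return it;
}

/* Returns 1 if the dictionary still has the recorded fingerprint. */
int mini_iter_unchanged(const mini_iter *it) {
    return it->fingerprint == mini_fingerprint(it->d);
}

/* Variant that aborts on a mismatch, e.g. for debug builds. */
void mini_iter_release_checked(const mini_iter *it) {
    assert(mini_iter_unchanged(it));
}
```

The check is only as good as the fingerprint: if two different states hash to the same value, a modification during iteration goes unnoticed, which is why a stronger fingerprint matters here.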