Module hashers::pigeon[][src]

Hash functions by Steven Pigeon (https://hbfs.wordpress.com/)

From:

  • https://hbfs.wordpress.com/2015/09/29/hash-functions-part-i/
  • https://hbfs.wordpress.com/2015/10/06/the-anatomy-of-hash-functions-hash-functions-part-ii/
  • https://hbfs.wordpress.com/2015/10/13/testing-hash-functions-hash-functions-part-iii/
  • https://hbfs.wordpress.com/2015/10/20/three-bad-functions-hash-functions-part-iv/
  • https://hbfs.wordpress.com/2015/10/27/three-somewhat-better-functions-hash-functions-part-v/
  • https://hbfs.wordpress.com/2015/11/17/and-a-good-one-hash-functions-part-vi/

as well as

  • https://hbfs.wordpress.com/2011/11/08/mild-obfuscation/

In the previous entries, we learned that a good hash function for look-ups should disperse bits as much as possible as well as being unpredictable, that is, behave more or less like a pseudo-random number generator. We had a few failed attempts, a few promising ones, and now, a good one.

One of the weak operations in the previous hash functions is the combination operation, which is the addition. We remarked that it isn’t very good because it is unlikely to provoke a global change in the hash value. Indeed, if you add an 8-bit quantity, then you’re reasonably certain that the value changes for the first 8 bits, but after that, changes are operated only through the carry ripple, which has only probability \frac{1}{2}^i of being propagated to the ith bit. That is, it is very unlikely to ripple to the end of the word.

So we need an operation (as simple as possible) to make sure that the new bits are spread across, and affect, the hash value. Therefore, we must scatter input bits. But how? Well, we could design some bit-wise function that takes 8 bits and spread them, but that function would be fixed, and independent of the input bits (if we consider a permutation-type function). So we need a splatter that depends on the input, but covers more or less all bits. Well, we can do that by (integer) multiplying the next input byte by a large random-looking number. A random-looking prime number, in fact. Why prime? It will not introduce new common factors in the subsequent additions other than those in the input.

Let’s pick one:

This number is 58 bits long. If you multiply an 8-bit value by a 56-bits value, you’d get, most of the times, a 64-bits value. This time, it is a bit bigger to compensate the fact the the 8-bit input doesn’t necessarily use all of its 8 bits. Plus it’s prime. Why? How?

random-typing

(Yes. For real. That how I typed it. Not sorry.) Then let’s mix the product. Let’s use the perfect_shuffle we’ve already used. Then combine this value with a simple addition. The combination step being strong enough now, we could use a simple confusion step. Let’s use cut_deck, a function that swaps the high- and low part of the word, without exchanging bits in each parts, for a bit more confusion.

Unfortunately, although this looks like a good hash function, it is very slow, likely because it processes the input one byte at a time. If it were modified to correctly handle a larger block, it might actually be competitive.

Structs

Bricolage

Functions

bricolage

Provide access to Bricolage in a single call.