SipHash 2-4 -> SipHash 1-2.

For performance reasons we use a reduced rounds variant of
SipHash. This should still provide enough protection and the
effects in the hash table distribution are non existing.
If some real world attack on SipHash 1-2 will be found we can
trivially switch to something more secure. Anyway it is a
big step forward from Murmurhash, for which it is trivial to
generate *seed independent* colliding keys... The speed
penatly introduced by SipHash 2-4, around 4%, was a too big
price to pay compared to the effectiveness of the HashDoS
attack against SipHash 1-2, and considering so far in the
Redis history, no such an incident ever happened even while
using trivially to collide hash functions.
This commit is contained in:
antirez 2017-02-21 17:07:28 +01:00
parent 84fa8230e5
commit 0285c2714b

View File

@ -19,18 +19,23 @@
This version was modified by Salvatore Sanfilippo <antirez@gmail.com> This version was modified by Salvatore Sanfilippo <antirez@gmail.com>
in the following ways: in the following ways:
1. Hard-code 2-4 rounds in the hope the compiler can optimize it more 1. We use SipHash 1-2. This is not believed to be as strong as the
suggested 2-4 variant, but AFAIK there are not trivial attacks
against this reduced-rounds version, and it runs at the same speed
as Murmurhash2 that we used previously, why the 2-4 variant slowed
down Redis by a 4% figure more or less.
2. Hard-code rounds in the hope the compiler can optimize it more
in this raw from. Anyway we always want the standard 2-4 variant. in this raw from. Anyway we always want the standard 2-4 variant.
2. Modify the prototype and implementation so that the function directly 3. Modify the prototype and implementation so that the function directly
returns an uint64_t value, the hash itself, instead of receiving an returns an uint64_t value, the hash itself, instead of receiving an
output buffer. This also means that the output size is set to 8 bytes output buffer. This also means that the output size is set to 8 bytes
and the 16 bytes output code handling was removed. and the 16 bytes output code handling was removed.
3. Provide a case insensitive variant to be used when hashing strings that 4. Provide a case insensitive variant to be used when hashing strings that
must be considered identical by the hash table regardless of the case. must be considered identical by the hash table regardless of the case.
If we don't have directly a case insensitive hash function, we need to If we don't have directly a case insensitive hash function, we need to
perform a text transformation in some temporary buffer, which is costly. perform a text transformation in some temporary buffer, which is costly.
4. Remove debugging code. 5. Remove debugging code.
5. Modified the original test.c file to be a stand-alone function testing 6. Modified the original test.c file to be a stand-alone function testing
the function in the new form (returing an uint64_t) using just the the function in the new form (returing an uint64_t) using just the
relevant test vector. relevant test vector.
*/ */
@ -131,7 +136,6 @@ uint64_t siphash(const uint8_t *in, const size_t inlen, const uint8_t *k) {
m = U8TO64_LE(in); m = U8TO64_LE(in);
v3 ^= m; v3 ^= m;
SIPROUND;
SIPROUND; SIPROUND;
v0 ^= m; v0 ^= m;
@ -150,14 +154,11 @@ uint64_t siphash(const uint8_t *in, const size_t inlen, const uint8_t *k) {
v3 ^= b; v3 ^= b;
SIPROUND;
SIPROUND; SIPROUND;
v0 ^= b; v0 ^= b;
v2 ^= 0xff; v2 ^= 0xff;
SIPROUND;
SIPROUND;
SIPROUND; SIPROUND;
SIPROUND; SIPROUND;
@ -195,7 +196,6 @@ uint64_t siphash_nocase(const uint8_t *in, const size_t inlen, const uint8_t *k)
m = U8TO64_LE_NOCASE(in); m = U8TO64_LE_NOCASE(in);
v3 ^= m; v3 ^= m;
SIPROUND;
SIPROUND; SIPROUND;
v0 ^= m; v0 ^= m;
@ -214,14 +214,11 @@ uint64_t siphash_nocase(const uint8_t *in, const size_t inlen, const uint8_t *k)
v3 ^= b; v3 ^= b;
SIPROUND;
SIPROUND; SIPROUND;
v0 ^= b; v0 ^= b;
v2 ^= 0xff; v2 ^= 0xff;
SIPROUND;
SIPROUND;
SIPROUND; SIPROUND;
SIPROUND; SIPROUND;
@ -308,7 +305,11 @@ const uint8_t vectors_sip64[64][8] = {
/* Test siphash using a test vector. Returns 0 if the function passed /* Test siphash using a test vector. Returns 0 if the function passed
* all the tests, otherwise 1 is returned. */ * all the tests, otherwise 1 is returned.
*
* IMPORTANT: The test vector is for SipHash 2-4. Before running
* the test revert back the siphash() function to 2-4 rounds since
* now it uses 1-2 rounds. */
int siphash_test(void) { int siphash_test(void) {
uint8_t in[64], k[16]; uint8_t in[64], k[16];
int i; int i;