Improve hashtable performance. by dbaarda · Pull Request #192 · librsync/librsync

added 6 commits

April 28, 2020 00:42
Add nozero() static inline for avoiding zero hash keys, and make _KEY_HASH()
use it. Simplify _for_probe() to take the hash and not do the nozero() now
it's done by _KEY_HASH().
Also rename the "bloom" attribute to "kbloom" to be more consistent with the
other attribute names.
Analysis suggests at 80% load factor, 5 quadratic probes going a distance of
15 from the initally probed location are likely to spill over into the next 64
byte cache line. For a 70% load factor, <4 probes going a distance of <8 from
the initial location are likely to be within the same cache line,
significantly speeding the lookups.

This also reduces the loading of the bloomfilter to below optimal for k=1, but
modelling suggests using k=2 for the same size bloom filter would be adding
another 50% chance L2 cache lookup and would be slower. Sticking with k=1 is
not only simpler, but faster.