Improve hashtable performance. by dbaarda · Pull Request #192 · librsync/librsync
added 6 commits
April 28, 2020 00:42Add nozero() static inline for avoiding zero hash keys, and make _KEY_HASH() use it. Simplify _for_probe() to take the hash and not do the nozero() now it's done by _KEY_HASH().
Analysis suggests at 80% load factor, 5 quadratic probes going a distance of 15 from the initally probed location are likely to spill over into the next 64 byte cache line. For a 70% load factor, <4 probes going a distance of <8 from the initial location are likely to be within the same cache line, significantly speeding the lookups. This also reduces the loading of the bloomfilter to below optimal for k=1, but modelling suggests using k=2 for the same size bloom filter would be adding another 50% chance L2 cache lookup and would be slower. Sticking with k=1 is not only simpler, but faster.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters