// Overview / Examples / API / FAQ / Resources
Use cases
A static perfect hash function maps a set of keys known in advance to a set of values with no collisions
Features
- Single header (https://raw.githubusercontent.com/qlibs/mph/main/mph) / C++20 module (https://raw.githubusercontent.com/qlibs/mph/main/mph.cppm)
- Self verification upon include (can be disabled by
-DNTEST- see FAQ) - Compiles cleanly with (
-Wall -Wextra -Werror -pedantic -pedantic-errors -fno-exceptions -fno-rtti) - Minimal API
- Optimized run-time execution (see performance / benchmarks)
- Fast compilation times (see compilation)
- Trade-offs (see FAQ)
Requirements
- C++20 (gcc-12+, clang-15+) / [optional] (bmi2), [optional] (simd),
Overview
Hello world (https://godbolt.org/z/dzd6o3Pxo)
enum class color { red, green, blue }; constexpr auto colors = std::array{ std::pair{"red"sv, color::red}, std::pair{"green"sv, color::green}, std::pair{"blue"sv, color::blue}, }; static_assert(color::green == mph::lookup<colors>("green")); static_assert(color::red == mph::lookup<colors>("red")); static_assert(color::blue == mph::lookup<colors>("blue")); std::print("{}", mph::lookup<colors>("green"sv)); // prints 1
Note
mph::lookup assumes only valid input and returns mapped value direclty.
static_assert(not mph::find<colors>("unknown")); static_assert(mph::find<colors>("green")); static_assert(mph::find<colors>("red")); static_assert(mph::find<colors>("blue")); std::print("{}", *mph::find<colors>("green"sv)); // prints 1
Note
mph::find doesnt assume valid input and returns optional of mapped value.
[Minimal] Lookup (https://godbolt.org/z/rqYj9a1cr)
int lookup(int id) { static constexpr std::array ids{ std::pair{54u, 91u}, std::pair{64u, 324u}, std::pair{91u, 234u}, }; return mph::lookup<ids>(id); }
lookup: // g++ -DNDEBUG -std=c++20 -O3 imull $1275516394, %edi, %eax shrl $23, %eax movl $24029728, %ecx shrxl %eax, %ecx, %eax andl $511, %eax retq
Lookup (https://godbolt.org/z/vv6W4nGfb)
int lookup(int id) { static constexpr std::array ids{ std::pair{54u, 91u}, std::pair{324u, 54u}, std::pair{64u, 324u}, std::pair{234u, 64u}, std::pair{91u, 234u}, }; return mph::lookup<ids>(id); }
lookup: // g++ -DNDEBUG -std=c++20 -O3 andl $7, %edi leaq lookup(%rip), %rax movl (%rax,%rdi,4), %eax retq lookup: .long 324 .long 0 .long 64 .long 234 .long 54 .long 0 .long 91
int find(int id) { static constexpr std::array ids{ std::pair{27629, 1}, std::pair{6280, 2}, // 1..128 pairs... std::pair{33691, 128}, }; return *mph::find<ids>(id); }
find: // g++ -DNDEBUG -std=c++20 -O3 -mbmi2 -mavx512f vpbroadcastd %edi, %zmm0 shll $4, %edi movzbl %dil, %ecx leaq find vpcmpeqd (%rdx,%rcx,4), %zmm0, %k0 kmovw %k0, %esi kortestw %k0, %k0 rep bsfq %rax, %rax movl $64, %eax addl %eax, %ecx xorl %eax, %eax testw %si, %si cmovnel 1024(%rdx,%rcx,4), %eax vzeroupper retq find: ... // see godbolt
Find Strings (https://godbolt.org/z/KaKzf7Pax)
int find(std::span<const char, 8> str) { static constexpr auto symbols = std::array{ std::pair{"AMZN "sv, 1}, std::pair{"AAPL "sv, 2}, std::pair{"GOOGL "sv, 3}, std::pair{"META "sv, 4}, std::pair{"MSFT "sv, 5}, std::pair{"NVDA "sv, 6}, std::pair{"TSLA "sv, 7}, }; return *mph::find<symbols>(str); }
find: // g++ -DNDEBUG -std=c++20 -O3 -mbmi2 movq 8(%rsi), %rax movl $1031, %ecx leaq find(%rip), %rdx xorl %esi, %esi movq (%rax), %rax pextq %rcx, %rax, %rcx shll $4, %ecx cmpq (%rcx,%rdx), %rax movzbl 8(%rcx,%rdx), %eax cmovnel %esi, %eax retq find: ... // see godbolt
Find Strings (https://godbolt.org/z/fdMPsYWjE)
int find(std::string_view str) { using std::literals::operator""sv; // values assigned from 0..N-1 static constexpr std::array symbols{ "BTC "sv, "ETH "sv, "BNB "sv, "SOL "sv, "XRP "sv, "DOGE"sv, "TON "sv, "ADA "sv, "SHIB"sv, "AVAX"sv, "LINK"sv, "BCH "sv, }; return *mph::find<symbols>(str); }
find: // g++ -DNDEBUG -std=c++20 -O3 -mbmi2 shll $3, %edi bzhil %edi, (%rsi), %eax movl $789, %ecx pextl %ecx, %eax, %ecx leaq find(%rip), %rdx xorl %esi, %esi cmpl (%rdx,%rcx,8), %eax movzbl 4(%rdx,%rcx,8), %eax cmovnel %esi, %eax retq find: ... // see godbolt
Examples
- [feature]
lookup/findcustomization point - https://godbolt.org/z/enqeGxKK9 - [feature]
tocustomization point - https://godbolt.org/z/jTMx4n6j3 - [example]
branchless dispatcher- https://godbolt.org/z/5PTE3ercE - [performance - https://wg21.link/P2996]
enum_to_string- https://godbolt.org/z/ojohP6j7f - [performance - https://wg21.link/P2996]
string_to_enum- https://godbolt.org/z/83vGhY7M8
Benchmarks (run-time)
clang++ -std=c++20 -O3 -DNDEBUG -mbmi2 benchmark.cpp
| ns/op | op/s | err% |total | benchmark
|------:|---------------:|-----:|-----:|:----------
| 12.25 | 81,602,449.70 | 0.3% | 0.15 | `random_strings_5_len_4.std.map`
| 5.56 | 179,750,906.50 | 0.2% | 0.07 | `random_strings_5_len_4.std.unordered_map`
| 9.17 | 109,096,850.98 | 0.2% | 0.11 | `random_strings_5_len_4.boost.unordered_map`
| 13.48 | 74,210,250.54 | 0.3% | 0.16 | `random_strings_5_len_4.boost.flat_map`
| 7.70 | 129,942,965.18 | 0.3% | 0.09 | `random_strings_5_len_4.gperf`
| 1.61 | 621,532,188.81 | 0.1% | 0.02 | `random_strings_5_len_4.mph`
| 14.66 | 68,218,086.71 | 0.8% | 0.18 | `random_strings_5_len_8.std.map`
| 13.45 | 74,365,239.56 | 0.2% | 0.16 | `random_strings_5_len_8.std.unordered_map`
| 9.68 | 103,355,605.09 | 0.2% | 0.12 | `random_strings_5_len_8.boost.unordered_map`
| 16.00 | 62,517,180.19 | 0.4% | 0.19 | `random_strings_5_len_8.boost.flat_map`
| 7.70 | 129,809,356.36 | 0.3% | 0.09 | `random_strings_5_len_8.gperf`
| 1.58 | 633,084,194.24 | 0.1% | 0.02 | `random_strings_5_len_8.mph`
| 17.21 | 58,109,576.87 | 0.3% | 0.21 | `random_strings_6_len_2_5.std.map`
| 15.28 | 65,461,167.99 | 0.2% | 0.18 | `random_strings_6_len_2_5.std.unordered_map`
| 12.21 | 81,931,391.20 | 0.4% | 0.15 | `random_strings_6_len_2_5.boost.unordered_map`
| 17.15 | 58,323,741.08 | 0.5% | 0.21 | `random_strings_6_len_2_5.boost.flat_map`
| 7.94 | 125,883,197.55 | 0.5% | 0.09 | `random_strings_6_len_2_5.gperf`
| 6.05 | 165,239,616.00 | 0.5% | 0.07 | `random_strings_6_len_2_5.mph`
| 31.61 | 31,631,402.94 | 0.2% | 0.38 | `random_strings_100_len_8.std.map`
| 15.32 | 65,280,594.09 | 0.2% | 0.18 | `random_strings_100_len_8.std.unordered_map`
| 17.13 | 58,383,850.20 | 0.3% | 0.20 | `random_strings_100_len_8.boost.unordered_map`
| 31.42 | 31,822,519.67 | 0.2% | 0.38 | `random_strings_100_len_8.boost.flat_map`
| 8.04 | 124,397,773.85 | 0.2% | 0.10 | `random_strings_100_len_8.gperf`
| 1.58 | 632,813,481.73 | 0.1% | 0.02 | `random_strings_100_len_8.mph`
| 32.62 | 30,656,015.03 | 0.3% | 0.39 | `random_strings_100_len_1_8.std.map`
| 19.34 | 51,697,107.73 | 0.5% | 0.23 | `random_strings_100_len_1_8.std.unordered_map`
| 19.51 | 51,254,525.17 | 0.3% | 0.23 | `random_strings_100_len_1_8.boost.unordered_map`
| 33.58 | 29,780,574.17 | 0.6% | 0.40 | `random_strings_100_len_1_8.boost.flat_map`
| 13.06 | 76,577,037.07 | 0.7% | 0.16 | `random_strings_100_len_1_8.gperf`
| 6.02 | 166,100,665.07 | 0.2% | 0.07 | `random_strings_100_len_1_8.mph`
| 1.28 | 778,723,795.75 | 0.1% | 0.02 | `random_uints_5.mph`
g++ -std=c++20 -O3 -DNDEBUG -mbmi2 benchmark.cpp
| ns/op | op/s | err% |total | benchmark
|------:|---------------:|-----:|-----:|:----------
| 12.28 | 81,460,330.38 | 0.9% | 0.15 | `random_strings_5_len_4.std.map`
| 5.29 | 188,967,241.90 | 0.3% | 0.06 | `random_strings_5_len_4.std.unordered_map`
| 9.69 | 103,163,192.67 | 0.2% | 0.12 | `random_strings_5_len_4.boost.unordered_map`
| 13.56 | 73,756,333.08 | 0.4% | 0.16 | `random_strings_5_len_4.boost.flat_map`
| 7.69 | 130,055,662.66 | 0.6% | 0.09 | `random_strings_5_len_4.gperf`
| 1.39 | 718,910,252.82 | 0.1% | 0.02 | `random_strings_5_len_4.mph`
| 14.26 | 70,103,007.82 | 2.4% | 0.17 | `random_strings_5_len_8.std.map`
| 13.36 | 74,871,047.51 | 0.4% | 0.16 | `random_strings_5_len_8.std.unordered_map`
| 9.82 | 101,802,074.00 | 0.3% | 0.12 | `random_strings_5_len_8.boost.unordered_map`
| 15.97 | 62,621,571.95 | 0.3% | 0.19 | `random_strings_5_len_8.boost.flat_map`
| 7.92 | 126,265,206.30 | 0.3% | 0.09 | `random_strings_5_len_8.gperf`
| 1.40 | 713,596,376.62 | 0.4% | 0.02 | `random_strings_5_len_8.mph`
| 15.98 | 62,576,142.34 | 0.5% | 0.19 | `random_strings_6_len_2_5.std.map`
| 17.56 | 56,957,868.12 | 0.5% | 0.21 | `random_strings_6_len_2_5.std.unordered_map`
| 11.68 | 85,637,378.45 | 0.3% | 0.14 | `random_strings_6_len_2_5.boost.unordered_map`
| 17.25 | 57,965,732.68 | 0.6% | 0.21 | `random_strings_6_len_2_5.boost.flat_map`
| 9.13 | 109,580,632.48 | 0.7% | 0.11 | `random_strings_6_len_2_5.gperf`
| 7.17 | 139,563,745.72 | 0.4% | 0.09 | `random_strings_6_len_2_5.mph`
| 30.20 | 33,117,522.76 | 0.7% | 0.36 | `random_strings_100_len_8.std.map`
| 15.01 | 66,627,962.89 | 0.4% | 0.18 | `random_strings_100_len_8.std.unordered_map`
| 16.79 | 59,559,414.60 | 0.6% | 0.20 | `random_strings_100_len_8.boost.unordered_map`
| 31.36 | 31,884,629.57 | 0.8% | 0.38 | `random_strings_100_len_8.boost.flat_map`
| 7.75 | 128,973,947.61 | 0.7% | 0.09 | `random_strings_100_len_8.gperf`
| 1.50 | 667,041,673.54 | 0.1% | 0.02 | `random_strings_100_len_8.mph`
| 30.92 | 32,340,612.08 | 0.4% | 0.37 | `random_strings_100_len_1_8.std.map`
| 25.35 | 39,450,222.09 | 0.4% | 0.30 | `random_strings_100_len_1_8.std.unordered_map`
| 19.76 | 50,609,820.90 | 0.2% | 0.24 | `random_strings_100_len_1_8.boost.unordered_map`
| 32.39 | 30,878,018.77 | 0.6% | 0.39 | `random_strings_100_len_1_8.boost.flat_map`
| 11.20 | 89,270,687.92 | 0.2% | 0.13 | `random_strings_100_len_1_8.gperf`
| 7.17 | 139,471,159.67 | 0.5% | 0.09 | `random_strings_100_len_1_8.mph`
| 1.93 | 519,047,110.39 | 0.3% | 0.02 | `random_uints_5.mph`
Benchmarks (compilation-time)
API
namespace mph::inline v5_0_5 { /** * Static [minimal] perfect hash lookup function * @tparam entries constexpr array of keys or key/value pairs */ template<const auto& entries> inline constexpr auto lookup = [](const auto& key) { if constexpr(constexpr lookup$magic_lut<entries> lookup{}; lookup) { return lookup(key); } else { return lookup$pext<entries>(key); } }; /** * Static perfect hash find function * @tparam entries constexpr array of keys or key/value pairs */ template<const auto& entries> inline constexpr auto find = []<u8 probability = 50u>(const auto& key, const auto& unknown = {}) -> optional { if constexpr (entries.size() == 0u) { return unknown; } else if constexpr (entries.size() <= 64u) { return find$pext<entries>.operator()<probability>(key, unknown); } else { constexpr auto bucket_size = simd_size_v<key_type, simd_abi::native<key_type>>; return find$simd<entries, bucket_size>.operator()<probability>(key, unknown); } }; } // namespace mph
FAQ
Trade-offs?
mphsupports different types of key/value pairs and thousands of key/value pairs, but not millions - (see benchmarks).
All keys have to fit into
uint128_t, that includes strings.If the above criteria are not satisfied
mphwill SFINAE awaylookupfunction.In such case different backup policy should be used instead (which can be also used as customization point for user-defined
lookupimplementation), for example:template<const auto& entries> requires (entries.size() > 1'000'000) inline constexpr auto mph::find = [](const auto& key, const auto& unknown = {}) -> optional { ... }How
mphis working under the hood?
mphtakes advantage of knowing the key/value pairs at compile-time as well as the specific hardware instructions. The following is a pseudo code of thelookupalgorithm for minimal perfect hash table.def lookup$magic_lut[entries: array](key : any, max_attempts = 100'000): # 0. magic and lut for entries [compile-time] nbits = sizeof(u32) * CHAR_BIT - countl_zero(max(entries.second)) mask = (1u << nbits) - 1u; shift = sizeof(u32) * CHAR_BIT - nbits; lut = {}; while max_attempts--: magic = rand() for k, v in entries: lut |= v << (k * magic >> shift); for k, v in entries: if (lut >> (k * magic >> shift) & mask) != v: lut = {} break assert magic != 0 and lut != 0 and shift != 0 and mask != 0 # 1. lookup [run-time] return (lut >> ((key * magic) >> shift)) & mask;The following is a pseudo code of the
findalgorithm for perfect hash table.# word: 00101011 # mask: 11100001 # &: 000____1 # pext: ____0001 # intel/intrinsics-guide/index.html#text=pext def pext(a : uN, mask : uN): dst, m, k = ([], 0, 0) while m < nbits(a): if mask[m] == 1: dst.append(a[m]) k += 1 m += 1 return uN(dst)def find$pext[entries: array](key : any, unknown: any): # 0. find mask which uniquely identifies all keys [compile-time] mask = 0b111111... for i in range(nbits(mask)): masked = [] mask.unset(i) for k, v in entries: masked.append(k & mask) if not unique(masked): mask.set(i) assert unique(masked) assert mask != ~mask{} # 1. create lookup table [compile-time] lookup = array(typeof(entries[0]), 2**popcount(mask)) for k, v in entries: lookup[pext(k, mask)] = (k, v) # 2. lookup [run-time] # if key is a string convert to integral first (memcpy) k, v = lookup[pext(key, mask)] if k == key: # cmove return v else: return unknowndef find$simd[entries: array](key : any, unknown: any): # 0. find mask which uniquely identifies all keys [compile-time] mask = 0b111111... bucket_size = simd_size_v<entries[0].first, native> for i in range(nbits(mask)): masked = [] mask.unset(i) for k, v in entries: masked.append(k & mask) if not unique(masked, bucket_size): mask.set(i) assert unique(masked, bucket_size) assert mask != ~mask{} # 1. create lookup table [compile-time] keys = array(typeof(entries[0].first), bucket_size * 2**popcount(mask)) values = array(typeof(entries[0].second), bucket_size * 2**popcount(mask)) for k, v in entries: slot = pext(k, mask) while (keys[slot]) slot++; keys[slot] = k values[slot] = v # 2. lookup [run-time] # if key is a string convert to integral first (memcpy) index = bucket_size * pext(key, mask) match = k == keys[&index] # simd element-wise comparison if any_of(match): return values[index + find_first_set(match)] else: return unknownHow to tweak
lookup/findperformance for my data/use case?Always measure!
- [bmi2 (Intel Haswell+, AMD Zen3+)] hardware instruction acceleration is faster than software emulation. (AMD Zen2 pext takes 18 cycles, is worth disabling hardware accelerated version)
- For integral keys, use u32 or u64.
- For strings, consider aligning the input data and passing it with compile-time size via
span,array.- If all strings length is less than 4 that will be more optimized than if all string length will be less than 8 and 16. That will make the lookup table smaller and getting the value will have one instruction less.
- Experiment with different
probabilityvalues to optimize lookups. Especially benefitial if its known that input keys are always coming from predefinedentries(probability = 100) as it will avoid the comparison.- Consider passing cache size alignment (
hardware_destructive_interference_size- usually64u) to thelookup/find. That will align the underlying lookup table.How to fix compilation error
constexpr evaluation hit maximum step limit?The following options can be used to increase the limits, however, compilation-times should be monitored.
gcc: -fconstexpr-ops-limit=N clang: -fconstexpr-steps=NIs support for bmi2 instructions required?
mphworks on platforms withoutbmi2instructions which can be emulated with some limitations (*).// bmi2 mov ecx, 789 pext ecx, eax, ecx// no bmi2 mov ecx, eax and ecx, 789 imul ecx, ecx, 57 shr ecx, 2 and ecx, 248https://stackoverflow.com/questions/14547087/extracting-bits-with-a-single-multiplication (*)
How to disable
cmovgeneration?Set
probabilityvalue to something else than50u(default) - it means that the input data is predictable in some way andjmpwill be generated instead. Additionaly the following compiler options can be used.clang: -mllvm -x86-cmov-converter=falseHow to disable running tests at compile-time?
When
-DNTESTis defined static_asserts tests wont be executed upon inclusion. Note: Use with caution as disabling tests means that there are no gurantees upon inclusion that given compiler/env combination works as expected.Similar projects?
gperf, frozen, nbperf, cmph, perfecthash, lemonhash, pthash, shockhash, burr, hash-prospector
Resources
- Minimal Perfect Hashing - http://stevehanov.ca/blog/index.php?id=119
- Pefect Hashing - https://github.com/tpn/pdfs/tree/master/Perfect%20Hashing
- Hash Functions - https://nullprogram.com/blog/2018/07/31
gperf- https://www.dre.vanderbilt.edu/~schmidt/PDF/C++-USENIX-90.pdfcmph- https://cmph.sourceforge.net/paperspthash- https://arxiv.org/pdf/2104.10402smasher- https://github.com/rurban/smhasher

