[BEAM-13015] Update the SDK harness grouping table to be memory bounded based upon the amount of assigned cache memory and to use an LRU eviction policy. by lukecwik · Pull Request #17327 · apache/beam
I happened to do some benchmarking for a separate change (#17641) and noticed that this PR seems to reduce performance significantly. Before this change (https://github.com/robertwb/incubator-beam/tree/java-combine-key-old) I was getting these stats:
33,102 ±(99.9%) 1,173 ops/s [Average]
(min, avg, max) = (32,761, 33,102, 33,492), stdev = 0,305
CI (99.9%): [31,929, 34,275] (assumes normal distribution)
24,809 ±(99.9%) 0,861 ops/s [Average]
(min, avg, max) = (24,521, 24,809, 25,083), stdev = 0,224
CI (99.9%): [23,948, 25,670] (assumes normal distribution)
(There are two benchmarks here: globally windowed and not.) After merging this change, I'm seeing:
Result "org.apache.beam.fn.harness.jmh.CombinerTableBenchmark.uniformDistribution":
4,949 ±(99.9%) 0,349 ops/s [Average]
(min, avg, max) = (4,832, 4,949, 5,059), stdev = 0,091
CI (99.9%): [4,601, 5,298] (assumes normal distribution)
Result "org.apache.beam.fn.harness.jmh.CombinerTableBenchmark.uniformDistribution":
3,855 ±(99.9%) 0,304 ops/s [Average]
(min, avg, max) = (3,735, 3,855, 3,930), stdev = 0,079
CI (99.9%): [3,551, 4,159] (assumes normal distribution)
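To put a number on the regression, here is a minimal sketch that computes the slowdown from the averages quoted above. It assumes the commas in the JMH output are locale decimal separators (so "33,102" means 33.102 ops/s, consistent with stdev = 0,305), and that the before and after results are listed in the same order. The labels `first` and `second` are placeholders, since the output does not say which run is the globally windowed one.

```python
# Average throughput (ops/s) copied from the JMH output above.
# Assumption: the comma is a decimal separator, so "33,102" is 33.102 ops/s.
# Assumption: "first"/"second" follow the order the results were pasted in.
before = {"first": 33.102, "second": 24.809}
after = {"first": 4.949, "second": 3.855}


def slowdown(before_ops: float, after_ops: float) -> float:
    """Return how many times slower the 'after' run is than the 'before' run."""
    return before_ops / after_ops


ratios = {name: slowdown(before[name], after[name]) for name in before}

for name, ratio in ratios.items():
    print(f"{name}: {before[name]} -> {after[name]} ops/s ({ratio:.2f}x slower)")
```

Under those assumptions both benchmarks come out roughly 6.5x slower, and the 99.9% confidence intervals before and after do not overlap.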