Rework and publish metric benchmarks by jack-berg · Pull Request #8000 · open-telemetry/opentelemetry-java

As mentioned in #7986, I've been working through some ideas to improve the performance of the metric SDK under high contention.

To illustrate the impact of these changes, I've reworked MetricsBenchmark to include dimensions that impact record performance. The set of dimensions that play some role includes:

  • Instrument type / aggregation (5): counter + sum, up down counter + sum, gauge + last value, histogram + explicit histogram, histogram + base2 expo histogram
  • instrument value type (2): double, long
  • memory mode (2): immutable, reusable
  • temporality (2): cumulative, delta
  • exemplars recorded (2): true, false
  • threads (2): 1, 4
  • cardinality (2): 1, 100

That forms 2 * 2 * 2 * 2 * 2 * 2 * 5 = 320 unique test cases, which is impractical. So I narrowed it down to the most meaningful dimensions:

  • eliminated instrument value type: while long vs. double matters somewhat, it's not much
  • eliminated memory mode: immutable vs. reusable mostly matters for the collect path
  • eliminated exemplars: they can impact performance, but are less important than other factors

With these eliminated, we're down to 2 * 2 * 2 * 5 = 40 test cases, which is more reasonable.
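The test-case counts above are just the product of the number of values in each dimension. A minimal sketch of that arithmetic follows; the class name, method names, and map keys are hypothetical and chosen for illustration only, they are not part of the benchmark code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the repo: counts benchmark parameter combinations.
public class BenchmarkMatrix {

  /** Multiplies the number of values of every dimension to get the number of unique test cases. */
  static int countCases(Map<String, Integer> dimensions) {
    int total = 1;
    for (int size : dimensions.values()) {
      total *= size;
    }
    return total;
  }

  /** All seven dimensions listed in the description, with their number of values. */
  static Map<String, Integer> allDimensions() {
    Map<String, Integer> d = new LinkedHashMap<>();
    d.put("instrumentTypeAndAggregation", 5);
    d.put("instrumentValueType", 2);
    d.put("memoryMode", 2);
    d.put("aggregationTemporality", 2);
    d.put("exemplars", 2);
    d.put("threads", 2);
    d.put("cardinality", 2);
    return d;
  }

  /** The dimensions that remain after dropping value type, memory mode, and exemplars. */
  static Map<String, Integer> reducedDimensions() {
    Map<String, Integer> d = allDimensions();
    d.remove("instrumentValueType");
    d.remove("memoryMode");
    d.remove("exemplars");
    return d;
  }

  public static void main(String[] args) {
    System.out.println("full matrix: " + countCases(allDimensions()));       // full matrix: 320
    System.out.println("reduced matrix: " + countCases(reducedDimensions())); // reduced matrix: 40
  }
}
```

Each eliminated two-valued dimension halves the matrix, so dropping three of them takes 320 cases down to 40.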

I'm also using this as an opportunity to finish what @tylerbenson started and get into the routine of running benchmarks on each change on dedicated hardware, and publishing the results at https://open-telemetry.github.io/opentelemetry-java/benchmarks/

The unresolved problem was that the benchmarks in this repo are micro benchmarks. They're not very meaningful for end users and may even do more harm than good. What we need is a curated set of somewhat high level benchmarks, intentionally built to demonstrate / report on the types of performance characteristics that matter to end users.

This revamped MetricRecordBenchmark is the first of these. I will follow up with dedicated benchmarks for other areas:

  • Log SDK record and export
  • Trace SDK record and export
  • Metric SDK export
  • Noop implementation

For reference, here are the results of the revamped MetricRecordBenchmark on my machine:

Benchmark                       (aggregationTemporality)  (cardinality)  (instrumentTypeAndAggregation)   Mode  Cnt      Score     Error  Units
MetricRecordBenchmark.threads1                     DELTA              1                     COUNTER_SUM  thrpt    5  13414.208 ± 243.504  ops/s
MetricRecordBenchmark.threads1                     DELTA              1             UP_DOWN_COUNTER_SUM  thrpt    5  12276.148 ± 105.900  ops/s
MetricRecordBenchmark.threads1                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5  10896.580 ± 705.898  ops/s
MetricRecordBenchmark.threads1                     DELTA              1              HISTOGRAM_EXPLICIT  thrpt    5   6642.787 ± 674.574  ops/s
MetricRecordBenchmark.threads1                     DELTA              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   3651.887 ± 304.134  ops/s
MetricRecordBenchmark.threads1                     DELTA            100                     COUNTER_SUM  thrpt    5   8359.025 ± 777.598  ops/s
MetricRecordBenchmark.threads1                     DELTA            100             UP_DOWN_COUNTER_SUM  thrpt    5   9247.253 ± 423.551  ops/s
MetricRecordBenchmark.threads1                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5   9165.700 ± 143.755  ops/s
MetricRecordBenchmark.threads1                     DELTA            100              HISTOGRAM_EXPLICIT  thrpt    5   7300.896 ± 684.395  ops/s
MetricRecordBenchmark.threads1                     DELTA            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   3858.246 ±  34.989  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1                     COUNTER_SUM  thrpt    5  12433.135 ± 148.315  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1             UP_DOWN_COUNTER_SUM  thrpt    5  13341.423 ± 242.611  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5  10628.592 ± 101.145  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1              HISTOGRAM_EXPLICIT  thrpt    5   6895.783 ± 740.681  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   4087.396 ± 435.895  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100                     COUNTER_SUM  thrpt    5  10402.076 ± 240.933  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100             UP_DOWN_COUNTER_SUM  thrpt    5   9199.368 ± 107.627  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   9056.580 ± 297.773  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100              HISTOGRAM_EXPLICIT  thrpt    5   7475.743 ± 979.090  ops/s
MetricRecordBenchmark.threads1                CUMULATIVE            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   3836.227 ± 131.765  ops/s
MetricRecordBenchmark.threads4                     DELTA              1                     COUNTER_SUM  thrpt    5   1577.822 ± 219.796  ops/s
MetricRecordBenchmark.threads4                     DELTA              1             UP_DOWN_COUNTER_SUM  thrpt    5   1615.582 ± 335.284  ops/s
MetricRecordBenchmark.threads4                     DELTA              1                GAUGE_LAST_VALUE  thrpt    5   1208.008 ± 165.999  ops/s
MetricRecordBenchmark.threads4                     DELTA              1              HISTOGRAM_EXPLICIT  thrpt    5    904.243 ±  22.615  ops/s
MetricRecordBenchmark.threads4                     DELTA              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    869.229 ±  31.214  ops/s
MetricRecordBenchmark.threads4                     DELTA            100                     COUNTER_SUM  thrpt    5   1725.486 ± 240.360  ops/s
MetricRecordBenchmark.threads4                     DELTA            100             UP_DOWN_COUNTER_SUM  thrpt    5   1422.319 ± 594.337  ops/s
MetricRecordBenchmark.threads4                     DELTA            100                GAUGE_LAST_VALUE  thrpt    5   1560.890 ± 654.561  ops/s
MetricRecordBenchmark.threads4                     DELTA            100              HISTOGRAM_EXPLICIT  thrpt    5   1587.582 ± 458.715  ops/s
MetricRecordBenchmark.threads4                     DELTA            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   1688.229 ± 181.653  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1                     COUNTER_SUM  thrpt    5   1540.747 ± 137.303  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1             UP_DOWN_COUNTER_SUM  thrpt    5   1429.698 ± 220.415  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1                GAUGE_LAST_VALUE  thrpt    5   1215.367 ± 546.045  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1              HISTOGRAM_EXPLICIT  thrpt    5   1237.215 ±  18.528  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE              1     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5    837.980 ±  23.871  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100                     COUNTER_SUM  thrpt    5   1602.628 ± 813.536  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100             UP_DOWN_COUNTER_SUM  thrpt    5   1717.663 ± 577.817  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100                GAUGE_LAST_VALUE  thrpt    5   1565.824 ± 298.550  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100              HISTOGRAM_EXPLICIT  thrpt    5   1352.174 ± 594.439  ops/s
MetricRecordBenchmark.threads4                CUMULATIVE            100     HISTOGRAM_BASE2_EXPONENTIAL  thrpt    5   1465.394 ± 313.072  ops/s
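
One rough way to read these numbers is to compare the threads1 and threads4 scores for the same parameter combination. The sketch below does that for the DELTA / cardinality 1 / COUNTER_SUM rows. It assumes the Score column is directly comparable between the two benchmark methods, which depends on how each benchmark operation is defined; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of the repo: compares two scores from the table above.
public class ContentionRatio {

  /** Ratio of the single-threaded score to the four-threaded score for one parameter combination. */
  static double ratio(double threads1Score, double threads4Score) {
    return threads1Score / threads4Score;
  }

  public static void main(String[] args) {
    // Scores copied from the DELTA / cardinality=1 / COUNTER_SUM rows of the results table.
    double threads1 = 13414.208;
    double threads4 = 1577.822;
    System.out.printf("threads1 / threads4 = %.1f%n", ratio(threads1, threads4));
  }
}
```

The error margins on several threads4 rows are large relative to their scores, so ratios computed this way are best treated as indicative rather than precise.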