JMH Yields Inconsistent Results Between Forks

I am trying to benchmark the following Java JSON libraries to measure their performance on each data type individually:

  • jackson 2.13.1
  • gson 2.9.0
  • dsl-json 1.9.9

I used this issue as a reference for preparing the benchmarking environment. I have:

  • turned off Turbo Boost
  • turned off hyperthreading
  • set a specific frequency

Environment specification:

Model: Dell XPS 13
Processor Name: i7-8565U
Number of Physical Cores: 4 (HT disabled)
Memory: 16 GB
Frequency: 1.7 GHz (dynamic scaling disabled)

JMH configuration:

# JMH version: 1.34
# VM version: JDK 11.0.14.1, OpenJDK 64-Bit Server VM, 11.0.14.1+1
# VM invoker: /usr/lib/jvm/jdk-11.0.14.1+1/bin/java
# VM options: -Xms2g -Xmx5g
# Blackhole mode: full + dont-inline hint (auto-detected, use -Djmh.blackhole.autoDetect=false to disable)
# Warmup: 5 iterations, 10 s each
# Measurement: 10 iterations, 3 s each
# Timeout: 10 min per iteration
# Threads: 4 threads, will synchronize iterations
# Benchmark mode: Throughput, ops/time

I benchmark each library with randomly generated payloads of different sizes (in this case 1, 100 and 10000 kilobytes).

For some libraries, throughput differs substantially from one fork to the next, even though it is stable within each fork. Some examples:

dsljson: deserialization of floats, 100kb:

# Fork: 1 of 2
# Warmup Iteration   1: 4029.774 ops/s
# Warmup Iteration   2: 3166.863 ops/s
# Warmup Iteration   3: 3159.217 ops/s
# Warmup Iteration   4: 3141.895 ops/s
# Warmup Iteration   5: 3133.101 ops/s
Iteration   1: 3167.981 ops/s
Iteration   2: 3125.062 ops/s
Iteration   3: 3117.676 ops/s
Iteration   4: 3111.584 ops/s
Iteration   5: 3166.126 ops/s
Iteration   6: 3115.660 ops/s
Iteration   7: 3174.774 ops/s
Iteration   8: 3145.705 ops/s
Iteration   9: 3114.003 ops/s
Iteration  10: 3110.038 ops/s

# Run progress: 30.00% complete, ETA 00:09:28
# Fork: 2 of 2
# Warmup Iteration   1: 4104.713 ops/s
# Warmup Iteration   2: 4835.940 ops/s
# Warmup Iteration   3: 4837.150 ops/s
# Warmup Iteration   4: 4843.004 ops/s
# Warmup Iteration   5: 4840.332 ops/s
Iteration   1: 4837.021 ops/s
Iteration   2: 4842.746 ops/s
Iteration   3: 4834.603 ops/s
Iteration   4: 4836.132 ops/s
Iteration   5: 4831.823 ops/s
Iteration   6: 4831.908 ops/s
Iteration   7: 4832.504 ops/s
Iteration   8: 4843.621 ops/s
Iteration   9: 4834.916 ops/s
Iteration  10: 4839.871 ops/s
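
To put a number on the gap between the two forks above, here is a minimal Java sketch that averages the ten measurement iterations of each fork and compares the results. The class name ForkGap is hypothetical and only used for illustration; the values are copied from the output above.

```java
import java.util.Arrays;

/** Hypothetical helper: summarizes the two forks of the dsljson float run (100kb). */
final class ForkGap {

    // Measurement iterations in ops/s, copied from "Fork: 1 of 2" above.
    static final double[] FORK_1 = {
        3167.981, 3125.062, 3117.676, 3111.584, 3166.126,
        3115.660, 3174.774, 3145.705, 3114.003, 3110.038
    };

    // Measurement iterations in ops/s, copied from "Fork: 2 of 2" above.
    static final double[] FORK_2 = {
        4837.021, 4842.746, 4834.603, 4836.132, 4831.823,
        4831.908, 4832.504, 4843.621, 4834.916, 4839.871
    };

    static double mean(double[] xs) {
        return Arrays.stream(xs).average().orElse(Double.NaN);
    }

    // Sample standard deviation (n - 1 in the denominator).
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double sumOfSquares = Arrays.stream(xs).map(x -> (x - m) * (x - m)).sum();
        return Math.sqrt(sumOfSquares / (xs.length - 1));
    }

    public static void main(String[] args) {
        double m1 = mean(FORK_1);
        double m2 = mean(FORK_2);
        System.out.printf("fork 1: mean %.1f ops/s, stddev %.1f%n", m1, stdDev(FORK_1));
        System.out.printf("fork 2: mean %.1f ops/s, stddev %.1f%n", m2, stdDev(FORK_2));
        System.out.printf("fork 2 / fork 1 = %.2f%n", m2 / m1);
    }
}
```

On these numbers fork 2 averages about 4836.5 ops/s against about 3134.9 ops/s for fork 1, a ratio of roughly 1.54, while the spread inside each fork is far smaller than that gap.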

dsljson: deserialization of booleans, 1kb:

# Fork: 1 of 2
# Warmup Iteration   1: 406004.866 ops/s
# Warmup Iteration   2: 271143.927 ops/s
# Warmup Iteration   3: 268366.029 ops/s
# Warmup Iteration   4: 274393.448 ops/s
# Warmup Iteration   5: 266508.634 ops/s
Iteration   1: 267538.158 ops/s
Iteration   2: 277851.756 ops/s
Iteration   3: 277347.214 ops/s
Iteration   4: 272775.976 ops/s
Iteration   5: 270001.148 ops/s
Iteration   6: 273953.473 ops/s
Iteration   7: 273338.470 ops/s
Iteration   8: 280372.852 ops/s
Iteration   9: 264382.761 ops/s
Iteration  10: 264932.107 ops/s

# Run progress: 10.00% complete, ETA 00:12:09
# Fork: 2 of 2
# Warmup Iteration   1: 958135.054 ops/s
# Warmup Iteration   2: 1031335.205 ops/s
# Warmup Iteration   3: 1034440.489 ops/s
# Warmup Iteration   4: 1035388.796 ops/s
# Warmup Iteration   5: 1034207.076 ops/s
Iteration   1: 1033830.589 ops/s
Iteration   2: 1031242.576 ops/s
Iteration   3: 1033097.444 ops/s
Iteration   4: 1034703.830 ops/s
Iteration   5: 1033415.839 ops/s
Iteration   6: 1033738.399 ops/s
Iteration   7: 1032749.533 ops/s
Iteration   8: 1030372.684 ops/s
Iteration   9: 1033219.458 ops/s
Iteration  10: 1034653.749 ops/s
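
Since the repo contains many such logs, a small parser can help spot which benchmarks have diverging forks. The following is a rough sketch, assuming the console output has exactly the line format shown above ("# Fork: N of M" headers and "Iteration N: X ops/s" measurement lines). The class name ForkLogSummary and its methods are hypothetical and not part of JMH.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: groups JMH measurement iterations by fork. */
final class ForkLogSummary {

    private static final Pattern FORK = Pattern.compile("^# Fork: (\\d+) of \\d+");
    // Measurement lines start with "Iteration"; warmup lines start with "# Warmup" and do not match.
    private static final Pattern ITERATION = Pattern.compile("^Iteration\\s+\\d+: ([0-9.]+) ops/s");

    static Map<Integer, List<Double>> scoresByFork(String log) {
        Map<Integer, List<Double>> byFork = new LinkedHashMap<>();
        int currentFork = -1; // -1 means no fork header has been seen yet
        for (String raw : log.split("\\R")) {
            String line = raw.trim();
            Matcher fork = FORK.matcher(line);
            if (fork.find()) {
                currentFork = Integer.parseInt(fork.group(1));
                byFork.putIfAbsent(currentFork, new ArrayList<>());
                continue;
            }
            Matcher iteration = ITERATION.matcher(line);
            if (currentFork != -1 && iteration.find()) {
                byFork.get(currentFork).add(Double.parseDouble(iteration.group(1)));
            }
        }
        return byFork;
    }

    static double mean(List<Double> xs) {
        return xs.stream().mapToDouble(Double::doubleValue).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Short excerpt of the boolean run above; a real log would be read from a file.
        String log = String.join("\n",
            "# Fork: 1 of 2",
            "# Warmup Iteration   5: 266508.634 ops/s",
            "Iteration   1: 267538.158 ops/s",
            "Iteration   2: 277851.756 ops/s",
            "# Fork: 2 of 2",
            "# Warmup Iteration   5: 1034207.076 ops/s",
            "Iteration   1: 1033830.589 ops/s",
            "Iteration   2: 1031242.576 ops/s");
        scoresByFork(log).forEach((fork, scores) ->
            System.out.printf("fork %d: %d iterations, mean %.1f ops/s%n",
                fork, scores.size(), mean(scores)));
    }
}
```

Fed the full boolean log above, the per-fork means come out at roughly 272,000 ops/s and 1,033,000 ops/s, so fork 2 is about 3.8 times faster than fork 1.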

For more examples, please refer to the repo.

Could someone please guide me through the steps I can take to investigate this behaviour of the benchmark?



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
