Add additional gc benchmark with pickletools (#437) by pgdr · Pull Request #438 · python/pyperformance
Adds a benchmark reproducing the Python 3.14 garbage collector regression described in cpython/#140175.
This real-world case uses pickletools to demonstrate the performance issue.
Fixes #437.
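The timings below come from a workload of this shape; a minimal standalone sketch (the real benchmark uses pyperf, this is just for reproducing the numbers by hand):

```python
import pickle
import pickletools
import time

N = 1_000_000

# A large pickle: a dict mapping ints to short strings.
payload = pickle.dumps({i: f"ii{i:>07}" for i in range(N)}, protocol=4)

t0 = time.perf_counter()
pickletools.optimize(payload)  # the GC-sensitive step
print(f"optimize() took {time.perf_counter() - t0:.2f} sec")
```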
| Python version | Running time (sec) |
|---|---|
| 3.13 | 1.59 |
| 3.14 | 6.47 |
| 3.15a | 1.55 |
These tests (and the PR) have N = 1'000'000. The downside is that running the benchmark (with Python 3.14) takes almost 10 minutes.
I could reduce the size of the instance to lower the overall running time, but it seems like the garbage collector bug doesn't "kick in" until we reach a certain size.
With N = 100'000, the slowdown is not as noticeable:
| Python version | Running time (ms) |
|---|---|
| 3.13 | 162 |
| 3.14 | 197 |
| 3.15a | 154 |
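One way to check that the cyclic collector is actually responsible (my assumption, based on cpython/#140175) is to time optimize() with the collector switched off; if the regression lives in the GC, the gap between versions should shrink:

```python
import gc
import pickle
import pickletools
import time

payload = pickle.dumps({i: f"ii{i:>07}" for i in range(1_000_000)}, protocol=4)

for label, toggle in (("gc enabled", gc.enable), ("gc disabled", gc.disable)):
    toggle()
    t0 = time.perf_counter()
    pickletools.optimize(payload)
    print(f"{label}: {time.perf_counter() - t0:.2f} sec")
```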
@sergey-miryanov Thanks for the review. I have fixed all issues you pointed out.
@sergey-miryanov Something strange happens here. Even though I use the context manager (tempfile.TemporaryDirectory), the directory is occasionally left uncleaned when I kill pyperformance.
I can't reproduce this behavior when running without pyperf, though, so it might be related to the way pyperf sets up its (parallel?) runners.
It sounds like a bug, but I can't tell where.
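For what it's worth, TemporaryDirectory only cleans up in its __exit__, so a hard kill of a worker process would explain the leftovers; a minimal demonstration (a standalone sketch of one plausible cause, POSIX-only):

```python
import os
import signal
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    print("created", tmp)
    # Simulate the process being killed hard: SIGKILL cannot be caught,
    # so __exit__ (and with it the cleanup) never runs.
    os.kill(os.getpid(), signal.SIGKILL)
# The directory printed above is left behind on disk.
```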
Code looks good to me.
@pgdr Thanks! It is up to pyperformance maintainers now.
> These tests (and the PR) have N = 1'000'000. The downside is that running the benchmark (with Python 3.14) takes almost 10 minutes.
Taking 10 minutes would be too long. However, it only takes about 6 seconds for me to run this on Python 3.14.0, on my hardware. Perhaps the 10 minutes is for when N = 10e6? The regression I see from 3.13 to 3.14 with N = 1e6 seems large enough (1.5 seconds vs 6 seconds, roughly).
Nice work on this benchmark. I think it's good because optimize() is doing some meaningful work, unlike some other synthetic benchmarks. In addition to showing this regression in the GC, I would expect this benchmark to catch other kinds of performance regressions.
Small suggestion: it would be simpler to use io.BytesIO() rather than using real files in a temporary folder. I don't think that affects the usefulness of the benchmark, since we are not really testing real file IO speed. Something like this:
```python
import pickle
import pickletools

def setup(fp, N):
    # Pickle a dict of N small strings into the file-like object.
    x = {}
    for i in range(1, N):
        x[i] = f"ii{i:>07}"
    pickle.dump(x, fp, protocol=4)

def run(fp):
    p = fp.read()
    s = pickletools.optimize(p)
```
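Wired up with io.BytesIO, that would look roughly like this (the glue below is mine, not part of the suggestion); note the seek(0) so run() reads back what setup() just wrote:

```python
import io

buf = io.BytesIO()
setup(buf, 1_000_000)  # pickle the dict into the in-memory buffer
buf.seek(0)            # rewind: run() reads from the current position
run(buf)
```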
You could use dumps() as well and do away with the file.
@nascheme Thanks a lot, that saved a whole bunch of complexity. I'm running some tests, and then I'll fix it. Something like this:
```python
import pickle
import pickletools

import pyperf


def setup(N: int) -> bytes:
    x = {i: f"ii{i:>07}" for i in range(N)}
    return pickle.dumps(x, protocol=4)


def run(p: bytes) -> None:
    pickletools.optimize(p)


if __name__ == "__main__":
    runner = pyperf.Runner()
    runner.metadata["description"] = "Pickletools optimize"
    N = 100_000
    payload = setup(N)
    runner.bench_func("pickle_opt", run, payload)
```