bpo-41486: zlib uses an UINT32_MAX sliding window for the output buffer · Pull Request #26143 · python/cpython
@gpshead
Sorry to bother you again.
Could you review this PR? This should be the last revision of the blocks output buffer.
See this comment for details.
https://bugs.python.org/issue41486#msg393715
I spent half a month polishing this patch, mainly trying out various code styles.
The code has been tested with this script:
import zlib, time, random

_1G = 1024*1024*1024
UINT32_MAX = 0xFFFF_FFFF

def test(DATA_SIZE, INIT_BUFF_SIZES):
    b = random.choice((b'a', b'b', b'c', b'd'))
    raw_dat = b * DATA_SIZE

    t1=time.perf_counter()
    compressed_dat = zlib.compress(raw_dat, 1)
    t2=time.perf_counter()
    del raw_dat
    print(f'compressed size: {len(compressed_dat)}, time: {t2-t1:.5f}')

    for init_size in INIT_BUFF_SIZES:
        t1=time.perf_counter()
        decompressed_dat = zlib.decompress(compressed_dat, bufsize=init_size)
        t2=time.perf_counter()

        assert len(decompressed_dat) == DATA_SIZE
        assert decompressed_dat.count(b) == DATA_SIZE
        del decompressed_dat

        print(f'data size: {DATA_SIZE:>8}, init buff size: {init_size:>8}, time: {t2-t1:.5f}')
    print()

SIZE = _1G
test(SIZE, (100, SIZE, SIZE+1, UINT32_MAX, UINT32_MAX+1))

SIZE = UINT32_MAX
test(SIZE, (100, SIZE, SIZE+1, 2*UINT32_MAX+1))

SIZE = 10*_1G
test(SIZE, (_1G, SIZE-1, SIZE, SIZE+1, UINT32_MAX, UINT32_MAX+1,
            2*UINT32_MAX+1, 3*UINT32_MAX+1, 4*UINT32_MAX+1))

SIZE = 20*_1G
test(SIZE, (3*_1G, SIZE-1, SIZE, SIZE+1, UINT32_MAX, UINT32_MAX+1,
            2*UINT32_MAX+1, 3*UINT32_MAX+1, 4*UINT32_MAX+1,
            5*UINT32_MAX+1, 6*UINT32_MAX+1))
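The script above needs tens of GiB of memory to run. For a quick sanity check on a small machine, the same idea can be sketched at a much smaller scale. This is only an illustration: `roundtrip_ok` is a hypothetical helper name, the sizes are arbitrary, and it does not reach the `UINT32_MAX` boundaries that the full script is designed to exercise.

```python
import zlib

def roundtrip_ok(data_size, init_buff_sizes, fill=b'a'):
    """Return True if decompression restores the data for every initial buffer size.

    Hypothetical helper for illustration; not part of the PR.
    """
    raw = fill * data_size
    compressed = zlib.compress(raw, 1)
    for init_size in init_buff_sizes:
        # bufsize is only the initial output buffer size; zlib grows it as needed.
        out = zlib.decompress(compressed, bufsize=init_size)
        if out != raw:
            return False
    return True

SIZE = 1024 * 1024  # 1 MiB, small enough to run anywhere
print(roundtrip_ok(SIZE, (1, 100, SIZE - 1, SIZE, SIZE + 1, 2 * SIZE + 1)))
```

The initial sizes are chosen around the data size (just below, equal, just above) to mirror how the full script probes the buffer-growth boundaries.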