This speeds up pickling large bytes objects.
$ ./python -m timeit -s 'import pickle; a = [bytes([i%256])*1000000 for i in range(256)]' 'with open("/dev/null", "wb") as f: pickle._dump(a, f)'
Unpatched: 10 loops, best of 5: 20.7 msec per loop
Patched: 200 loops, best of 5: 1.12 msec per loop
But it slows down pickling of bytes objects that are only slightly longer than 256 bytes, by up to 40%.
$ ./python -m timeit -s 'import pickle; a = [bytes([i%256])*1000 for i in range(25600)]' 'with open("/dev/null", "wb") as f: pickle._dump(a, f)'
Unpatched: 5 loops, best of 5: 77.8 msec per loop
Patched: 2 loops, best of 5: 98.5 msec per loop
$ ./python -m timeit -s 'import pickle; a = [bytes([i%256])*256 for i in range(100000)]' 'with open("/dev/null", "wb") as f: pickle._dump(a, f)'
Unpatched: 1 loop, best of 5: 278 msec per loop
Patched: 1 loop, best of 5: 382 msec per loop
Compare with:
$ ./python -m timeit -s 'import pickle; a = [bytes([i%256])*255 for i in range(100000)]' 'with open("/dev/null", "wb") as f: pickle._dump(a, f)'
Unpatched: 1 loop, best of 5: 277 msec per loop
Patched: 1 loop, best of 5: 273 msec per loop
I think the code should be optimized to reduce the overhead of _write_many().
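The 255-vs-256 crossover above suggests the tradeoff at play: payloads at or above some cutoff bypass the in-memory buffer and get written with a separate call, which wins for megabyte-sized objects but adds per-call overhead for many mid-sized ones. A minimal sketch of that tradeoff (the `THRESHOLD` constant and `Writer` class are hypothetical illustrations, not the actual pickle internals):

```python
import io

THRESHOLD = 256  # hypothetical cutoff, mirroring the 255/256 crossover above

class Writer:
    """Buffer small payloads; flush and write large ones directly."""
    def __init__(self, file):
        self.file = file
        self.buffer = io.BytesIO()

    def write(self, data):
        if len(data) >= THRESHOLD:
            # Direct write avoids copying large data into the buffer,
            # but costs an extra file.write() call per payload -- cheap
            # for a few huge objects, costly for many mid-sized ones.
            self.flush()
            self.file.write(data)
        else:
            self.buffer.write(data)

    def flush(self):
        if self.buffer.tell():
            self.file.write(self.buffer.getvalue())
            self.buffer = io.BytesIO()

out = io.BytesIO()
w = Writer(out)
w.write(b"a" * 10)    # small: buffered
w.write(b"b" * 1000)  # large: flushes the buffer, then writes directly
w.flush()
assert out.getvalue() == b"a" * 10 + b"b" * 1000
```

With this shape, the 10^5 objects of ~256 bytes in the benchmark each trigger a flush plus a direct write, which is where the extra per-call overhead would come from.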