Issue34128
Created on 2018-07-16 18:55 by Martin Bammer, last changed 2022-04-11 14:59 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| pickle_gil.patch | pitrou, 2018-07-17 13:48 | |
| pickle_gil.py | pitrou, 2018-07-17 13:49 | |
| Messages (12) | |||
|---|---|---|---|
| msg321755 - (view) | Author: Martin Bammer (Martin Bammer) | Date: 2018-07-16 18:55 | |
Hi, the old and slow Python implementation of pickle didn't block background threads, but the newer C implementation blocks other threads while dump/load is running. Wouldn't it be possible to allow other threads to run during this time? In particular, couldn't load/loads release the GIL, since the Python objects are not available to Python code until these functions have finished? Regards, Martin |
|||
| msg321764 - (view) | Author: (ppperry) | Date: 2018-07-16 19:58 | |
Um, something doesn't make sense about this. The Python implementation of pickle never released the GIL (it can't, by definition: it's written in Python). Releasing the GIL in the C implementation wouldn't make sense either, as the pickle API involves calls into Python everywhere (for example, `__reduce__`). |
|||
| msg321805 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2018-07-17 07:48 | |
This is about releasing the GIL periodically to allow other threads to run, as Python already does in its main interpreter loop. |
|||
| msg321806 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-07-17 08:00 | |
A workaround is writing Python wrappers for IO:

    import pickle

    # Python-level wrappers: each I/O call made by pickle then runs Python
    # bytecode, where the interpreter can switch to another thread.
    class Writer:
        def __init__(self, file):
            self.file = file
        def write(self, data):
            return self.file.write(data)

    class Reader:
        def __init__(self, file):
            self.file = file
        def read(self, size=-1):
            return self.file.read(size)
        def readline(self, size=-1):
            return self.file.readline(size)
        def peek(self, size=-1):
            return self.file.peek(size)

    def mydump(obj, file, *args, **kwargs):
        return pickle.dump(obj, Writer(file), *args, **kwargs)

    def myload(file, *args, **kwargs):
        return pickle.load(Reader(file), *args, **kwargs)
|
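One way to see the effect of such a wrapper is to count how often a background thread gets to run while a large object is being dumped. The sketch below is illustrative only: `Writer` repeats the wrapper from the workaround, while `count_ticks`, `dump_with_background_thread`, the payload size, and the use of `io.BytesIO` are assumptions made for the example. The tick counts depend on the machine and the Python version, so no particular numbers should be expected.

```python
import io
import pickle
import threading
import time


class Writer:
    """Python-level file wrapper, as in the workaround."""

    def __init__(self, file):
        self.file = file

    def write(self, data):
        return self.file.write(data)


def count_ticks(stop, ticks):
    # Hypothetical helper: counts how often this thread gets scheduled.
    while not stop.is_set():
        ticks[0] += 1
        time.sleep(0.001)


def dump_with_background_thread(obj, wrap):
    """Dump obj while a background thread runs; return (payload, ticks)."""
    stop = threading.Event()
    ticks = [0]
    worker = threading.Thread(target=count_ticks, args=(stop, ticks))
    worker.start()
    buf = io.BytesIO()
    try:
        pickle.dump(obj, Writer(buf) if wrap else buf)
    finally:
        stop.set()
        worker.join()
    return buf.getvalue(), ticks[0]


data = list(map(float, range(500_000)))
plain, plain_ticks = dump_with_background_thread(data, wrap=False)
wrapped, wrapped_ticks = dump_with_background_thread(data, wrap=True)
print("ticks without wrapper:", plain_ticks)
print("ticks with wrapper:", wrapped_ticks)
```

Whatever the tick counts turn out to be, both payloads unpickle to the same object, so the wrapper only changes scheduling behaviour, not the result.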
|||
| msg321821 - (view) | Author: Martin Bammer (Martin Bammer) | Date: 2018-07-17 12:56 | |
Maybe an optional parameter with the desired interval would be a good idea, so that the coder can decide whether they want that feature and which interval their application needs. Otherwise it is hard to define a specific interval that fits everyone. |
|||
| msg321823 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2018-07-17 13:21 | |
The right way to do this is not to pass a timeout parameter but to check for GIL interrupts as done in the main bytecode evaluation loop. |
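For context, the interval after which the interpreter asks the running thread to give up the GIL is already an interpreter-wide setting, exposed through `sys.getswitchinterval()` and `sys.setswitchinterval()`. If the pickle code honoured the same requests, this existing setting, rather than a new per-call parameter, would presumably govern how often other threads get to run. A minimal sketch of reading and adjusting it (the 1 ms value is an arbitrary choice for illustration, not a recommendation):

```python
import sys

# Interval, in seconds, after which the interpreter requests a thread switch.
default_interval = sys.getswitchinterval()
print("current switch interval:", default_interval)

# Request more frequent switches for the whole interpreter.
sys.setswitchinterval(0.001)
new_interval = sys.getswitchinterval()
print("new switch interval:", new_interval)

# Restore the previous value.
sys.setswitchinterval(default_interval)
```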
|||
| msg321826 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2018-07-17 13:48 | |
Attaching proof-of-concept patch. |
|||
| msg321827 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2018-07-17 13:49 | |
Attaching demonstration script. |
|||
| msg321830 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2018-07-17 13:54 | |
(as the demo script shows, there is no detectable slowdown) |
|||
| msg321835 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-07-17 15:02 | |
The demo script shows around 8% slowdown for me with

    data = list(map(float, range(N)))
|
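A measurement of this kind can be approximated with `timeit`. The sketch below is not the attached demo script; the value of `N`, the repeat counts, and the helper name `time_dumps` are assumptions for illustration. To compare, it would have to be run on both a patched and an unpatched build.

```python
import pickle
import timeit

N = 100_000  # assumed size; the value used in the thread is not stated
data = list(map(float, range(N)))


def time_dumps(obj, repeat=5, number=10):
    """Return the best per-call time in seconds for pickle.dumps(obj)."""
    timings = timeit.repeat(lambda: pickle.dumps(obj), repeat=repeat, number=number)
    return min(timings) / number


best = time_dumps(data)
print(f"pickle.dumps of {N} floats: {best * 1e3:.2f} ms per call")
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the difference being measured is only a few percent.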
|||
| msg321846 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2018-07-17 17:49 | |
Interesting, which kind of computer / system / compiler are you on? |
|||
| msg321847 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) | Date: 2018-07-17 18:14 | |
CPU = Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz
Ubuntu 18.04
Linux 4.15.0 x86_64
gcc 7.3.0

Performing the check in save() can have non-negligible overhead (especially after implementing the issue34141 optimization). The overhead can be reduced by performing the check when flushing a frame (in protocol 4) or when flushing the buffer to the file, or after writing a significant number of bytes into the buffer. |
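The last suggestion, yielding only after a significant number of bytes has been written, can be imitated at the Python level. This is a rough sketch, not the C patch: `YieldingWriter`, the 64 KiB threshold, and the use of `time.sleep(0)` as the yield point are all assumptions made for illustration.

```python
import io
import pickle
import time


class YieldingWriter:
    """Hypothetical wrapper that yields to other threads every `threshold` bytes."""

    def __init__(self, file, threshold=1 << 20):
        self.file = file
        self.threshold = threshold
        self.pending = 0  # bytes written since the last yield
        self.yields = 0   # number of times we yielded

    def write(self, data):
        written = self.file.write(data)
        self.pending += len(data)
        if self.pending >= self.threshold:
            self.pending = 0
            self.yields += 1
            time.sleep(0)  # releases the GIL briefly so other threads can run
        return written


buf = io.BytesIO()
writer = YieldingWriter(buf, threshold=64 * 1024)
data = list(map(float, range(100_000)))
pickle.dump(data, writer)
print("bytes written:", len(buf.getvalue()), "yields:", writer.yields)
```

Tying the yield to the amount of data written keeps the bookkeeping out of the per-object path, which is the overhead concern raised about checking in save().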
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:03 | admin | set | github: 78309 |
| 2019-05-10 17:46:26 | pitrou | set | nosy: + pierreglaser |
| 2018-07-17 18:14:10 | serhiy.storchaka | set | messages: + msg321847 |
| 2018-07-17 17:58:55 | ppperry | set | nosy: - ppperry |
| 2018-07-17 17:49:47 | pitrou | set | messages: + msg321846 |
| 2018-07-17 15:02:49 | serhiy.storchaka | set | messages: + msg321835 |
| 2018-07-17 13:54:50 | pitrou | set | messages: + msg321830 |
| 2018-07-17 13:49:18 | pitrou | set | files: + pickle_gil.py; messages: + msg321827 |
| 2018-07-17 13:48:46 | pitrou | set | files: + pickle_gil.patch; keywords: + patch; messages: + msg321826 |
| 2018-07-17 13:21:53 | pitrou | set | messages: + msg321823 |
| 2018-07-17 12:56:29 | Martin Bammer | set | messages: + msg321821 |
| 2018-07-17 12:42:51 | ppperry | set | components: + Library (Lib); title: Do not block threads when pickle/unpickle -> Release GIL periodically in _pickle module |
| 2018-07-17 08:00:53 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg321806 |
| 2018-07-17 07:48:53 | pitrou | set | nosy: + pitrou; messages: + msg321805 |
| 2018-07-16 19:58:11 | ppperry | set | nosy: + ppperry; messages: + msg321764 |
| 2018-07-16 18:55:37 | Martin Bammer | create | |
