I wrote a script to monitor the memory when dumping 2GB of data with python master (C pickler and Python pickler):
```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.141s
=> peak memory usage: 4.014 GB
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 5.046s
=> peak memory usage: 5.955 GB
```
This is using protocol 4. Note that the C pickler is only making 1 useless memory copy instead of 2 for the Python pickler (one for the concatenation and the other because of the framing mechanism of protocol 4).
Here the output with the Python pickler fixed in python/cpython#4353:
```
(py37) ogrisel@ici:~/code/cpython$ python ~/tmp/large_pickle_dump.py --use-pypickle
Allocating source data...
=> peak memory usage: 2.014 GB
Dumping to disk...
done in 6.138s
=> peak memory usage: 2.014 GB
```
Basically the 2 spurious memory copies of the Python pickler with protocol 4 are gone.
Here is the script: https://gist.github.com/ogrisel/0e7b3282c84ae4a581f3b9ec1d84b45a |