[Python-Dev] Unpickling memory usage problem, and a proposed solution
Dan Gindikin
dgindikin at gmail.com
Fri Apr 23 23:11:34 CEST 2010
More information about the Python-Dev mailing list
- Previous message: [Python-Dev] Unpickling memory usage problem, and a proposed solution
- Next message: [Python-Dev] Unpickling memory usage problem, and a proposed solution
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Alexandre Vassalotti <alexandre <at> peadrop.com> writes:

> On Fri, Apr 23, 2010 at 3:57 PM, Dan Gindikin <dgindikin <at> gmail.com> wrote:
>> This wouldn't help our use case: your code needs the entire pickle
>> stream to be in memory, which in our case would be about 475mb, on
>> top of the 300mb+ data structures that generated the pickle stream.
>
> In that case, the best we could do is a two-pass algorithm to remove
> the unused PUTs. That won't be efficient, but it will satisfy the
> memory constraint.

That is what I'm doing for us right now.

> Another solution is to not generate the PUTs at all by setting the
> 'fast' attribute on Pickler. But that won't work if you have a
> recursive structure, or have code that requires the identity of
> objects to be preserved.

We definitely have some cross links amongst the objects, so we need PUTs.

> By the way, it is weird that the total memory usage of the data
> structure is smaller than the size of its respective pickle stream.
> What pickle protocol are you using?

It's the highest protocol, but we have a bunch of extension types that get expanded into Python tuples for pickling.
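A minimal sketch of the two approaches discussed above, using the standard library: `pickletools.optimize()` strips PUT opcodes with no matching GET (note it holds the whole stream in memory, which is exactly the constraint raised here), while the Pickler's `fast` attribute suppresses PUTs entirely at the cost of losing object identity for shared sub-objects:

```python
import io
import pickle
import pickletools

# A shared sub-object: the PUT/GET opcodes (the pickle memo) are what
# preserve its identity across the stream.
shared = [1, 2, 3]
data = [shared, shared]

# Normal pickling: memoization emits PUTs, identity survives a round trip.
normal = pickle.dumps(data, protocol=pickle.HIGHEST_PROTOCOL)
restored = pickle.loads(normal)
assert restored[0] is restored[1]

# pickletools.optimize() removes PUTs that no GET references, but it
# needs the entire pickle stream in memory to do so.
optimized = pickletools.optimize(normal)
assert pickle.loads(optimized) == data

# Fast mode disables the memo, so no PUTs are generated at all; shared
# objects get duplicated, and recursive structures fail to pickle.
buf = io.BytesIO()
p = pickle.Pickler(buf, protocol=pickle.HIGHEST_PROTOCOL)
p.fast = True
p.dump(data)
fast_restored = pickle.loads(buf.getvalue())
assert fast_restored[0] == fast_restored[1]
assert fast_restored[0] is not fast_restored[1]  # identity lost
```

As the thread notes, fast mode is only safe when the data has no cross links or cycles; with shared objects (as here), the unpickled copies are equal but no longer identical.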