Efficient python 2-d arrays?
Dan Stromberg
drsalists at gmail.com
Mon Jan 17 17:54:20 EST 2011
More information about the Python-list mailing list
Mon Jan 17 17:54:20 EST 2011
- Previous message (by thread): Efficient python 2-d arrays?
- Next message (by thread): Efficient python 2-d arrays?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, Jan 17, 2011 at 2:20 PM, Jake Biesinger <jake.biesinger at gmail.com> wrote: > Hi all, > > Using numpy, I can create large 2-dimensional arrays quite easily. >>>> import numpy >>>> mylist = numpy.zeros((100000000,2), dtype=numpy.int32) > > Unfortunately, my target audience may not have numpy so I'd prefer not to use it. > > Similarly, a list-of-tuples using standard python syntax. >>>> mylist = [(0,0) for i in xrange(100000000) > > but this method uses way too much memory (>4GB for 100 million items, compared to 1.5GB for numpy method). > > Since I want to keep the two elements together during a sort, I *can't* use array.array. >>>> mylist = [array.array('i',xrange(100000000)), array.array('i',xrange(100000000))] > > If I knew the size in advance, I could use ctypes arrays. >>>> from ctypes import * >>>> class myStruct(Structure): >>>> _fields_ = [('x',c_int),('y',c_int)] >>>> mylist_type = myStruct * 100000000 >>>> mylist = mylist_type() > > but I don't know that size (and it can vary between 1 million-200 million), so preallocating doesn't seem to be an option. > > Is there a python standard library way of creating *efficient* 2-dimensional lists/arrays, still allowing me to sort and append? > > Thanks! > -- > http://mail.python.org/mailman/listinfo/python-list > I recently had need of a two dimensional array also, but I only needed small ones, so I just used a dictionary indexed by tuples. If you need to sort a row as an aggregate type, it seems that you probably either want to: 1) Use a list of lists, where the outer list is the easier one to sort by - this is analogous to the array of pointers in C - sorting by the outer would want to compare "elements" by looking at the 0th inner, then 1st inner, etc. So if you can organize things this way, things might be pretty easy. 2) Use a list of instances, where the instances (rows) might just include a list 3) Use array.array with a custom sort so that you can move multiple things when a more common sort would move "one" (possibly an aggregate). If you want some sorting code to start from for #3, I have a variety of sorts written in Python (pure python or cython, your option, each automatically derived from the same m4 file for a given sort) at http://stromberg.dnsalias.org/svn/sorts/compare/trunk/ - however, even the cython timsort provided there doesn't perform quite as well as the standard library's timsort. HTH
- Previous message (by thread): Efficient python 2-d arrays?
- Next message (by thread): Efficient python 2-d arrays?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Python-list mailing list