[Python-Dev] RE: cloning iterators again
Raymond Hettinger
python at rcn.com
Sun Oct 26 18:17:35 EST 2003
More information about the Python-Dev mailing list
- Previous message: [Python-Dev] cloning iterators again
- Next message: [Python-Dev] RE: cloning iterators again
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
> The following is just so beautiful, I have to share it.

I have to say, it is a thing of beauty.

> I've been thinking about various implementations of Andrew Koenig's
> idea of "copyable iterator wrappers", which support a generalization
> of Raymond Hettinger's tee().

I've re-read some of the old email on the subject but didn't see what
this buys us that we don't already get with the current tee().

When I wrote tee(), I had considered implementing it as a multi-way
tee(it, n=2) so you could write a, b, c, d = tee(myiterable, 4).
Then, I racked my brain for use cases and found nothing that warranted:

* the additional memory consumption (the current implementation
  consumes only one pointer per element and it stores them in
  contiguous memory);

* the additional memory management utilization (the underlying
  list.pop and list.append have already been optimized to avoid
  excessive malloc/free calls);

* or the impact on cache performance (using contiguous memory means
  that consecutive pops are in the L1 cache at least 90% of the time,
  and using only one word per entry means that a long series of pops
  is less likely to blow everything else out of the cache).

With only two iterators, I can imagine use cases where the two
iterators track each other fairly closely.  But with multiple
iterators, one iterator typically lags far behind (meaning that
list(it) is the best solution) or they track within a fixed number of
elements of each other (meaning that windowing is the best solution).
The itertools examples section shows the pure Python code for
windowing.  AFAICT, that windowing code is unbeatable in terms of
speed and memory consumption (nearly all the time is spent forming
the result tuple).

> class Link(object):
>     """Singly-linked list of (state, value) pairs.
> . . .
>     __slots__ = ["state", "value", "next"]

One way to implement this is with a type which adds PyHEAD to the
space consumption for the three fields.
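For readers following along, the copyable-wrapper idea under discussion can be sketched in pure (modern) Python roughly as follows. The state conventions, the `copy()` method, and the `tee()` helper here are my reconstruction of the idea, not the exact code from the thread:

```python
class Link:
    """Singly-linked list of (state, value) pairs.

    state: 0 = value not fetched yet, 1 = value present,
           -1 = underlying iterator exhausted.
    """
    __slots__ = ["state", "value", "next"]

    def __init__(self):
        self.state = 0
        self.value = None
        self.next = None


class Wrapper:
    """Copyable iterator wrapper: clones share the underlying iterator
    and a chain of Links, so each element is fetched only once and is
    kept alive only while some clone can still reach it."""

    def __init__(self, it, link=None):
        self.it = it
        self.link = link if link is not None else Link()

    def __iter__(self):
        return self

    def copy(self):
        # A clone starts at the same Link and shares everything else.
        return Wrapper(self.it, self.link)

    def __next__(self):
        link = self.link
        if link is None:
            raise StopIteration
        if link.state == 0:
            # First clone to arrive here fetches the next element.
            try:
                link.value = next(self.it)
                link.state = 1
                link.next = Link()
            except StopIteration:
                link.state = -1
        if link.state < 0:
            self.link = None
            raise StopIteration
        self.link = link.next
        return link.value


def tee(iterable):
    """Return two independent iterators over iterable (sketch)."""
    w = iterable if isinstance(iterable, Wrapper) else Wrapper(iter(iterable))
    return w, w.copy()
```

Because clones of clones just share the same Link chain, nesting tee() this way adds no per-level buffering, which is the property being admired below.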
An alternate approach is to use PyMem directly and request space for
four fields (including a refcount field).

>             if state < 0:
>                 self.link = None
>                 raise value

Is it kosher to re-raise the exception long after something else may
have handled it and the execution context has long since disappeared?

> def test():
>     """A simple demonstration of the Wrapper class."""
>     import random
>     def gen():
>         for i in range(10):
>             yield i
>     it = gen()
>     a, b = tee(it)
>     b, c = tee(b)
>     c, d = tee(c)

This is very nice.  The current tee() increases memory consumption and
workload when nested like this.


Raymond Hettinger
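As an aside, the windowing idiom mentioned above is, in spirit, along these lines; this is a sketch modeled on the recipe in the itertools examples, not the verbatim code, and `window` and its parameters are my paraphrase:

```python
from itertools import islice

def window(iterable, n=2):
    """Yield a sliding window of width n over the iterable (a sketch
    in the spirit of the windowing recipe from the itertools examples)."""
    it = iter(iterable)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        # Drop the oldest element and append the newest; as noted
        # above, nearly all the work is forming this result tuple.
        result = result[1:] + (elem,)
        yield result
```

Only the current window of n elements is ever held, which is why Raymond argues it beats multi-way tee() when the iterators stay within a fixed distance of each other.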