why won't slicing lists raise IndexError?

Cameron Simpson cs at cskk.id.au
Mon Dec 4 17:02:03 EST 2017

On 04Dec2017 14:13, Jason Maldonis <jjmaldonis at gmail.com> wrote:
>And I'll be honest -- I like the implementation of the LazyList I wrote
>above. I think it's pretty logical, because it allows you to think about
>the lazy list like this:  "Treat the list like a normal list. If you run out
>of bounds, get more data, then treat the list like a normal list again."
>And I really like that clean logic.

Yes, it is very attractive. I've got a buffer class with that behaviour which 
is very useful for parsing data streams.

I think the salient difference between your LazyList's API and my buffer's API 
is that in mine, the space allocation is a distinct operation from the 
__getitem__.

My buffer class keeps an internal buffer of unconsumed data, and its 
__getitem__ indexes only that. It has two methods for "on demand": .extend(), 
which ensures that the internal buffer has at least n bytes, and .take(), which 
does a .extend and then returns the leading n bytes (trimming them from the 
internal buffer of course).

So if I need to inspect the buffer I do a .extend to ensure there's enough 
data, then I can use [] directly to look at it, or use .take to grab a 
chunk of known size.

The .extend and .take operations accept an optional "short_ok" boolean, default 
False.  When it is false, .extend and .take raise an exception if there isn't 
enough data available; when it is true, they may return with a short buffer 
containing whatever was available.

That covers your slice situation: short_ok=True gives Python-like slicing (you 
get whatever is there), while False (the default) does what you (and I) usually 
want: it raises an exception.

And it sidesteps Python's design decisions because it leaves the __getitem__ 
semantics unchanged. One ensures the required precondition (enough data) by 
calling .extend _first_.
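
To make the shape of that API concrete, here is a minimal sketch. It is not the 
real buffer class: the name SketchBuffer, the constructor taking an iterable of 
bytes chunks, and the choice of EOFError are all assumptions made for 
illustration. The point it shows is that __getitem__ only ever looks at data 
already buffered, while .extend and .take do the fetching.

```python
class SketchBuffer:
    """Illustrative only: a buffer fed from an iterable of bytes chunks."""

    def __init__(self, chunks):
        self._chunks = iter(chunks)  # hypothetical source of more data
        self._buf = b""              # unconsumed data
        self.offset = 0              # number of bytes consumed so far

    def __len__(self):
        return len(self._buf)

    def __getitem__(self, index):
        # Ordinary bytes indexing/slicing on what is buffered; never fetches.
        return self._buf[index]

    def extend(self, n, short_ok=False):
        """Ensure at least n bytes are buffered."""
        while len(self._buf) < n:
            try:
                chunk = next(self._chunks)
            except StopIteration:
                if short_ok:
                    return  # caller accepts a short buffer
                raise EOFError(
                    "wanted %d bytes, only %d available" % (n, len(self._buf))
                ) from None
            self._buf += chunk

    def take(self, n, short_ok=False):
        """Extend, then remove and return the leading n bytes."""
        self.extend(n, short_ok=short_ok)
        taken, self._buf = self._buf[:n], self._buf[n:]
        self.offset += len(taken)
        return taken


bfr = SketchBuffer([b"abc", b"defg"])
bfr.extend(5)                         # precondition: at least 5 bytes buffered
print(bfr[:5])                        # plain slicing on buffered data
print(bfr.take(4))                    # consume a chunk of known size
print(bfr.take(10, short_ok=True))    # short result allowed at end of input
```

With short_ok left at its default, the final call would raise instead of 
returning the three remaining bytes.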

Here's some example code parsing an ISO14496 Box header record, which has two 
4-byte values at the start. The code uses short_ok to probe for immediate 
end-of-input. But if there is data, it uses .extend and .take in default mode 
so that an _incomplete_ record raises an exception:

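  from struct import unpack

  def parse_box_header(bfr):
    ''' Decode a box header from the CornuCopyBuffer `bfr`.
        Return (box_header, new_buf, new_offset) or None at end of input.
    '''
    # probe for end of input: return None if no data remain
    bfr.extend(1, short_ok=True)
    if not bfr:
      return None
    # note start point
    offset0 = bfr.offset
    user_type = None
    # require the full 8 byte header; an incomplete record raises
    bfr.extend(8)
    # 4 byte big-endian unsigned box size, then the 4 byte box type
    box_size, = unpack('>L', bfr.take(4))
    box_type = bfr.take(4)
    # ... rest of the function omitted from this excerpt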

In summary, I think you should give your LazyList a .extend method to establish 
your code's precondition (enough elements) and then you can proceed with your 
slicing in surety that it will behave correctly.
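
As a rough sketch of that suggestion: the LazyList below is a hypothetical 
stand-in, not the class from earlier in the thread, and its iterator-based 
constructor, the short_ok parameter and the use of IndexError are assumptions. 
It shows .extend establishing the precondition, after which slicing is ordinary 
list slicing.

```python
from itertools import islice


class LazyList:
    """Illustrative stand-in: a list filled on request from an iterable."""

    def __init__(self, source):
        self._source = iter(source)  # hypothetical supplier of more items
        self._items = []             # items loaded so far

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        # Unchanged list semantics: slices never raise, indexes may.
        return self._items[index]

    def extend(self, n, short_ok=False):
        """Ensure at least n items are loaded."""
        missing = max(0, n - len(self._items))
        self._items.extend(islice(self._source, missing))
        if len(self._items) < n and not short_ok:
            raise IndexError(
                "wanted %d items, only %d available" % (n, len(self._items))
            )


lazy = LazyList(range(10))
lazy.extend(5)       # establish the precondition first
print(lazy[2:5])     # [2, 3, 4]
```

Calling lazy.extend(20) here would raise IndexError, since the source only 
holds ten items; passing short_ok=True would instead load what there is and 
leave the usual slice behaviour to truncate.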

Cheers,
Cameron Simpson <cs at cskk.id.au> (formerly cs at zip.com.au)
