[Python-ideas] Ideas for improving the struct module
Daniel Spitz
spitz.dan.l at gmail.com
Wed Jan 18 12:08:02 EST 2017
+1 on the idea of supporting variable-length strings with the length encoded in the preceding packed element!

Several months ago I was trying to write a parser and writer for PostgreSQL's COPY ... WITH BINARY format. I started out trying to implement it in pure Python using the struct module. Because the format contains variable-length strings encoded in precisely the way you mention, it was not possible to parse an entire row of data without invoking any Python-level logic. This made the implementation infeasibly slow. I had to switch to Cython to get it done fast enough (implementation is here: https://github.com/spitz-dan-l/postgres-binary-parser).

I believe that with this single change ($, or whatever format specifier one wishes to use), assuming it were implemented efficiently in C, I could have avoided Cython and gotten satisfactory performance from the struct module and Python/numpy's already-performant bytestring manipulation facilities.

-Dan Spitz

On Wed, Jan 18, 2017 at 5:32 AM Elizabeth Myers <elizabeth at interlinked.me> wrote:

> Hello,
>
> I've noticed a lot of binary protocols require variable-length
> bytestrings (with or without a null terminator), but it is not easy to
> unpack these in Python without first reading the desired length, or
> reading bytes until a null terminator is reached.
>
> I've noticed the netstruct library
> (https://github.com/stendec/netstruct) has a format specifier, $, which
> assumes the previous type to pack/unpack is the string's length. This is
> an interesting idea in and of itself, but it doesn't handle the
> null-terminated string case. I know $ is similar to Pascal strings, but
> sometimes you need more than 255 characters :p.
>
> For null-terminated strings, it may be simpler to have a specifier for
> those. I propose 0, but this point can be bikeshedded over endlessly if
> desired ;) (I thought about using n/N but they're taken :P).
>
> It's worth noting that (maybe one of?) Perl's equivalent to the struct
> module, whose name escapes me atm, has a module which can handle this
> case. I can't remember if it handled variable-length or zero-terminated
> strings though; maybe it did both. Perl is more or less my 10th
> language. :p
>
> This pain point is an annoyance imo, and fixing it (or something like
> it) would greatly simplify a lot of code. I'd be happy to take a look
> at implementing it if the idea is received sufficiently warmly.
>
> --
> Elizabeth
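
For readers following the thread, here is a minimal sketch of the Python-level logic that both messages describe as the pain point today: reading the length first and then slicing, or scanning for a null terminator. The helper names (unpack_length_prefixed, unpack_null_terminated), the "!H" length format, and the sample packet layout are assumptions made up for illustration; they are not part of the struct module, netstruct, or the proposal itself.

```python
import struct


def unpack_length_prefixed(buf, offset=0, length_fmt="!H"):
    """Read one length-prefixed bytestring from buf (bytes or bytearray).

    Hypothetical helper: length_fmt is a struct format for the length
    field. Returns (payload, next_offset).
    """
    (length,) = struct.unpack_from(length_fmt, buf, offset)
    start = offset + struct.calcsize(length_fmt)
    end = start + length
    if length < 0 or end > len(buf):
        raise ValueError("length field points outside the buffer")
    return bytes(buf[start:end]), end


def unpack_null_terminated(buf, offset=0):
    """Read bytes up to (not including) the next NUL byte.

    Hypothetical helper. Returns (payload, next_offset), where
    next_offset skips the terminator. Raises ValueError if no
    terminator is found.
    """
    end = buf.index(b"\x00", offset)
    return bytes(buf[offset:end]), end + 1


# Made-up packet: 16-bit length + "hello", then "world\0", then one byte.
packet = struct.pack("!H", 5) + b"hello" + b"world\x00" + struct.pack("!B", 7)

name, pos = unpack_length_prefixed(packet, 0, "!H")
tag, pos = unpack_null_terminated(packet, pos)
(flag,) = struct.unpack_from("!B", packet, pos)
```

Each variable-length field forces a return to the interpreter between struct calls, which is the per-field overhead described in the reply; a single format specifier handled in C would let one unpack call cover the whole record.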