Searching binary data
Tim Peters
tim_one at email.msn.com
Wed Feb 2 22:49:06 EST 2000
More information about the Python-list mailing list
[Darrell]
> Didn't have access to the internet today which forced me to have
> a creative thought of my own. Now to find out if I wasted my time.
>
> The problem is to find patterns in gobs of binary data.
> Treat it as a string you see something like this.
> MZ\220\000\003\000\000\000\004\000\000\000\377\377
>
> I found writing a re for patterns in that, a pain.
> What if I wanted r"[\000-\077]".
> It won't work because there are nulls in the result and re doesn't
> like that.

Actually, that works fine (if it didn't, what you just told us you did is
not what you actually did). You can't pass a pattern with an actual null to
re (minor flaw of the implementation, IMO), but the raw string
r"[\000-\077]" doesn't contain an actual null: it contains the 4-character
escape sequence "\000", which re converts to a null.

>>> p = re.compile(r"[\000-\001]")
>>> p.match(chr(0)).span(0)
(0, 1)
>>> p.match(chr(1)).span(0)
(0, 1)
>>> print p.match(chr(2))
None
>>>

> Not to mention all this octal to hex is annoying

Hex escapes work fine too: r"[\x00-\x3f]" means the same as the above.

> an who knows what trouble Nulls will be.

I do: none <wink>. Really, nulls aren't special at all to re. The glitch
in *passing* an actual null in the pattern to re has to do with the engine's
C interface, which uses a char* for the pattern without an additional count
argument. That's as deep as this one goes.

> So I wrote an extension to covert everything to hex in the
> following format.
> 4d5aff000300000004000000ffff0000ff
> Now I can treat the whole thing as a string :)

That's fine too.
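For anyone replaying this on a current interpreter, here is a rough sketch of the same points in Python 3, where binary data is a bytes object and the pattern has to be bytes too. The sample buffer is the one quoted above; the helper name low_byte_offsets and the variable names are made up for illustration and are not from the original messages. As far as I know, the char* glitch described above belonged to the engine of that era, and a present-day re also accepts a pattern holding an actual null byte, so treat the last pattern as an assumption about modern Python rather than something the post claims.

```python
import binascii
import re

# Sample buffer from the quoted message (octal escapes inside a bytes literal).
data = b"MZ\220\000\003\000\000\000\004\000\000\000\377\377"

# Raw bytes patterns: each holds the escape text (e.g. the four characters
# \000), and re itself turns that into the byte value when compiling.
low_octal = re.compile(rb"[\000-\077]")
low_hex = re.compile(rb"[\x00-\x3f]")

# Non-raw pattern: this one really does contain a single null byte.
# Assumption: a modern Python 3 re accepts it.
literal_null = re.compile(b"\x00")


def low_byte_offsets(pattern, buf):
    """Return the offsets in buf where pattern matches (hypothetical helper)."""
    return [m.start() for m in pattern.finditer(buf)]


octal_hits = low_byte_offsets(low_octal, data)
hex_hits = low_byte_offsets(low_hex, data)
first_null = literal_null.search(data).start()

# The hex-dump route: no extension needed, binascii.hexlify is in the stdlib.
hex_text = binascii.hexlify(data).decode("ascii")

print(octal_hits == hex_hits)
print(first_null)
print(hex_text)
```

One thing to watch with the hex-dump route: a match in the hex text only lines up with a whole byte when it starts at an even offset, so a search there may need an extra alignment check that a search over the raw bytes does not.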