Python 2.2 re bug?
Travis Shirk
travis at puddy.lan.kerrgulch.net
Sun Aug 25 16:31:51 EDT 2002
> On Sat, 24 Aug 2002 23:46:10 +0200, Travis Shirk wrote:
>>
>> Using Python 1.5.2:
>> import re;
>> data = "\xFF\x00\xE0\xD3\xD3\xE4\x95\xFF\x00\x00\x11\xFF\x00\xF5"
>> data1 = re.compile(r"\xFF\x00([\xE0-\xFF])").sub(r"\xFF\1", data); print data1
>> '\377\340\323\323\344\225\377\000\000\021\377\365'
>>
>>
>> This output is exactly what I expect, but now see what happens in 2.2.1:
>> import re;
>> data = "\xFF\x00\xE0\xD3\xD3\xE4\x95\xFF\x00\x00\x11\xFF\x00\xF5"
>> data1 = re.compile(r"\xFF\x00([\xE0-\xFF])").sub(r"\xFF\1", data); print data1
>> '\\xFF\xe0\xd3\xd3\xe4\x95\xff\x00\x00\x11\\xFF\xf5'
>>
>>

Pedro Rodriguez <pedro_rodriguez at club-internet.fr> wrote:
> I had some issue about this topic and I wonder if your problem does not
> come like me from the raw string stuff. Here goes my reasoning FWIW.
>
> When you write something like :
>     r"\x00"
> this actually means :
>     ['\\', 'x', '0', '0']    (use list(r"\x00"))
> but
>     "\x00"
> means
>     ['\x00']    (using list("\x00"))
>
> By using raw string you prevent the python parser from replacing the
> proper character in the string. And the 're' module isn't supposed to do
> this kind of substitution, it has its own things to do with '\'.
>
> So you should probably fix your expression by - carefully - replacing :
>     data1 = re.compile(r"...").sub(r"...")
> with
>     data1 = re.compile("...").sub("...")
> in both 1.5.2 and 2.x version.

Okay, to clarify: 1.5.2 works for me as expected. I need r"" for both the
compile and sub arguments because both are regular expressions. If I make
both regular strings I don't get the duplicated \\ characters, but then
the \1 in the sub argument no longer refers to group one of the compiled
regex (not that I would expect it to).

The bottom line is that the behavior differs between 1.5.2 and 2.2.1, and
unless there is a workaround, 2.2.1 seems broken.

Travis

-- 
Travis Shirk <travis at pobox dot com>
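
For anyone reading this thread later, here is a minimal sketch of the raw-string distinction discussed above and two possible ways to write the substitution. It assumes a current Python 3 interpreter and bytes objects, which is not what the thread used (1.5.2 and 2.2.1), and it has not been checked against 2.2.1. The helper name drop_null is made up for illustration. Note that "\1" in a non-raw literal is the single character with code 1, which is why the first option doubles the backslash.

```python
import re

# Raw vs. non-raw literals, as in Pedro's explanation.
raw_chars = list(r"\x00")     # four characters: backslash, 'x', '0', '0'
cooked_chars = list("\x00")   # one character: NUL

# Assumption: bytes are used because Python 3 separates text from binary data.
data = b"\xFF\x00\xE0\xD3\xD3\xE4\x95\xFF\x00\x00\x11\xFF\x00\xF5"
pattern = re.compile(rb"\xFF\x00([\xE0-\xFF])")

# Option 1: non-raw replacement. The Python parser builds the 0xFF byte,
# and the doubled backslash leaves a literal \1 for re to treat as group 1.
result_template = pattern.sub(b"\xFF\\1", data)

# Option 2: a callable replacement (hypothetical helper), which avoids
# escape handling in the replacement string altogether.
def drop_null(match):
    return b"\xFF" + match.group(1)

result_callable = pattern.sub(drop_null, data)

# Same bytes as the 1.5.2 output quoted above, written in hex.
expected = b"\xFF\xE0\xD3\xD3\xE4\x95\xFF\x00\x00\x11\xFF\xF5"
print(result_template == expected, result_callable == expected)
```

Both options keep the pattern itself as a raw literal, so the regex engine still interprets the \x escapes inside the character class.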