[Python-Dev] Where to report performance of re? [was: (no subject)]
Stephen J. Turnbull
turnbull.stephen.fw at u.tsukuba.ac.jp
Fri Dec 29 05:25:05 EST 2017
More information about the Python-Dev mailing list
Fri Dec 29 05:25:05 EST 2017
- Previous message (by thread): [Python-Dev] (no subject)
- Next message (by thread): [Python-Dev] Heap allocate type structs in native extension modules?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Franklin? Lee writes: > On Tue, Dec 26, 2017 at 2:01 AM, Yogev Hendel <yogev at intsights.com> wrote: > > > > I don't know if this is the right place to put this, > > but I've found the following lines of code results in an incredibly long > > processing time. > (I think the correct place is python-list. python-dev is primarily for > the developers of Python itself. I read this as a report of a performance problem, perhaps a regression. IMO YMMV, Python-list would be appropriate only if the reporter was already aware that some regular expressions are expected to be disastrously non-performant on some target strings, and wanted help dealing with that. That doesn't seem to be the case (and your response also presumes ignorance of the performance properties of re). > Bug reports and feature requests belong in > https://bugs.python.org/ (where your post could also have gone).) If you really meant that, you should have avoided rewarding him with a wordy explanation. ;-) N.B. I've bookmarked yours, it's excellent! Thank you very much! I think even in this case an experienced community member probably should report on the tracker, though. If all they got was, "WONTFIX: that's how the algorithm works" there, *then* they go to python-list. > (assuming P != NP): Perl allows (introduced?) backreferences, which I've been using them in Emacsen since 1987 or so. > are NP-hard[1]. Perl also added some other features which complicate > things, but backreferences are enough. > > The user-level solution is to understand how regexes are executed, and > to work around it. This is the Pythonic approach<0.5 wink/>, in the sense that we haven't gone to the trouble of trying to improve the algorithm yet. > 2. In theory, we can use the textbook algorithm when possible, and the > backtracking algorithm when necessary. However, the textbook version > won't necessarily be faster, and may take more time to create, so > there's a tradeoff here. There may be a question of the difficulty of maintenance, as well. The linear time algorithms are somewhat more delicate, and we're still occasionally discussing what semantics a regular expression should have (null matches), let alone whether they're implemented correctly.
- Previous message (by thread): [Python-Dev] (no subject)
- Next message (by thread): [Python-Dev] Heap allocate type structs in native extension modules?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Python-Dev mailing list