Issue33566
Created on 2018-05-18 08:06 by mamamiaibm, last changed 2022-04-11 14:59 by admin. This issue is now closed.
| Messages (6) | |||
|---|---|---|---|
| msg317013 - (view) | Author: Min (mamamiaibm) | Date: 2018-05-18 08:06 | |
Firstly, I wrote something like this:
patn = r"\bROW\s*\((\d+|\*)\)(.|\s)*?\)"
newlines = re.sub(patn, "\nYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY\n", newlines)
But if the file (or string) ends without the expected ")", the code deadlocks there: no progress, no exception, and no exit.
Then I changed it to:
patn = r"\bROW\s*\((\d+|\*)\)(.|\s)*?(\)|$)"
newlines = re.sub(patn, "\nYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY\n", newlines)
to enforce the end-of-file rule, and then everything was OK.
I feel this is a bug, because RE should not die; it should exit if it can't match.
This is Py3.5 on Ubuntu. Thanks!
|
|||
| msg317015 - (view) | Author: Min (mamamiaibm) | Date: 2018-05-18 08:09 | |
Sorry, I forgot that I have upgraded: this is 3.6.2, not 3.5 |
|||
| msg317017 - (view) | Author: Min (mamamiaibm) | Date: 2018-05-18 08:19 | |
Sorry again, the sample code offered shows the issue with re.sub(), not findall() :o))) |
|||
| msg317042 - (view) | Author: Matthew Barnett (mrabarnett) | Date: 2018-05-18 17:47 | |
You don't give the value of 'newlines', but the problem is probably catastrophic backtracking, not deadlock. |
|||
| msg317043 - (view) | Author: Tim Peters (tim.peters) | Date: 2018-05-18 17:56 | |
Min, you need to give a complete example other people can actually run for themselves.
Offhand, this part of the regexp
(.|\s)*
all by itself _can_ cause exponential-time behavior. You can run this for yourself:
>>> import re
>>> p = r"(.|\s)*K"
>>> re.search(p, " " * 10) # fast
>>> re.search(p, " " * 15) # fast
>>> re.search(p, " " * 20) # obviously takes a bit of time
>>> re.search(p, " " * 21) # very obviously takes time
>>> re.search(p, " " * 22) # over a second
>>> re.search(p, " " * 25) # about 10 seconds
Etc. |
|||
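The growth described in the message above can be observed without waiting on the slow cases. The following is a minimal sketch, not part of the original report: the helper name `time_search` and the input sizes are illustrative assumptions, chosen small enough to finish quickly. Actual timings depend on the machine and Python version, so none are shown.

```python
import re
import time

# Same pattern as in the message above: every space can be matched by
# either "." or "\s", so a failing search has many ways to backtrack.
PATTERN = re.compile(r"(.|\s)*K")


def time_search(n):
    """Search a string of n spaces (which contains no 'K') and time it.

    Hypothetical helper for illustration only.
    Returns (match_object_or_None, elapsed_seconds).
    """
    start = time.perf_counter()
    match = PATTERN.search(" " * n)
    return match, time.perf_counter() - start


timings = {}
for n in (10, 12, 14, 16):  # kept small on purpose; larger n gets slow fast
    match, seconds = time_search(n)
    timings[n] = seconds
    print(f"n={n:2d}  match={match}  {seconds:.4f}s")
```

If the backtracking is exponential, each extra pair of spaces should multiply the elapsed time by roughly a constant factor rather than add a constant amount.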
| msg322599 - (view) | Author: Tim Peters (tim.peters) | Date: 2018-07-29 00:18 | |
Closing as not-a-bug - not enough info to reproduce, but the regexp looked prone to exponential-time backtracking to both MRAB and me, and there's been no response to requests for more info. |
|||
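Since the thread closes without a reproducible input, the following is only a sketch of one common way to avoid the ambiguity discussed above, assuming the intent of `(.|\s)*?` was "any character, including newlines". The names `SAFE_PATTERN` and `mask_rows`, the shortened replacement string, and the sample inputs are hypothetical and not taken from the reporter's code.

```python
import re

# Assumption: (.|\s)*? was meant as "any character, newlines included".
# The character class [\s\S] matches exactly one character in exactly one
# way, so a failing match has no alternative branches to retry.
SAFE_PATTERN = re.compile(r"\bROW\s*\((\d+|\*)\)[\s\S]*?\)")


def mask_rows(text, replacement="\nYYY\n"):
    """Replace each ROW(...) ... ) span; hypothetical helper for illustration."""
    return SAFE_PATTERN.sub(replacement, text)


# Input that ends with the expected ")": the whole span is replaced.
terminated = "ROW(1) some text here )"
masked = mask_rows(terminated)

# Input that ends WITHOUT the expected ")": no match, text is unchanged,
# and the call returns promptly even with many trailing spaces.
unterminated = "ROW(1)" + " " * 200
unchanged = mask_rows(unterminated)
```

Another option with the same effect is to keep `.` and compile the pattern with the `re.DOTALL` flag, so that `.` also matches newlines and the `|\s` alternative is no longer needed.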
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:00 | admin | set | github: 77747 |
| 2018-07-29 00:18:40 | tim.peters | set | status: open -> closed; components: + Regular Expressions; nosy: + ezio.melotti |
| 2018-05-18 17:56:58 | tim.peters | set | nosy: + tim.peters; messages: + msg317043 |
| 2018-05-18 17:47:19 | mrabarnett | set | nosy: + mrabarnett; messages: + msg317042 |
| 2018-05-18 08:19:22 | mamamiaibm | set | messages: + msg317017 |
| 2018-05-18 08:09:57 | mamamiaibm | set | messages: + msg317015; versions: + Python 3.6, - Python 3.5 |
| 2018-05-18 08:06:05 | mamamiaibm | create | |

