issue2636-24 : Code : Python
lp:~pythonregexp2.7/python/issue2636-24
Created by TimeHorse and last modified
Currently, the Python regular expression engine drops characters when findall / finditer is used with an expression that contains a zero-width capture group. For example:
>>> [m.groups() for m in re.finditer(
[('', None), (None, 'bc')]
The 'a' has been lost because the engine first matches (^z*) with zero width and then consumes the current character (the 'a'). It then matches the rest of the string with (\w+), yielding 'bc'. Fixing this takes two changes. First, the zero-width match of (^z*) should not consume the 'a'. On its own, though, that would produce an endless sequence of zero-width matches at the same position. So, second, each iteration needs internal state that records whether a zero-width match is still allowed at the current position. Any position may match a zero-width expression once; when that same position is entered again, the 'zero-width match' flag is set and a further zero-width match there is disallowed. This item is based on the work from Issue 1647489.
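The flag-per-position idea described above can be sketched in pure Python. This is only an illustration under stated assumptions, not the branch's actual change to the engine: the names `iter_matches`, `alternatives` and `empty_seen_here` are hypothetical, the pattern is modelled as an ordered list of top-level alternatives (`^z*` and `\w+`) rather than a single expression, and the input 'abc' is assumed, since the example call above is truncated.

```python
import re


def iter_matches(alternatives, text):
    """Yield (alternative_index, matched_text) pairs, scanning left to right.

    Hypothetical sketch: a zero-width match is allowed once per position and
    never consumes a character; a second zero-width match at the same
    position is rejected, so the scan cannot loop forever.
    """
    compiled = [re.compile(a) for a in alternatives]
    pos = 0
    empty_seen_here = False  # the 'zero-width match' flag for this position
    while pos <= len(text):
        found = None
        for index, pattern in enumerate(compiled):
            m = pattern.match(text, pos)
            if m is None:
                continue
            if m.end() == pos and empty_seen_here:
                # A zero-width match was already reported at this position.
                continue
            found = (index, m)
            break
        if found is None:
            # Nothing (more) matches here: move on and reset the flag.
            pos += 1
            empty_seen_here = False
            continue
        index, m = found
        yield index, m.group()
        if m.end() == pos:
            # Zero-width match: stay at the same position, but set the flag.
            empty_seen_here = True
        else:
            pos = m.end()
            empty_seen_here = False


print(list(iter_matches([r'^z*', r'\w+'], 'abc')))
# [(0, ''), (1, 'abc')]
```

With this scheme the zero-width match at position 0 is still reported, but the 'a' remains available to the second alternative. The sketch only retries other top-level alternatives; a real engine would also have to backtrack inside a single alternative to look for a non-empty match.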
- Get this branch:
- bzr branch lp:~pythonregexp2.7/python/issue2636-24
Recent revisions
- 39039. By Jeffrey C. "The TimeHorse" Jacobs
- 39038. By Jeffrey C. "The TimeHorse" Jacobs
- 39037. By Jeffrey C. "The TimeHorse" Jacobs
- 39036. By Jeffrey C. "The TimeHorse" Jacobs
- 39035. By Jeffrey C. "The TimeHorse" Jacobs
- 39034. By Jeffrey C. "The TimeHorse" Jacobs
- 39033. By Jeffrey C. "The TimeHorse" Jacobs
- 39032. By Jeffrey C. "The TimeHorse" Jacobs
- 39031. By Jeffrey C. "The TimeHorse" Jacobs
- 39030. By Jeffrey C. "The TimeHorse" Jacobs
Branch metadata
- Branch format:
- Branch format 6
- Repository format:
- Bazaar pack repository format 1 with rich root (needs bzr 1.0)