issue2636-22 : Code : Python

lp:~pythonregexp2.7/python/issue2636-22

Created by TimeHorse and last modified

The current implementation of the Python Regular Expression Engine does not allow a string to be searched or split on an expression whose match is of zero length. Rather than splitting the given string as expected, it simply returns the original string in a single-element list. The expected result is a list containing each substring between the zero-width matches (each character, in the example below), together with an empty string for the boundary before the first character and another for the boundary after the last. For example:

>>> re.split(r'\b', 'a b')
['a b']

One would instead expect this to return the list ['', 'a', ' ', 'b', '']. Because some existing Python code may rely on the single-element, unsplit list produced by a zero-width expression, it is recommended that, at least initially, we provide a flag, re.ZEROWIDTH (short form re.Z), as well as an in-line flag, (?z), to enable this behaviour. It may also be possible to add 'from __future__ import ZeroWidthRegularExpressions' to enable this behaviour by default if this functionality is considered the best long-term solution. This item is based on the work from Issues 3262, 988761 and 852532.
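The expected splitting behaviour can be illustrated without the proposed flag. The following is only a minimal sketch, not the code in this branch: the helper name split_zero_width is hypothetical, it relies solely on re.finditer, and it ignores capture groups and the maxsplit argument that re.split supports.

```python
import re

def split_zero_width(pattern, string):
    """Split `string` at every match of `pattern`, including empty matches.

    Hypothetical helper for illustration only; it is not part of the
    re module or of this branch.
    """
    pieces = []
    last = 0  # end of the previous match
    for match in re.finditer(pattern, string):
        # Keep the text between the previous match and this one.
        pieces.append(string[last:match.start()])
        last = match.end()
    # Keep whatever follows the final match (possibly an empty string).
    pieces.append(string[last:])
    return pieces

print(split_zero_width(r'\b', 'a b'))
```

For the pattern r'\b' and the string 'a b', the word boundaries at offsets 0, 1, 2 and 3 each produce a split point, which yields the five-element list described above.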

Get this branch:
bzr branch lp:~pythonregexp2.7/python/issue2636-22

Branch information

Recent revisions

39039. By Jeffrey C. "The TimeHorse" Jacobs
39038. By Jeffrey C. "The TimeHorse" Jacobs
39037. By Jeffrey C. "The TimeHorse" Jacobs
39036. By Jeffrey C. "The TimeHorse" Jacobs
39035. By Jeffrey C. "The TimeHorse" Jacobs
39034. By Jeffrey C. "The TimeHorse" Jacobs
39033. By Jeffrey C. "The TimeHorse" Jacobs
39032. By Jeffrey C. "The TimeHorse" Jacobs
39031. By Jeffrey C. "The TimeHorse" Jacobs
39030. By Jeffrey C. "The TimeHorse" Jacobs

Branch metadata

Branch format:
Branch format 6
Repository format:
Bazaar pack repository format 1 with rich root (needs bzr 1.0)