[Python-Dev] Difference in RE between 3.2 and 3.3 (or Aaron Swartz memorial)
Xavier Morel
catch-all at masklinn.net
Thu Mar 7 11:31:03 CET 2013
On 2013-03-07, at 11:08 , Matej Cepl wrote:

> On 2013-03-06, 18:34 GMT, Victor Stinner wrote:
>> In short, Unicode was rewritten in Python 3.3 for the PEP 393. It's
>> not surprising that minor details like singleton differ. You should
>> not use "is" to compare strings in Python, or your program will fail
>> on other Python implementations (like PyPy, IronPython, or Jython) or
>> even on a different CPython version.
>
> I am sorry, I don't understand what you are saying. Even though
> this has been changed to
> https://github.com/mcepl/html2text/blob/fix_tests/html2text.py#L90
> the tests still fail.
>
> But, Amaury is right: the function doesn't make much sense.
> However, ...
>
> when I have “fixed it” from
> https://github.com/mcepl/html2text/blob/master/html2text.py#L95
>
> def onlywhite(line):
>     """Return true if the line does only consist of whitespace characters."""
>     for c in line:
>         if c is not ' ' and c is not ' ':
>             return c is ' '
>     return line
>
> to
> https://github.com/mcepl/html2text/blob/fix_tests/html2text.py#L90
>
> def onlywhite(line):
>     """Return true if the line does only consist of whitespace
>     characters."""
>     for c in line:
>         if c != ' ' and c != ' ':
>             return c == ' '
>     return line

The second test looks like some kind of corruption: it's supposedly
iterating on the characters of a line, yet testing for two spaces? Is it
possible that the original was a literal tab embedded in the source code
(instead of '\t') and that got broken at some point?

According to its name + docstring, the implementation of this method
should really be replaced by `return line and line.isspace()` (the first
part being to handle the case of an empty line: in the current
implementation the line is returned directly if no non-whitespace
character is found, which will be "negative" for an empty line, and
''.isspace() -> False).

Does that fix the failing tests?
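
For reference, the suggested replacement can be sketched as below. This is a minimal sketch, not the actual html2text code: it keeps the `onlywhite` name from the quoted function and assumes callers only rely on the truthiness of the result. Note that `str.isspace()` accepts any whitespace character, not just space and tab, so it is slightly broader than what the original loop appears to intend.

```python
def onlywhite(line):
    """Return a truthy value if the line consists only of whitespace."""
    # `line and ...` short-circuits on an empty string and returns it
    # unchanged (falsy), like the quoted function, which falls through
    # to `return line` when the loop body never runs.
    return line and line.isspace()


print(bool(onlywhite("")))      # empty line: falsy
print(bool(onlywhite(" \t ")))  # spaces and a tab: truthy
print(bool(onlywhite(" a ")))   # contains a non-whitespace character: falsy
```

Since it relies only on `isspace()` and never on `is`, the result does not depend on whether the interpreter happens to reuse the same object for equal one-character strings.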