python3, regular expression and bytes text
Serhiy Storchaka
storchaka at gmail.com
Sat Oct 12 15:48:29 EDT 2019
More information about the Python-list mailing list
12.10.19 21:08, Eko palypse wrote:
> So how can I make it work with utf8 encoded text?

You cannot.

First, \w with re.LOCALE works only when the text is encoded with the locale encoding (cp1252 in your case). Second, re.LOCALE supports only 8-bit charsets, so even if you set a utf-8 locale, it would not help.

Regular expressions with re.LOCALE are slow. It may be more efficient to decode the text and use a Unicode regular expression.
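
A minimal sketch of the decode-first approach suggested above. The sample bytes and variable names are made up for illustration and are not taken from the original question; it also assumes the input really is valid UTF-8.

```python
import re

# Hypothetical sample input: UTF-8 encoded bytes, as if read from a file or socket.
data = "Grüße, naïve café".encode("utf-8")

# Decode to str first, then match with a str pattern.
# For str patterns, \w is Unicode-aware by default in Python 3.
text = data.decode("utf-8")
words = re.findall(r"\w+", text)
print(words)

# For comparison: a bytes pattern without re.LOCALE treats \w as ASCII-only,
# so every multi-byte UTF-8 character splits a word.
ascii_words = re.findall(rb"\w+", data)
print(ascii_words)
```

If the matches are needed as bytes again, each one can be re-encoded with `word.encode("utf-8")`.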