Message283271
| Author | vstinner |
|---|---|
| Recipients | belopolsky, ezio.melotti, jcea, lemburg, sdaoden, serhiy.storchaka, vstinner |
| Date | 2016-12-15.09:53:01 |
| Message-id | <1481795581.59.0.160043064791.issue11322@psf.upfronthosting.co.za> |
| In-reply-to | |
| Content | |
|---|---|
It seems like encodings.normalize_encoding() currently has no unit test! Before modifying it, I would prefer to see a few unit tests:

* " utf 8 "
* "UtF 8"
* "utf8\xE9"
* etc.

Since we are talking about an optimization, I would like to see benchmark results before and after the change.

I would also like to test Marc-Andre's idea of exposing the C function _Py_normalize_encoding(). _Py_normalize_encoding() works on a byte string encoded to Latin1. To implement encodings.normalize_encoding(), we might rewrite the function to work on Py_UCS4 characters, or have a fast version for char* and a more generic version for UCS2 and UCS4?
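A minimal sketch of what such unit tests might look like, assuming the documented behaviour of the pure-Python encodings.normalize_encoding(): runs of non-alphanumeric characters (other than the dot) collapse into a single underscore, leading and trailing runs are dropped, and case is preserved. The names CASES, NormalizeEncodingTests and run_tests are made up for illustration and are not part of the CPython test suite. The "utf8\xE9" input is deliberately left out because its result appears to differ between Python versions.

```python
import encodings
import unittest

# (input, expected) pairs; the expected values assume the pure-Python
# implementation of encodings.normalize_encoding().
CASES = [
    (" utf 8 ", "utf_8"),   # surrounding spaces dropped, inner space -> "_"
    ("UtF 8", "UtF_8"),     # case is preserved by normalize_encoding()
    ("utf-8", "utf_8"),     # hyphen -> "_"
    ("utf--8", "utf_8"),    # a run of punctuation collapses to one "_"
    ("iso-8859-1", "iso_8859_1"),
]


class NormalizeEncodingTests(unittest.TestCase):
    """Hypothetical test case, not taken from CPython's Lib/test."""

    def test_ascii_names(self):
        for name, expected in CASES:
            with self.subTest(name=name):
                self.assertEqual(encodings.normalize_encoding(name), expected)


def run_tests():
    """Run the test case above and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(NormalizeEncodingTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() runs the table of cases once; the same table could be reused unchanged to check a C-accelerated replacement against the current implementation.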
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2016-12-15 09:53:01 | vstinner | set | recipients: + vstinner, lemburg, jcea, belopolsky, ezio.melotti, sdaoden, serhiy.storchaka |
| 2016-12-15 09:53:01 | vstinner | set | messageid: <1481795581.59.0.160043064791.issue11322@psf.upfronthosting.co.za> |
| 2016-12-15 09:53:01 | vstinner | link | issue11322 messages |
| 2016-12-15 09:53:01 | vstinner | create | |