Message197925
| Author | serhiy.storchaka |
|---|---|
| Recipients | Arfrever, benjamin.peterson, brett.cannon, eric.snow, loewis, meador.inge, serhiy.storchaka, terry.reedy, vstinner |
| Date | 2013-09-16.18:27:41 |
| Message-id | <1379356061.32.0.278272788036.issue18961@psf.upfronthosting.co.za> |
| In-reply-to |
| Content | |
|---|---|
What about the first line? Currently both the Python interpreter and the tokenize module decode it from UTF-8 (actually, due to bug #18960, Python interprets it twice, in different encodings). PEP 263 says: "1. The complete Python source file should use a single encoding. Embedding of differently encoded data is not allowed and will result in a decoding error during compilation of the Python source code." I conclude that the first line should be decoded with the encoding specified in the second line. We should first read the first line and check whether it is not a comment or already contains an encoding cookie; otherwise we read the second line, determine the encoding, and only then decode the lines read so far. Perhaps this will untangle issue18960 too. |
|
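The procedure proposed above could be sketched roughly as follows. This is only an illustration, not the code used by the interpreter or by the tokenize module: `decode_header`, `COOKIE_RE` and `BLANK_OR_COMMENT_RE` are hypothetical names, the cookie regex is a simplified approximation of the PEP 263 pattern, and BOM handling is left out.

```python
import codecs
import io
import re

# Simplified cookie pattern in the spirit of PEP 263 (an approximation).
COOKIE_RE = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")
# A line that is blank or only a comment, so a cookie may still follow it.
BLANK_OR_COMMENT_RE = re.compile(rb"^[ \t\f]*(?:#.*)?$")


def decode_header(readline, default="utf-8"):
    """Hypothetical helper: read one or two raw lines, find the encoding,
    and only then decode every line that was read with that encoding."""
    raw_lines = []
    encoding = None

    first = readline()
    if first:
        raw_lines.append(first)
    match = COOKIE_RE.match(first)
    if match:
        # The first line already carries the cookie.
        encoding = match.group(1).decode("ascii")
    elif BLANK_OR_COMMENT_RE.match(first.rstrip(b"\r\n")):
        # The first line is a comment without a cookie: look at the second.
        second = readline()
        if second:
            raw_lines.append(second)
        match = COOKIE_RE.match(second)
        if match:
            encoding = match.group(1).decode("ascii")

    if encoding is None:
        encoding = default
    codecs.lookup(encoding)  # raises LookupError for an unknown encoding

    # Decode the lines only now, all with the same encoding.
    return encoding, [line.decode(encoding) for line in raw_lines]


# Usage: the first line contains a Latin-1 byte, the cookie is on line 2.
source = b"#!/usr/bin/env python \xe9\n# -*- coding: latin-1 -*-\nx = 1\n"
encoding, header = decode_header(io.BytesIO(source).readline)
print(encoding, header)
```

With this ordering the first line is never decoded as UTF-8 before the cookie on the second line has been seen, which is the point of the proposal.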
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2013-09-16 18:27:41 | serhiy.storchaka | set | recipients: + serhiy.storchaka, loewis, brett.cannon, terry.reedy, vstinner, benjamin.peterson, Arfrever, meador.inge, eric.snow |
| 2013-09-16 18:27:41 | serhiy.storchaka | set | messageid: <1379356061.32.0.278272788036.issue18961@psf.upfronthosting.co.za> |
| 2013-09-16 18:27:41 | serhiy.storchaka | link | issue18961 messages |
| 2013-09-16 18:27:41 | serhiy.storchaka | create | |