[Python-Dev] Bytes path support
"Martin v. Löwis"
martin at v.loewis.de
Tue Aug 26 13:14:23 CEST 2014
More information about the Python-Dev mailing list
On 24.08.14 03:11, Greg Ewing wrote:
> Isaac Morland wrote:
>> In HTML 5 it allows non-ASCII-compatible encodings as long as U+FEFF
>> (byte order mark) is used:
>>
>> http://www.w3.org/TR/html-markup/syntax.html#encoding-declaration
>>
>> Not sure about XML.
>
> According to Appendix F here:
>
> http://www.w3.org/TR/xml/#sec-guessing
>
> an XML parser needs to be prepared to try all the encodings it
> supports until it finds one that works well enough to decode
> the XML declaration, then it can find out the exact encoding
> used.

That's not what this section says. Instead, it says that you need to
auto-detect UCS-4, UTF-16, UTF-8 from the BOM, or guess them or EBCDIC
from the encoding of '<?'. This should be enough to actually parse the
encoding declaration. Other non-ASCII-compatible encodings can only be
used if declared in an upper-level protocol (such as HTTP). The parser
is not expected to try out all encodings it supports.

Regards,
Martin
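For illustration, the detection described above can be sketched roughly as a fixed prefix lookup. This is a minimal sketch and not a conforming parser: the function name `guess_xml_encoding_family` and the returned labels are made up for this example, and it only identifies the encoding family that is needed to read the encoding declaration, not the exact encoding.

```python
# Sketch of the prefix table in Appendix F.1 of the XML spec.
# Function name and labels are illustrative, not from any library.

# Byte order marks. The 4-byte UCS-4 marks are listed before the
# 2-byte UTF-16 marks because FF FE 00 00 also starts with FF FE.
_BOMS = [
    (b"\x00\x00\xfe\xff", "UCS-4 big-endian"),
    (b"\xff\xfe\x00\x00", "UCS-4 little-endian"),
    (b"\x00\x00\xff\xfe", "UCS-4 unusual octet order"),
    (b"\xfe\xff\x00\x00", "UCS-4 unusual octet order"),
    (b"\xfe\xff", "UTF-16 big-endian"),
    (b"\xff\xfe", "UTF-16 little-endian"),
    (b"\xef\xbb\xbf", "UTF-8"),
]

# No BOM: how the start of '<?xml' looks in each encoding family.
_DECL_PREFIXES = [
    (b"\x00\x00\x00\x3c", "UCS-4 big-endian"),
    (b"\x3c\x00\x00\x00", "UCS-4 little-endian"),
    (b"\x00\x00\x3c\x00", "UCS-4 unusual octet order"),
    (b"\x00\x3c\x00\x00", "UCS-4 unusual octet order"),
    (b"\x00\x3c\x00\x3f", "UTF-16 big-endian"),
    (b"\x3c\x00\x3f\x00", "UTF-16 little-endian"),
    (b"\x3c\x3f\x78\x6d", "ASCII-compatible"),
    (b"\x4c\x6f\xa7\x94", "EBCDIC"),
]


def guess_xml_encoding_family(data: bytes) -> str:
    """Return the encoding family suggested by the first bytes of `data`."""
    for prefix, family in _BOMS + _DECL_PREFIXES:
        if data.startswith(prefix):
            return family
    # Neither a BOM nor a recognizable declaration: without information
    # from an upper-level protocol, the document is taken to be UTF-8.
    return "UTF-8"
```

Note that there is no loop over "all supported codecs" here: the lookup is a small, fixed table, and an "ASCII-compatible" or "EBCDIC" result only tells the parser enough to read the `encoding="..."` pseudo-attribute, which then names the actual encoding.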