[Python-Dev] Unicode 8.0 and 3.5
Steven D'Aprano
steve at pearwood.info
Fri Jun 19 01:56:44 CEST 2015
On Thu, Jun 18, 2015 at 08:34:14PM +0100, MRAB wrote:
> On 2015-06-18 19:33, Larry Hastings wrote:
> >On 06/18/2015 11:27 AM, Terry Reedy wrote:
> >>Unicode 8.0 was just released. Can we have unicodedata updated to
> >>match in 3.5?
> >>
> >
> >What does this entail? Data changes, code changes, both?
> >
> It looks like just data changes.

At the very least, there is a change to the casefolding algorithm.
Cherokee was classified as unicameral but is now considered bicameral
(two cases, like English). Unusually, case-folding Cherokee maps to
uppercase rather than lowercase.

The full set of changes is listed here:

http://unicode.org/versions/Unicode8.0.0/

Apart from the addition of 7716 characters and changes to
str.casefold(), I don't think any of the changes will make a big
difference to Python's implementation. But it would be good to support
Unicode 8 (to the degree that Python actually does support Unicode,
rather than just that character set part of it).

> There are additional codepoints and a renamed property (which the
> standard library doesn't support anyway).

Which one are you referring to, Indic_Matra_Category renamed to
Indic_Positional_Category?

-- 
Steve
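
The Cherokee case-folding behaviour described above can be checked from Python. What follows is only a minimal sketch, not part of the original message. It assumes an interpreter whose unicodedata module is built against Unicode 8.0 or later. The helper names (unicode_version, cherokee_folding_report) are hypothetical and chosen for illustration. The code points used are U+13A0 CHEROKEE LETTER A and U+AB70 CHEROKEE SMALL LETTER A.

```python
import unicodedata

# U+13A0 is the original (now uppercase) Cherokee letter A.
CHEROKEE_UPPER_A = "\u13a0"
# U+AB70 is the small letter A, which only has case data from Unicode 8.0 on.
CHEROKEE_SMALL_A = "\uab70"


def unicode_version():
    """Return the Unicode database version as a tuple of ints."""
    return tuple(int(part) for part in unicodedata.unidata_version.split("."))


def cherokee_folding_report():
    """Compare how case folding treats Cherokee versus Latin letters."""
    return {
        "unidata_version": unicodedata.unidata_version,
        # Expected to be True only with Unicode 8.0+ data:
        # the small letter folds to the uppercase one.
        "small_folds_to_upper": CHEROKEE_SMALL_A.casefold() == CHEROKEE_UPPER_A,
        # The uppercase letter folds to itself.
        "upper_is_fixed_point": CHEROKEE_UPPER_A.casefold() == CHEROKEE_UPPER_A,
        # Latin letters fold the usual way, towards lowercase.
        "latin_folds_to_lower": "A".casefold() == "a",
    }


if __name__ == "__main__":
    for key, value in cherokee_folding_report().items():
        print(key, "=", value)
```

On an interpreter with older Unicode data, small_folds_to_upper would be expected to come out False, since U+AB70 has no case-folding entry before 8.0.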