unicode filenames
Just
just at xs4all.nl
Sun Feb 16 16:09:41 EST 2003
More information about the Python-list mailing list
In article <wzptps13t5.fsf at nono.cs.uu.nl>,
Piet van Oostrum <piet at cs.uu.nl> wrote:

> >>>>> David Eppstein <eppstein at ics.uci.edu> (DE) wrote:
>
> DE> Under Mac OS X, the shell displays text (e.g. from cat, or from ls
> DE> without the -q option) as utf-8 by default, and the Finder (gui file
> DE> browser) uses utf-8 for accented characters in file names. So I infer
> DE> that the correct interpretation of filenames under my OS is utf-8.
> DE> But other unixes may differ...
>
> On Mac OS X, it is a bit more complicated. First cat will indeed show the
> unicode (utf-8) contents of a file, but ls won't display filenames with
> non-ASCII characters right. At least not in 10.1.5. Maybe 10.2 does it better.
> Like if my filename is "€200", ls will display "???200".

Although Terminal.app supports utf-8 in 10.2, what you describe is still true.

> Secondly, the filesystem requires the unicode characters to be normalized,
> which means that accented characters like "é" will be broken up into "e"
> followed by "´". So if the finder has a file with name "é200", the bytes
> used in the filename will be 0x65 followed by 0xCC 0x81 (unicode character
> 0x301). ls will print this as "e??200".

You don't have to worry about that: the file system will _give_ you
normalized unicode, but it does the right thing if you feed it
non-normalized unicode.

Btw. in 2.3 (current CVS, not a1), the file system calls fully support
unicode strings on OSX. I've also got a patch pending that makes
os.listdir() return unicode strings when appropriate:
http://python.org/sf/683592. I think this has a fair chance to make it in.

Just
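To make the normalization point concrete, here is a minimal sketch of the decomposition described above. It is written for a modern Python 3 (not the 2.3 discussed in this thread), and it assumes that the standard NFD form from the unicodedata module is close enough to what the file system does for this example; the file system's exact rules may differ in detail. The helper names decompose() and same_filename() are made up for illustration.

```python
import unicodedata


def decompose(name: str) -> str:
    """Return the canonically decomposed (NFD) form of a filename."""
    return unicodedata.normalize("NFD", name)


def same_filename(a: str, b: str) -> bool:
    """Compare two names after decomposing both (hypothetical helper)."""
    return decompose(a) == decompose(b)


# "é200" written with the precomposed character U+00E9.
composed = "\u00e9200"

# After decomposition: "e" followed by the combining acute accent U+0301.
decomposed = decompose(composed)

# The UTF-8 bytes of the decomposed name start with 0x65 0xCC 0x81.
utf8_bytes = decomposed.encode("utf-8")
print([hex(b) for b in utf8_bytes[:3]])  # ['0x65', '0xcc', '0x81']

# Both spellings refer to the same name once normalized.
print(same_filename(composed, "e\u0301200"))  # True
```

If you compare a name you typed against one that came back from the file system, normalizing both sides first, as same_filename() does, avoids a mismatch between the composed and decomposed spellings.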