pep 277, Unicode filenames & mbcs encoding &c.
Martin v. Löwis
martin at v.loewis.de
Tue Oct 21 17:59:33 EDT 2003
"Edward K. Ream" <edreamleo at charter.net> writes: > Am I reading pep 277 correctly? On Windows NT/XP, should filenames always > be converted to Unicode using the mbcs encoding? What do you mean with "should"? "Should Python always..." or "Should the application always"? PEP 277 actually answers neither question. As Vincent explains, nothing changes with respect to using byte strings on the API. The changes only affect Unicode strings passed to functions expecting file names. > For example, > > myFile = unicode(__file__, "mbcs", "strict") > > This seems to work And it has nothing to do with PEP 277: You are not passing myFile to any API function. If you mean to use myFile as a file name, then yes: this is intended to work. However, using plain __file__ directly should also work. > Am I correct that conversions to Unicode (using "mbcs" on Windows) should be > done before passing arguments to os.path.join, os.path.split, > os.path.normpath, etc. ? You should either use only Unicode strings, or only byte strings. The functions of os.path are not all affected by the PEP 277 implementation (although they probably should). > Presumably os.path functions use the default > system encoding to convert strings to Unicode, which isn't likely to be > "mbcs" or anything else useful :-) Right. This is actually unfortunate. > Are there any situations where some other encoding should be used instead on > Windows? If you get data from a cmd.exe Window. > What about other platforms? For instance, does Linux allow non-ascii > file names? Yes, it does. > If so, what encoding should be specified when converting to Unicode? Nobody knows, but the convention is to use the locale's encoding, as returned by locale.getpreferredencoding(). Regards, Martin