Attached is a proposed patch.
Some explanation of the patch, following from the comments above:
The following example shows Formatter.format() returning str in the current implementation. It would break if we made Formatter.format() return unicode whenever format_string is unicode:
>>> f.format(u"{0}", "\xc3\xa9") # UTF-8 encoded "e-acute".
'\xc3\xa9'
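(It would break with a UnicodeDecodeError: returning unicode would require implicitly decoding the str argument, and that uses the default encoding, 'ascii', which cannot decode the bytes \xc3\xa9.)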
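To illustrate the failure mode, here is a minimal sketch of the coercion that returning unicode would require. It assumes the implicit str-to-unicode conversion is equivalent to an explicit decode with the default 'ascii' codec; coerce_to_unicode is a hypothetical helper for illustration, not part of the patch:

```python
# UTF-8 encoding of "e-acute", as a byte string (a plain str in Python 2).
encoded = b"\xc3\xa9"


def coerce_to_unicode(data, encoding="ascii"):
    """Hypothetical helper: decode a byte string the way an implicit
    str -> unicode coercion would, using 'ascii' unless told otherwise."""
    return data.decode(encoding)


# With the default 'ascii' codec the coercion fails, because 0xc3 is
# outside the ASCII range.
try:
    coerce_to_unicode(encoded)
    failed = False
except UnicodeDecodeError:
    failed = True

# The same bytes decode cleanly once the caller names the real encoding.
decoded = coerce_to_unicode(encoded, "utf-8")
```

The point is that Formatter.format() has no way of knowing the argument's real encoding, so any attempt to return unicode here would have to rely on the default codec and fail.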
Since we can't change Formatter.format(format_string) to return unicode whenever format_string is unicode without breaking existing code, I believe the best we can do is to document the departure from PEP 3101. Since the caller has to handle return values of type str anyway, I don't think it helps to ensure that more return values are unicode.