I thought of another way to implement PyUnicode_DecodeFSDefault. If
Py_FileSystemDefaultEncoding is set, decode with the codecs module,
otherwise use UTF-8 with the "replace" error handler. This works because
by the time Py_FileSystemDefaultEncoding is initialized, at the end of
Py_InitializeEx(), the codecs module is ready to be used. Here's what
it looks like:

PyObject*
PyUnicode_DecodeFSDefault(const char *s)
{
    Py_ssize_t size = (Py_ssize_t)strlen(s);

    /* During the early bootstrapping process, Py_FileSystemDefaultEncoding
       can be undefined. If that is the case, decode using UTF-8. The
       following assumes that Py_FileSystemDefaultEncoding is set to a
       built-in encoding during the bootstrapping process where the codecs
       aren't ready yet.
    */
    if (Py_FileSystemDefaultEncoding) {
        return PyUnicode_Decode(s, size,
                                Py_FileSystemDefaultEncoding,
                                "replace");
    }
    else {
        return PyUnicode_DecodeUTF8(s, size, "replace");
    }
}

It is not perfect, since the extra function calls in the codecs module
cause test_profile and test_doctest to fail. However, I think this is
much simpler than the previous versions.
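
To make the fallback easier to reason about without building the
interpreter, here is a rough stand-alone sketch of just the selection
logic. It does not use the Python C API: fs_default_encoding is a
hypothetical stand-in for Py_FileSystemDefaultEncoding, and
choose_fs_codec() is a made-up helper that only reports which codec
the function above would pick.

```c
#include <stddef.h>

/* Hypothetical stand-in for Py_FileSystemDefaultEncoding. NULL models the
   early-bootstrap state, before the interpreter has set it. */
static const char *fs_default_encoding = NULL;

/* Return the codec name the decode step would use: the filesystem
   encoding when it is set, otherwise the UTF-8 fallback. */
static const char *
choose_fs_codec(void)
{
    if (fs_default_encoding != NULL) {
        return fs_default_encoding;
    }
    return "utf-8";
}
```

Before the stand-in is set, choose_fs_codec() returns "utf-8"; once it
is assigned (for example to "latin-1"), the same call returns that
name instead, mirroring the two branches of the patch.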