On 19.10.2021 10:44, Serhiy Storchaka wrote:
>
> Possible solutions (they can be combined):
>
> 1. Add support for the GEORGIAN-PS charset and all other encodings used in libc (issue22679). The problem is that it is difficult to get the official information about these encodings.

As with all encodings we add: there has to be a real need to support
them natively in Python (as opposed to installing codecs via PyPI)
and we need a definite source for the encoding, e.g. a standards
document from an official body.

IMO, we should not really add more encodings to the stdlib, but instead
point people to e.g. the iconv package:

https://pypi.org/project/python-iconv/

Perhaps we ought to make it easier for such packages to provide
additional codecs even during the startup phase, e.g. via a special
env var which points Python to a list of codec packages to load
prior to initializing the I/O encoding... not sure whether this is
possible, though.

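
To make the idea a bit more concrete, here is a rough sketch of what
such a codec package and loader could look like, using only the existing
codecs.register() API. The env var name DEMO_CODEC_PACKAGES, the codec
name x_demo_codec and both helper functions are made up for
illustration; no such startup hook exists in Python today, and the demo
codec simply delegates to latin-1 rather than implementing a real charset.

```python
import codecs
import importlib
import os

DEMO_NAME = "x_demo_codec"  # hypothetical codec name, for illustration only


def demo_search(name):
    """Codec search function: return a CodecInfo or None."""
    # Depending on the Python version, hyphens may or may not already
    # have been normalized to underscores before we are called.
    if name.replace("-", "_") != DEMO_NAME:
        return None
    base = codecs.lookup("latin-1")  # stand-in for a real mapping table
    return codecs.CodecInfo(
        name=DEMO_NAME,
        encode=base.encode,
        decode=base.decode,
        incrementalencoder=base.incrementalencoder,
        incrementaldecoder=base.incrementaldecoder,
        streamreader=base.streamreader,
        streamwriter=base.streamwriter,
    )


def load_codec_packages(env_var="DEMO_CODEC_PACKAGES"):
    """Import every module listed (comma-separated) in a hypothetical env var.

    A real codec package would call codecs.register() at import time.
    """
    names = [n.strip() for n in os.environ.get(env_var, "").split(",")]
    return [importlib.import_module(n) for n in names if n]


# What a codec package would do when it gets imported:
codecs.register(demo_search)
```

Once the search function is registered, "x-demo-codec" can be used like
any built-in encoding name. The open question remains whether something
like load_codec_packages() could run early enough, i.e. before the I/O
encoding is initialized.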
> 2. Fall back to utf-8 or ascii+surrogateescape in case of an unsupported locale encoding. But typos can slip by unnoticed.
I think this would be a more general solution to such cases, provided
the startup logic issues a visible warning about the fallback.
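
In pure Python terms, the fallback with a warning could look roughly
like the sketch below. The helper name resolve_encoding() is made up,
and the real startup logic lives in C, so this only illustrates the
behavior, not how it would actually be implemented.

```python
import codecs
import locale
import warnings


def resolve_encoding(name, fallback="utf-8"):
    """Return the canonical codec name, or the fallback with a warning."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        # Make the fallback visible, so that typos in the locale
        # settings do not slip by unnoticed.
        warnings.warn(
            f"unsupported locale encoding {name!r}; "
            f"falling back to {fallback!r}",
            RuntimeWarning,
            stacklevel=2,
        )
        return fallback


# Typical use: check whatever the locale reports.
encoding = resolve_encoding(locale.getpreferredencoding(False))

# The ascii+surrogateescape variant round-trips undecodable bytes:
raw = b"caf\xe9"
text = raw.decode("ascii", "surrogateescape")
assert text.encode("ascii", "surrogateescape") == raw
```

With surrogateescape, bytes that cannot be decoded survive a round trip
unchanged, which is why it is attractive as a fallback for file names
and environment variables.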