Issue30410
Created on 2017-05-20 08:58 by paul.moore, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Pull Requests | ||
|---|---|---|
| URL | Status | Linked |
| PR 10264 | merged | lys.nikolaou, 2018-10-31 20:01 |
| PR 11860 | merged | miss-islington, 2019-02-14 23:35 |
| Messages (12) | |||
|---|---|---|---|
| msg294020 | Author: Paul Moore (paul.moore) | Date: 2017-05-20 08:58 | |
The documentation for the encoding of sys.stdin/out/err (see https://docs.python.org/3.6/library/sys.html#sys.stdout) does not reflect the change in Python 3.6 on Windows to use the console Unicode APIs, and hence UTF-8 for the encoding. |
|||
| msg294046 | Author: Eryk Sun (eryksun) | Date: 2017-05-20 18:37 | |
How about this?
The character encoding is platform-dependent. Non-Windows
platforms use the locale encoding (see
locale.getpreferredencoding()).
On Windows, UTF-8 is used for console character
devices (i.e. CON, CONIN$, and CONOUT$). However, this
can be overridden to use the console as a generic
character device by setting the environment variable
PYTHONLEGACYWINDOWSSTDIO before starting Python. Non-
character devices such as disk files and pipes use the
system locale encoding (i.e. the ANSI codepage).
Character devices such as NUL (i.e. isatty() returns
True) use the value of the console input and output
codepages at startup, respectively for stdin and
stdout/stderr. This defaults to the system locale
encoding if the process is not initially attached to a
console.
Under all platforms, you can override this value by
setting the PYTHONIOENCODING environment variable before
starting Python. However, for the Windows console, this
only applies when PYTHONLEGACYWINDOWSSTDIO is also set.
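
To see which of the cases described above applies to a given process, the standard streams can be inspected at runtime. The following is a minimal sketch, not part of the proposed documentation text; `describe_stream` is a hypothetical helper name, and the values it reports depend on the platform, the Python version, and how the streams are redirected.

```python
import sys


def describe_stream(stream):
    """Summarize the text-layer settings of a stream (hypothetical helper)."""
    return {
        "encoding": getattr(stream, "encoding", None),
        "errors": getattr(stream, "errors", None),
        # isatty() is True for character devices, including the console and NUL.
        "isatty": stream.isatty(),
    }


if __name__ == "__main__":
    for name in ("stdin", "stdout", "stderr"):
        stream = getattr(sys, name)
        # A standard stream can be None, e.g. under pythonw.exe on Windows.
        if stream is not None:
            print(name, describe_stream(stream))
```

Running this once interactively and once with output redirected to a file or pipe should show the encoding change between the console and non-console cases on Windows.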
|
|||
| msg294061 | Author: Steve Dower (steve.dower) | Date: 2017-05-20 23:59 | |
Looks great, though I wonder whether the rest of the paragraph after "Character devices such as NUL" would be more confusing than it's worth? Can you create a PR? (And having links to the environment variable docs would be great.) |
|||
| msg294063 | Author: Eryk Sun (eryksun) | Date: 2017-05-21 00:53 | |
I discussed character devices mostly because of the NUL device. It could be surprising that Python dies on an encoding error when output is redirected to NUL:
    C:\>chcp 1252
    Active code page: 1252
    C:\>python -c "print('\u20ac')" > nul
    C:\>chcp 437
    Active code page: 437
    C:\>python -c "print('\u20ac')" > nul
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Program Files\Python36\lib\encodings\cp437.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_map)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\u20ac' in position 0: character maps to <undefined>
Unix has a similar problem:
    $ LANG=C python3 -c 'print("\u20ac")' > /dev/null
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    UnicodeEncodeError: 'ascii' codec can't encode character '\u20ac' in position 0: ordinal not in range(128)
The difference is that /dev/null isn't a TTY. Also, it's rare nowadays for the locale encoding on Unix systems to be anything other than UTF-8.
It would be useful if we special-cased NUL like we do for the Windows console, but just to make it use the backslashreplace error handler. Unfortunately I don't know how to do that without calling NtQueryObject, for which ObjectNameInformation (1) can't be used because it's undocumented [1]. GetFinalPathNameByHandle also can't be used because it requires file-system devices. As a crude workaround, we could lump together all non-console character devices (i.e. isatty() but not a console). That will affect serial devices, too, but I can't think of a good reason someone would redirect stdout or stderr to a COM port.
[1]: https://msdn.microsoft.com/en-us/library/ff550964
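
The effect of the backslashreplace error handler mentioned above can be sketched without touching NUL at all. This example assumes an in-memory `io.BytesIO` as a stand-in for the device and uses a hypothetical helper name, `open_lenient_writer`; it only illustrates the error handler and is not the special-casing discussed for CPython itself.

```python
import io


def open_lenient_writer(raw, encoding):
    """Wrap a binary stream so unencodable characters are escaped (hypothetical helper)."""
    return io.TextIOWrapper(raw, encoding=encoding, errors="backslashreplace")


# With the default strict handler, the euro sign cannot be encoded as cp437.
try:
    "\u20ac".encode("cp437")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# With backslashreplace, the character is written as an escape sequence instead.
raw = io.BytesIO()
writer = open_lenient_writer(raw, "cp437")
writer.write("\u20ac")
writer.flush()
escaped = raw.getvalue()  # b'\\u20ac'
```

On Python 3.7 and later, `sys.stdout.reconfigure(errors="backslashreplace")` changes the handler of an existing stream in place, which a script can call itself when it expects to be redirected to such a device.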
|
|||
| msg328766 | Author: Lysandros Nikolaou (lys.nikolaou) | Date: 2018-10-28 22:24 | |
Shall I create a PR for this? |
|||
| msg328798 | Author: Steve Dower (steve.dower) | Date: 2018-10-29 10:34 | |
Please do! |
|||
| msg330764 | Author: Lysandros Nikolaou (lys.nikolaou) | Date: 2018-11-30 09:38 | |
Ping. |
|||
| msg330765 | Author: Paul Moore (paul.moore) | Date: 2018-11-30 09:58 | |
The proposed wording seems a bit over-complex to me. Maybe the following re-wording would be easier to understand?
The character encoding is platform-dependent. Non-Windows
platforms use the locale encoding (see
locale.getpreferredencoding()).
On Windows, UTF-8 is used for the console device. Non-character
devices such as disk files and pipes use the system locale
encoding (i.e. the ANSI codepage). Non-console character
devices such as NUL (i.e. where isatty() returns True) use the
value of the console input and output codepages at startup,
respectively for stdin and stdout/stderr. This defaults to the
system locale encoding if the process is not initially attached
to a console.
The special behaviour of the console can be overridden
by setting the environment variable PYTHONLEGACYWINDOWSSTDIO
before starting Python. In that case, the console codepages are
used as for any other character device.
Under all platforms, you can override this value by
setting the PYTHONIOENCODING environment variable before
starting Python. However, for the Windows console, this
only applies when PYTHONLEGACYWINDOWSSTDIO is also set.
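
The PYTHONIOENCODING override described above can be checked with a child interpreter whose stdout is a pipe, a case where the variable applies on every platform. This is a minimal sketch; `child_stdout_encoding` is a hypothetical helper name, and the exact spelling of the reported encoding name may vary between Python versions, so the result is best compared after normalizing it with `codecs.lookup()`.

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding):
    """Report sys.stdout.encoding of a child started with PYTHONIOENCODING set."""
    # Copy the parent environment so the child still finds its runtime on Windows.
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        stdout=subprocess.PIPE,  # a pipe, not the console
        check=True,
    )
    # Encoding names are plain ASCII, so this decode is safe for any override.
    return result.stdout.decode("ascii").strip()


if __name__ == "__main__":
    reported = child_stdout_encoding("latin-1")
    print(reported, codecs.lookup(reported).name)
```

For the Windows console itself, the same variable is only expected to take effect when PYTHONLEGACYWINDOWSSTDIO is also set, as the wording above states.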
|
|||
| msg330901 | Author: Lysandros Nikolaou (lys.nikolaou) | Date: 2018-12-02 22:06 | |
I updated the PR with the new wording by Paul, since I found it easier to understand as well. |
|||
| msg335573 | Author: miss-islington (miss-islington) | Date: 2019-02-14 23:35 | |
New changeset 5723263a3a39a05b6a2f567e0e7771792e6e2f5b by Miss Islington (bot) (Lysandros Nikolaou) in branch 'master': bpo-30410: Documentation of sys.stdin/out/err update to reflect change in 3.6 (GH-10264) https://github.com/python/cpython/commit/5723263a3a39a05b6a2f567e0e7771792e6e2f5b |
|||
| msg335574 | Author: Mariatta (Mariatta) | Date: 2019-02-14 23:36 | |
Fixed in 3.8 and 3.7. Thanks! |
|||
| msg335575 | Author: miss-islington (miss-islington) | Date: 2019-02-14 23:45 | |
New changeset b8bcec35e01cac018f6ccfc8323d35886340efe0 by Miss Islington (bot) in branch '3.7': bpo-30410: Documentation of sys.stdin/out/err update to reflect change in 3.6 (GH-10264) https://github.com/python/cpython/commit/b8bcec35e01cac018f6ccfc8323d35886340efe0 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:46 | admin | set | github: 74595 |
| 2019-02-14 23:45:23 | miss-islington | set | messages: + msg335575 |
| 2019-02-14 23:36:54 | Mariatta | set | status: open -> closed; nosy: + Mariatta; resolution: fixed |
| 2019-02-14 23:35:48 | miss-islington | set | pull_requests: + pull_request11893 |
| 2019-02-14 23:35:28 | miss-islington | set | nosy: + miss-islington; messages: + msg335573 |
| 2018-12-02 22:06:32 | lys.nikolaou | set | messages: + msg330901 |
| 2018-11-30 09:58:49 | paul.moore | set | messages: + msg330765 |
| 2018-11-30 09:38:07 | lys.nikolaou | set | messages: + msg330764 |
| 2018-10-31 20:01:49 | lys.nikolaou | set | keywords: + patch; stage: patch review; pull_requests: + pull_request9575 |
| 2018-10-29 10:34:17 | steve.dower | set | messages: + msg328798; versions: + Python 3.8 |
| 2018-10-28 22:24:59 | lys.nikolaou | set | nosy: + lys.nikolaou; messages: + msg328766 |
| 2017-05-21 00:53:24 | eryksun | set | messages: + msg294063 |
| 2017-05-20 23:59:14 | steve.dower | set | messages: + msg294061 |
| 2017-05-20 18:37:01 | eryksun | set | nosy: + eryksun; messages: + msg294046 |
| 2017-05-20 08:58:49 | paul.moore | create | |

