bpo-36778: cp65001 encoding becomes an alias to utf_8 by vstinner · Pull Request #13230 · python/cpython

@vstinner

I reproduced the benchmark from #13110 (comment):

Mean +- std dev: [ref] 156 ns +- 3 ns -> [remove] 105 ns +- 3 ns: 1.48x faster (-32%)
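For anyone who wants to re-run a comparison of this kind with only the standard library, a rough sketch follows. It assumes the timed statement is a short decode; the exact benchmark from #13110 is not reproduced here, and `time_decode` is a hypothetical helper name. On Python 3.8 and later both names resolve to the same codec, so the two timings should be close.

```python
import timeit


def time_decode(encoding, number=200_000, repeat=5):
    """Return the best per-call time, in nanoseconds, of a short decode.

    Hypothetical helper: the statement timed here is an assumption,
    not the benchmark used in the original discussion.
    """
    timer = timeit.Timer(
        "data.decode(enc)",
        globals={"data": b"hello", "enc": encoding},
    )
    # Take the minimum over several runs to reduce noise.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


if __name__ == "__main__":
    for name in ("utf-8", "cp65001"):
        print(f"{name}: {time_decode(name):.0f} ns per call")
```

The numbers depend heavily on the machine and build, so only the relative difference between the two names is meaningful.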

@vstinner

@methane: Are you ok to simply remove cp65001.py?

@paulmon

I can verify that this change fixes the issue with test_startup_imports I found on Windows IoT Core ARM32 as expected.

Also, all test_site and test_codec tests pass on Windows IoT Core with a default code page of 65001.

methane

@methane: Are you ok to simply remove cp65001.py?

Yes.

serhiy-storchaka

 +-----------------+--------------------------------+--------------------------------+
-| cp65001         |                                | Windows only: Windows UTF-8    |
-|                 |                                | (``CP_UTF8``)                  |
+| cp65001         |                                | Alias to ``utf_8`` encoding    |

Add the versionchanged directive.

I think it is better to remove this row and add cp65001 to the list of utf-8 aliases.

I created PR #13240 for your doc change proposal.
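The effect of the alias can be checked from Python itself. The sketch below assumes Python 3.8 or later, where this change is present; `resolves_to_utf8` is a hypothetical helper name. On older versions, looking up `cp65001` either returns a separate codec (Windows) or raises `LookupError` (other platforms).

```python
import codecs
import encodings.aliases


def resolves_to_utf8(name):
    """Return True if the codec registry maps `name` to the UTF-8 codec."""
    return codecs.lookup(name).name == codecs.lookup("utf-8").name


if __name__ == "__main__":
    # Compare the canonical codec names reported by the registry.
    print(resolves_to_utf8("cp65001"))
    # Show the raw entry in the alias table, if there is one.
    print(encodings.aliases.aliases.get("cp65001"))
```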


@unittest.skipUnless(sys.platform == 'win32',
                     'cp65001 is a Windows-only codec')
class CP65001Test(ReadTest, unittest.TestCase):

Does utf-8 pass these tests?

I confirmed utf-8 passes this test.

Does utf-8 pass these tests?

Sorry, I forgot to mention this before merging my PR: yes, I tested on Windows 10 and the test still passed. But CP65001Test is now redundant with UTF8Test.

I confirmed utf-8 passes this test.

Thanks for checking :-)
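To illustrate why a dedicated test class adds nothing once the alias exists, here is a minimal sketch of an equivalence check. It is not the test from Lib/test/test_codecs.py; the class name `AliasEquivalenceTest` and the sample strings are assumptions chosen for illustration, and it assumes Python 3.8 or later.

```python
import unittest

# Illustrative samples: empty, ASCII, 2-byte, 3-byte and 4-byte sequences.
SAMPLES = ["", "abc", "\xe9", "\u20ac", "\U0001f600"]


class AliasEquivalenceTest(unittest.TestCase):
    """Hypothetical check that 'cp65001' behaves exactly like 'utf-8'."""

    def test_round_trip_matches_utf8(self):
        for text in SAMPLES:
            with self.subTest(text=text):
                encoded = text.encode("cp65001")
                self.assertEqual(encoded, text.encode("utf-8"))
                self.assertEqual(encoded.decode("cp65001"), text)

    def test_lone_surrogate_handling(self):
        # Strict mode rejects lone surrogates, as utf-8 does.
        with self.assertRaises(UnicodeEncodeError):
            "\ud800".encode("cp65001")
        # The surrogatepass handler is available under the alias too.
        self.assertEqual(
            "\ud800".encode("cp65001", "surrogatepass"),
            "\ud800".encode("utf-8", "surrogatepass"),
        )


if __name__ == "__main__":
    unittest.main()
```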

@taleinat

Shouldn't this comment in CodePageTest in Lib/test/test_codecs.py also be removed? It no longer makes any sense.

class CodePageTest(unittest.TestCase):
    # CP_UTF8 is already tested by CP65001Test
    CP_UTF8 = 65001

@vstinner

Shouldn't this comment in CodePageTest in Lib/test/test_codecs.py also be removed? It no longer makes any sense.

Right. I wrote PR #13807 to remove the comment.

icanhasmath added a commit to ActiveState/cpython that referenced this pull request

Aug 9, 2024
