Issue 45653: Freeze the encodings module.
Created on 2021-10-28 18:11 by eric.snow, last changed 2022-04-11 14:59 by admin.
Messages (8)
msg405211 - (view)
Author: Eric Snow (eric.snow) *
Date: 2021-10-28 18:11
Currently we freeze all the modules imported during runtime initialization, except for the encodings module. It has a lot of submodules and this results in a lot of extra noise in builds. We hadn't frozen it yet because we were still ironing out changes related to frozen modules and the extra noise was a pain. We also waited because we weren't sure if we should freeze all the submodules or just the ones most likely to be used during startup. In the case of the latter, we were also blocked on having __path__ set on the package.
At this point there are no blockers. So we should freeze the encodings module with either all submodules or the most commonly used subset.
msg405213 - (view)
Author: Marc-Andre Lemburg (lemburg) *
Date: 2021-10-28 18:15
encodings is a package. I think you first have to check whether mixing frozen and non-frozen submodules is even supported. I've never tried having only part of a package frozen.
Freezing the whole package certainly works.
msg405247 - (view)
Author: Eric Snow (eric.snow) *
Date: 2021-10-28 19:56
On Thu, Oct 28, 2021 at 12:15 PM Marc-Andre Lemburg <report@bugs.python.org> wrote:
> encodings is a package. I think you first have to check whether mixing
> frozen and non-frozen submodules is even supported. I've never tried
> having only part of a package frozen.
It works as long as __path__ is set properly, which it is now.
FWIW, I tested freezing only some of the submodules a while back and it worked fine. That was using a different branch that I never merged but it should be fine with the different change that got merged. Of course, we'd need to verify that if we went that route.
msg405372 - (view)
Author: Filipe Laíns (FFY00) *
Date: 2021-10-30 15:54
I just tested partially freezing the package, and it seems to be working fine :)
msg405383 - (view)
Author: Marc-Andre Lemburg (lemburg) *
Date: 2021-10-30 18:00
On 30.10.2021 17:54, Filipe Laíns wrote:
>
> I just tested partially freezing the package, and it seems to be working fine :)
FWIW: I think it's best not to bother and simply freeze the whole thing. It's mostly char mappings which compress well and there's a benefit in sharing these using mmap (which the OS does for you with static C data).
msg405390 - (view)
Author: Filipe Laíns (FFY00) *
Date: 2021-10-31 00:43
I have already opened the PR, but I can change it if desired.
msg407323 - (view)
Author: Guido van Rossum (gvanrossum) *
Date: 2021-11-29 20:27
New changeset 02b5ac6091ada0c2df99c4e1eae37ddccbcd91f0 by Kumar Aditya in branch 'main':
bpo-45653: fix test_embed on windows (GH-29814)
https://github.com/python/cpython/commit/02b5ac6091ada0c2df99c4e1eae37ddccbcd91f0
msg408474 - (view)
Author: Christian Heimes (christian.heimes) *
Date: 2021-12-13 18:01
Eric, I have a simple reproducer for the issue:
This works:
$ LC_ALL=en_US.utf-8 TESTPATH=$(pwd)/Lib:$(pwd)/build/lib.linux-x86_64-3.11 ./Programs/_testembed test_init_setpath_config
This fails because it cannot load the ISO-8859-1 / latin-1 codec:
$ LC_ALL=en_US.latin1 TESTPATH=$(pwd)/Lib:$(pwd)/build/lib.linux-x86_64-3.11 ./Programs/_testembed test_init_setpath_config
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = 'conf_program_name'
isolated = 0
environment = 1
user site = 1
import site = 1
is in build tree = 0
stdlib dir = ''
sys._base_executable = 'conf_executable'
sys.base_prefix = ''
sys.base_exec_prefix = ''
sys.platlibdir = 'lib'
sys.executable = 'conf_executable'
sys.prefix = ''
sys.exec_prefix = ''
sys.path = [
'/home/heimes/dev/python/cpython/Lib',
'/home/heimes/dev/python/cpython/build/lib.linux-x86_64-3.11',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
LookupError: unknown encoding: ISO-8859-1
Current thread 0x00007f9c42be6740 (most recent call first):
<no Python frame>
With this patch I'm seeing that encodings.__path__ is not absolute and that __spec__ has an empty submodule_search_locations.
--- a/Lib/encodings/__init__.py
+++ b/Lib/encodings/__init__.py
@@ -98,9 +98,12 @@ def search_function(encoding):
             # module with side-effects that is not in the 'encodings' package.
             mod = __import__('encodings.' + modname, fromlist=_import_tail,
                              level=0)
-        except ImportError:
+        except ImportError as e:
             # ImportError may occur because 'encodings.(modname)' does not exist,
             # or because it imports a name that does not exist (see mbcs and oem)
+            sys.stderr.write(f"exception: {e}\n")
+            sys.stderr.write(f"encodings.__path__: {__path__}\n")
+            sys.stderr.write(f"encodings.__spec__: {__spec__}\n")
             pass
         else:
             break
$ LC_ALL=en_US.latin1 TESTPATH=$(pwd)/Lib:$(pwd)/build/lib.linux-x86_64-3.11 ./Programs/_testembed test_init_setpath_config
exception: No module named 'encodings.latin_1'
encodings.__path__: ['encodings']
encodings.__spec__: ModuleSpec(name='encodings', loader=<class '_frozen_importlib.FrozenImporter'>, origin='frozen', submodule_search_locations=[])
exception: No module named 'encodings.iso_8859_1'
encodings.__path__: ['encodings']
encodings.__spec__: ModuleSpec(name='encodings', loader=<class '_frozen_importlib.FrozenImporter'>, origin='frozen', submodule_search_locations=[])
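For readers wondering why two different module names show up in the output above: the following is a rough sketch of how a codec name might be turned into candidate submodule names. It is not the code from Lib/encodings/__init__.py. candidate_modules is a hypothetical helper, and the explicit lower() call is an assumption standing in for normalization that happens before the search function is called. encodings.normalize_encoding and encodings.aliases.aliases are existing stdlib names.

```python
import encodings
from encodings import aliases


def candidate_modules(encoding):
    """Hypothetical helper: list the encodings submodule names worth trying."""
    # Assumption: lowercasing here stands in for normalization done by the
    # codec registry before the search function sees the name.
    norm = encodings.normalize_encoding(encoding.lower())
    # The alias table maps normalized names to the canonical module name.
    aliased = aliases.aliases.get(norm)
    if aliased is not None:
        # Try the canonical module first, then the normalized name itself.
        return [aliased, norm]
    return [norm]


print(candidate_modules("ISO-8859-1"))
```

For ISO-8859-1 this should give the alias target first and the normalized name second, which matches the order of the two failed imports reported above.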
It should have this search location:
>>> import encodings
>>> encodings.__spec__
ModuleSpec(name='encodings', loader=<class '_frozen_importlib.FrozenImporter'>, origin='frozen', submodule_search_locations=['/home/heimes/dev/python/cpython/Lib/encodings'])
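A quick way to check the same attributes from a regular interpreter, without patching the stdlib, might look like the sketch below. describe_package is a hypothetical helper name, not something in CPython, and the set of fields it reports is only what seemed relevant to this issue.

```python
import importlib.util
import os

import encodings


def describe_package(pkg):
    """Hypothetical helper: collect the attributes that drive submodule lookup."""
    spec = pkg.__spec__
    return {
        "origin": spec.origin,
        "path": list(pkg.__path__),
        # submodule_search_locations can be None for non-packages.
        "search_locations": list(spec.submodule_search_locations or []),
        "all_absolute": all(os.path.isabs(p) for p in pkg.__path__),
    }


info = describe_package(encodings)
for key, value in info.items():
    print(f"{key}: {value}")

# find_spec returns None when the import system cannot locate the submodule.
latin_1_spec = importlib.util.find_spec("encodings.latin_1")
print("encodings.latin_1 found:", latin_1_spec is not None)
```

In the failing configuration described above, I would expect search_locations to come back empty and all_absolute to be False, but I have only reasoned about that from the output in this message and have not run it under _testembed.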
History
Date                 User              Action  Args
2022-04-11 14:59:51  admin             set     github: 89816
2021-12-13 18:01:38  christian.heimes  set     messages: + msg408474
2021-12-10 15:26:47  christian.heimes  set     nosy: + christian.heimes; pull_requests: + pull_request28255
2021-11-29 20:27:42  gvanrossum        set     nosy: + gvanrossum; messages: + msg407323
2021-11-27 09:32:06  kumaraditya       set     pull_requests: + pull_request28047
2021-11-26 06:54:42  kumaraditya       set     nosy: + kumaraditya; pull_requests: + pull_request28024
2021-10-31 00:43:55  FFY00             set     stage: needs patch -> patch review
2021-10-31 00:43:10  FFY00             set     messages: + msg405390; stage: patch review -> needs patch
2021-10-30 18:00:11  lemburg           set     messages: + msg405383
2021-10-30 15:54:48  FFY00             set     keywords: + patch; stage: needs patch -> patch review; pull_requests: + pull_request27599
2021-10-30 15:54:14  FFY00             set     messages: + msg405372
2021-10-28 19:56:41  eric.snow         set     messages: + msg405247
2021-10-28 18:15:15  lemburg           set     nosy: + lemburg; messages: + msg405213
2021-10-28 18:11:52  eric.snow         create