Issue11258
Created on 2011-02-20 16:03 by jonash, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| faster-find-library1.diff | jonash, 2011-02-20 16:03 | does all filtering in Python (more I/O) |
| faster-find-library2.diff | jonash, 2011-02-20 16:04 | Pre-filters ldconfig's output using grep (less I/O but requires grep) |
| faster-find-library1-py3k.diff | pitrou, 2011-02-20 16:35 | |
| faster-find-library1-py3k-with-escaped-name.diff | jonash, 2011-02-24 15:14 | with re.escape(name) as suggested by Antoine Pitrou |
| faster-find-library1-py3k-with-escaped-name-try2.diff | jonash, 2011-02-27 17:10 | |
| Messages (17) | |||
|---|---|---|---|
| msg128910 | Author: Jonas H. (jonash) | Date: 2011-02-20 16:03 |
(This applies to all versions of Python I investigated, although the attached patch is for Python 2.7.)

I wondered why `import uuid` took so long, so I did some profiling. It turns out that `find_library` wastes a lot of time because of this crazy regular expression in `_findSoname_ldconfig`.

A quick look at the ldconfig source (namely, the print_cache routine which is invoked when you call `ldconfig -p`, http://sourceware.org/git/?p=glibc.git;a=blob;f=elf/cache.c#l127) confirmed my suspicion that ldconfig's output could easily be parsed without such a regex monster.

I attached two patches that fix this problem. Choose one! ;-)

The ctypes tests pass with my fixes, and here comes some benchmarking:

    $ cat benchmark_ctypes.py
    from ctypes.util import find_library
    for i in xrange(10):
        for lib in ['mm', 'c', 'bz2', 'uuid']:
            find_library(lib)

    # Current implementation
    $ time python benchmark_ctypes.py
    real 0m11.813s
    ...
    $ time python -c 'import uuid'
    real 0m0.625s
    ...

    # With my patch applied
    $ cp /tmp/ctypesutil.py ctypes/util.py
    $ time python benchmark_ctypes.py
    real 0m1.785s
    ...
    $ time python -c 'import uuid'
    real 0m0.182s
    ...
|||
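To illustrate the claim in msg128910 that `ldconfig -p` output can be parsed without a heavy regex, here is a minimal sketch using plain string splitting. It is not the code from either attached patch: the function name `parse_ldconfig_line` is hypothetical, and the entry layout (`soname (flags) => path`, with a leading tab) is an assumption based on the sample output quoted in msg129630.

```python
def parse_ldconfig_line(line):
    """Split one `ldconfig -p` entry into (soname, flags, path).

    Hypothetical helper for illustration only. Returns None for lines
    that are not library entries (for example the summary header).
    """
    # Assumed entry layout: "<soname> (<flags>) => <path>"
    left, sep, path = line.strip().partition(" => ")
    if not sep:
        return None
    soname, _, flags = left.partition(" (")
    return soname, flags.rstrip(")"), path


entry = parse_ldconfig_line(
    "\tlibc.so.6 (libc6,x86-64, OS ABI: Linux 2.6.9) => /lib64/libc.so.6"
)
print(entry)
```

Keeping the flags alongside the soname and the path matters: msg129514 and msg129630 show that ignoring the ABI information and returning the path breaks setups with both 32-bit and 64-bit copies of a library.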
| msg128911 | Author: Jonas H. (jonash) | Date: 2011-02-20 16:06 |
(might also be related to http://bugs.python.org/issue11063) |
|||
| msg128912 | Author: Antoine Pitrou (pitrou) | Date: 2011-02-20 16:35 |
Here is the first patch adapted for py3k. |
|||
| msg128913 | Author: Antoine Pitrou (pitrou) | Date: 2011-02-20 16:36 |
Actually, re.escape is probably still needed around name. |
|||
| msg129313 | Author: Antoine Pitrou (pitrou) | Date: 2011-02-24 21:26 |
Thanks for the new patch. Looking again, I wonder if there's a reason the original regexp was so complicated. ldconfig output here has lines such as:

    libBrokenLocale.so.1 (libc6,x86-64, OS ABI: Linux 2.6.9) => /lib64/libBrokenLocale.so.1
    libBrokenLocale.so.1 (libc6, OS ABI: Linux 2.6.9) => /lib/libBrokenLocale.so.1
    libBrokenLocale.so (libc6,x86-64, OS ABI: Linux 2.6.9) => /usr/lib64/libBrokenLocale.so
    libBrokenLocale.so (libc6, OS ABI: Linux 2.6.9) => /usr/lib/libBrokenLocale.so

Ideally we would factor out the parsing to a separate private function, and have tests for it.
|||
| msg129315 | Author: Jonas H. (jonash) | Date: 2011-02-24 21:34 |
As far as I can tell, it doesn't matter. We're looking for the part after the => in any case - ignoring the ABI/architecture information - so the regex would choose the first of those entries. |
|||
| msg129512 | Author: Antoine Pitrou (pitrou) | Date: 2011-02-26 08:46 |
Ok, I think you're right. I've committed the patch to 3.3 in r88639 after having added a minimal test. Is there a full name I should credit? Thank you for contributing! |
|||
| msg129514 | Author: Antoine Pitrou (pitrou) | Date: 2011-02-26 09:40 |
Reopened; I reverted the commit in r88640. The patch changes behaviour by turning the previous unrooted filename ('libc.so.6') into a full path ('/lib64/libc.so.6'). This breaks builds where multiple versions of a library are available and only one is loadable, e.g. 32-bit builds on 64-bit machines:

    ======================================================================
    ERROR: test_load (ctypes.test.test_loading.LoaderTest)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/home2/buildbot2/slave/hg-3.x.loewis-parallel/build/Lib/ctypes/test/test_loading.py", line 26, in test_load
        CDLL(libc_name)
      File "/home2/buildbot2/slave/hg-3.x.loewis-parallel/build/Lib/ctypes/__init__.py", line 340, in __init__
        self._handle = _dlopen(self._name, mode)
    OSError: /lib64/libc.so.6: wrong ELF class: ELFCLASS64
|||
| msg129627 | Author: Jonas H. (jonash) | Date: 2011-02-27 14:25 |
Humm. Would be great to have the `ldconfig -p` output of such a machine... I can't get ldconfig to recognize 64-bit libraries on my 32-bit machines, so I have no output to test against... |
|||
| msg129630 | Author: Antoine Pitrou (pitrou) | Date: 2011-02-27 15:00 |
Here is an excerpt:

    libc.so.6 (libc6,x86-64, OS ABI: Linux 2.6.9) => /lib64/libc.so.6
    libc.so.6 (libc6, OS ABI: Linux 2.6.9) => /lib/libc.so.6

The "OS ABI" thing is not always there:

    libdrm.so.2 (libc6,x86-64) => /usr/lib64/libdrm.so.2
    libdrm.so.2 (libc6) => /usr/lib/libdrm.so.2

As you see, there are two of them with the same name but in a different path. If you return the absolute path, there is a 50% possibility that you are returning the wrong one ;)

There seem to be two key differences between the original implementation and yours:
- the orig impl matches the abi_type at the beginning of the parentheses, yours simply ignores the abi_type (that should have caught my eye, but that regex looked so much like magic that I didn't try to make sense of it :-))
- the orig impl returns the file name from the beginning of the matched line, yours returns the full path from the end of the line

I guess it should be doable to retain the speed benefit while implementing a matching algorithm closer to the original one.
|||
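The two differences listed in msg129630 suggest what a faster matcher still has to do: anchor on the abi_type at the start of the parentheses, and return the file name from the start of the line rather than the path. A minimal sketch under those assumptions follows. The names `find_soname` and `SAMPLE` are hypothetical, the sample entries are copied from the excerpt in msg129630 (with the leading tab that `ldconfig -p` prints assumed), and this is not claimed to be the patch that was eventually committed.

```python
import re

# Sample `ldconfig -p` entries copied from msg129630.
# Assumption: each entry starts with a tab, as in real ldconfig output.
SAMPLE = (
    "\tlibc.so.6 (libc6,x86-64, OS ABI: Linux 2.6.9) => /lib64/libc.so.6\n"
    "\tlibc.so.6 (libc6, OS ABI: Linux 2.6.9) => /lib/libc.so.6\n"
    "\tlibdrm.so.2 (libc6,x86-64) => /usr/lib64/libdrm.so.2\n"
    "\tlibdrm.so.2 (libc6) => /usr/lib/libdrm.so.2\n"
)


def find_soname(ldconfig_output, name, abi_type):
    """Return the soname of lib<name> for the given ABI, or None.

    Hypothetical helper: matches "lib<name>.<something> (<abi_type>..."
    and returns only the file name, never the path after "=>".
    """
    pattern = r"\s+(lib%s\.\S+)\s+\(%s" % (re.escape(name), re.escape(abi_type))
    match = re.search(pattern, ldconfig_output)
    return match.group(1) if match else None


print(find_soname(SAMPLE, "c", "libc6,x86-64"))
print(find_soname(SAMPLE, "drm", "libc6"))
```

Because only the soname is returned, the dynamic loader still decides which of the same-named files to open, which is what avoids the `wrong ELF class` failure reported in msg129514. Note that in this sketch a plain `libc6` abi_type also matches entries that begin with `libc6,x86-64`; that is harmless here only because the returned soname is the same for both entries.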
| msg129638 | Author: Jonas H. (jonash) | Date: 2011-02-27 17:10 |
> the orig impl matches the abi_type at the beginning of the parentheses,
> yours simply ignores the abi_type (that should have caught my eye, but that
> regex looked so much like magic that I didn't try to make sense of it :-))

Same here. :) The version I attached seems to work for me. It's some kind of compromise -- basically it's the original regex but with the unnecessary, performance-decreasing cruft stripped away.

btw, "Jonas H." is perfectly fine - I don't care about being honored, I just want to `import uuid` without waiting forever. :-)
|||
| msg134025 | Author: Jonas H. (jonash) | Date: 2011-04-19 09:25 |
*push* Any way to get this into the codebase? |
|||
| msg134309 | Author: Roundup Robot (python-dev) | Date: 2011-04-23 15:56 |
New changeset 19d9f0a177de by Antoine Pitrou in branch 'default':
Issue #11258: Speed up ctypes.util.find_library() under Linux by a factor
http://hg.python.org/cpython/rev/19d9f0a177de
|||
| msg134310 | Author: Antoine Pitrou (pitrou) | Date: 2011-04-23 15:57 |
I committed a modified patch. Hopefully the buildbots won't break this time :) |
|||
| msg134352 | Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) | Date: 2011-04-24 22:12 |
19d9f0a177de causes test_ctypes to hang when the test suite is run in the Gentoo sandbox. Please reopen this issue.

    $ sandbox python3.3 -B -m test.regrtest --timeout=10 -v test_ctypes
    == CPython 3.3a0 (default:020ebe0be33e+, Apr 24 2011, 17:52:58) [GCC 4.5.2]
    == Linux-${version}
    == /tmp/test_python_23902
    Testing with flags: sys.flags(debug=0, inspect=0, interactive=0, optimize=0, dont_write_bytecode=1, no_user_site=0, no_site=0, ignore_environment=0, verbose=0, bytes_warning=0, quiet=0)
    [1/1] test_ctypes
    Timeout (0:00:10)!
    Thread 0x00007fc205f54700:
      File "/usr/lib64/python3.3/subprocess.py", line 466 in _eintr_retry_call
      File "/usr/lib64/python3.3/subprocess.py", line 1412 in _execute_child
      File "/usr/lib64/python3.3/subprocess.py", line 766 in __init__
      File "/usr/lib64/python3.3/ctypes/util.py", line 198 in _findSoname_ldconfig
      File "/usr/lib64/python3.3/ctypes/util.py", line 206 in find_library
      File "/usr/lib64/python3.3/ctypes/test/test_find.py", line 15 in <module>
      File "/usr/lib64/python3.3/ctypes/test/__init__.py", line 64 in get_tests
      File "/usr/lib64/python3.3/test/test_ctypes.py", line 11 in test_main
      File "/usr/lib64/python3.3/test/regrtest.py", line 1094 in runtest_inner
      File "/usr/lib64/python3.3/test/regrtest.py", line 887 in runtest
      File "/usr/lib64/python3.3/test/regrtest.py", line 587 in _runtest
      File "/usr/lib64/python3.3/test/regrtest.py", line 711 in main
      File "/usr/lib64/python3.3/test/regrtest.py", line 1672 in <module>
      File "/usr/lib64/python3.3/runpy.py", line 73 in _run_code
      File "/usr/lib64/python3.3/runpy.py", line 160 in _run_module_as_main
|||
| msg134354 | Author: Antoine Pitrou (pitrou) | Date: 2011-04-24 22:16 |
> 19d9f0a177de causes test_ctypes to hang when the test suite is run in the
> Gentoo sandbox. Please reopen this issue.

I'd prefer having a separate issue (which you already opened :-)). The fact that all buildbots work fine after the change suggests to me that the issue is not really the patch I committed.
|||
| msg134355 | Author: Arfrever Frehtes Taifersar Arahesis (Arfrever) | Date: 2011-04-24 22:18 |
OK. We will use issue #11915. |
|||
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:13 | admin | set | github: 55467 |
| 2011-04-24 22:18:57 | Arfrever | set | messages: + msg134355 |
| 2011-04-24 22:16:07 | pitrou | set | messages: + msg134354 |
| 2011-04-24 22:12:57 | Arfrever | set | nosy: + vapier, Arfrever; messages: + msg134352 |
| 2011-04-24 14:20:47 | pitrou | set | status: pending -> closed |
| 2011-04-23 15:57:05 | pitrou | set | status: open -> pending; resolution: fixed; messages: + msg134310; stage: patch review -> resolved |
| 2011-04-23 15:56:25 | python-dev | set | nosy: + python-dev; messages: + msg134309 |
| 2011-04-19 09:25:55 | jonash | set | messages: + msg134025 |
| 2011-02-27 17:10:09 | jonash | set | files: + faster-find-library1-py3k-with-escaped-name-try2.diff; nosy: theller, pitrou, jonash; messages: + msg129638 |
| 2011-02-27 15:00:46 | pitrou | set | nosy: theller, pitrou, jonash; messages: + msg129630 |
| 2011-02-27 14:25:29 | jonash | set | nosy: theller, pitrou, jonash; messages: + msg129627 |
| 2011-02-26 09:40:35 | pitrou | set | status: closed -> open; nosy: theller, pitrou, jonash; messages: + msg129514; resolution: fixed -> (no value) |
| 2011-02-26 08:46:25 | pitrou | set | status: open -> closed; nosy: theller, pitrou, jonash; messages: + msg129512; assignee: theller -> |
| 2011-02-24 21:34:35 | jonash | set | nosy: theller, pitrou, jonash; messages: + msg129315 |
| 2011-02-24 21:26:54 | pitrou | set | nosy: theller, pitrou, jonash; messages: + msg129313 |
| 2011-02-24 15:14:08 | jonash | set | files: + faster-find-library1-py3k-with-escaped-name.diff; nosy: theller, pitrou, jonash |
| 2011-02-20 16:36:35 | pitrou | set | nosy: theller, pitrou, jonash; messages: + msg128913 |
| 2011-02-20 16:35:29 | pitrou | set | files: + faster-find-library1-py3k.diff; nosy: theller, pitrou, jonash; messages: + msg128912 |
| 2011-02-20 16:10:43 | pitrou | set | nosy: + pitrou; stage: patch review; versions: - Python 2.6, Python 2.5, Python 3.1, Python 2.7, Python 3.2 |
| 2011-02-20 16:06:24 | jonash | set | nosy: theller, jonash; messages: + msg128911 |
| 2011-02-20 16:04:57 | jonash | set | files: + faster-find-library2.diff; nosy: theller, jonash |
| 2011-02-20 16:03:58 | jonash | create | |

