My Big Dict.
Christian Tismer
tismer at tismer.com
Fri Jul 4 21:09:39 EDT 2003
More information about the Python-list mailing list
Paul Simmonds wrote:

[some alternative implementations]

> I've done some timings on the functions above, here are the results:
>
> Python2.2.1, 200000 line file(all data lines)
> try/except with split: 3.08s
> if with slicing: 2.32s
> try/except with slicing: 2.34s
>
> So slicing seems quicker than split, and using if instead of
> try/except appears to speed it up a little more. I don't know how much
> faster the current version of the interpreter would be, but I doubt
> the ranking would change much.

Interesting.
I doubt that split() itself is slow. Instead, I believe that the
mere fact that you are calling a function instead of using a syntactic
construct makes things slower, since method lookup is not so cheap.
Unfortunately, split() cannot be cached in a local variable, since
it is obtained as a new method of each line, every time.
On the other hand, the same holds for the find method...

Well, I wrote a test program and found that the test results
were very dependent on the order in which the functions were called!
This means the results are not independent, probably due to
memory usage.
Here are some results on Win32, testing repeatedly...

D:\slpdev\src\2.2\src\PCbuild>python -i \python22\py\testlines.py
>>> test()
function test_index for 200000 lines took 1.064 seconds.
function test_find for 200000 lines took 1.402 seconds.
function test_split for 200000 lines took 1.560 seconds.
>>> test()
function test_index for 200000 lines took 1.395 seconds.
function test_find for 200000 lines took 1.502 seconds.
function test_split for 200000 lines took 1.888 seconds.
>>> test()
function test_index for 200000 lines took 1.416 seconds.
function test_find for 200000 lines took 1.655 seconds.
function test_split for 200000 lines took 1.755 seconds.
>>>

For that reason, I added a command line mode for testing
single functions, with these results:

D:\slpdev\src\2.2\src\PCbuild>python \python22\py\testlines.py index
function test_index for 200000 lines took 1.056 seconds.
D:\slpdev\src\2.2\src\PCbuild>python \python22\py\testlines.py find
function test_find for 200000 lines took 1.092 seconds.
D:\slpdev\src\2.2\src\PCbuild>python \python22\py\testlines.py split
function test_split for 200000 lines took 1.255 seconds.

The results look much more reasonable; the index thing still
seems to be optimum.
Then I added another test, using an unbound str.index function,
which was again a bit faster.
Finally, I moved the try..except clause out of the game, by using
an explicit, restartable iterator, see the attached program.

D:\slpdev\src\2.2\src\PCbuild>python \python22\py\testlines.py index3
function test_index3 for 200000 lines took 0.997 seconds.

As a side result, split seems to be unnecessarily slow.

cheers - chris

--
Christian Tismer             :^)   <mailto:tismer at tismer.com>
Mission Impossible 5oftware  :     Have a break! Take a ride on Python's
Johannes-Niemeyer-Weg 9a     :    *Starship* http://starship.python.net/
14109 Berlin                 :     PGP key -> http://wwwkeys.pgp.net/
work +49 30 89 09 53 34  home +49 30 802 86 56  pager +49 173 24 18 776
PGP 0x57F3BF04       9064 F4E1 D754 C2FF 1619  305B C09C 5A3B 57F3 BF04
     whom do you want to sponsor today?   http://www.stackless.com/

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testlines.py
URL: <http://mail.python.org/pipermail/python-list/attachments/20030705/56e834ce/attachment.ksh>
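
The scrubbed attachment is not reproduced here. As a rough illustration
of the variants compared in the message (split, find, index, and an
unbound str.index), the following is a minimal sketch and not the
original testlines.py. The "!" separator, the key/value line format, the
generated test data and all function names are assumptions made for this
example, and it is written for current Python 3 rather than 2.2. The
restartable-iterator variant (index3) is not attempted, since its details
are only in the attachment.

```python
import sys
import time

SEP = "!"  # assumed separator; the real data format is not shown in the message


def make_lines(n):
    """Generate n synthetic 'key!value' lines (hypothetical test data)."""
    return [f"key{i}{SEP}value{i}" for i in range(n)]


def build_split(lines):
    """try/except around split(); lines without a separator are skipped."""
    d = {}
    for line in lines:
        try:
            key, value = line.split(SEP, 1)
        except ValueError:  # unpacking fails when there is no separator
            continue
        d[key] = value
    return d


def build_find(lines):
    """find() plus an explicit if, then slicing."""
    d = {}
    for line in lines:
        pos = line.find(SEP)
        if pos >= 0:
            d[line[:pos]] = line[pos + 1:]
    return d


def build_index(lines):
    """index() raises ValueError when the separator is missing."""
    d = {}
    for line in lines:
        try:
            pos = line.index(SEP)
        except ValueError:
            continue
        d[line[:pos]] = line[pos + 1:]
    return d


def build_index_unbound(lines):
    """Same as build_index, but str.index is looked up once, outside the loop."""
    index = str.index
    d = {}
    for line in lines:
        try:
            pos = index(line, SEP)
        except ValueError:
            continue
        d[line[:pos]] = line[pos + 1:]
    return d


VARIANTS = {
    "split": build_split,
    "find": build_find,
    "index": build_index,
    "index_unbound": build_index_unbound,
}

if __name__ == "__main__":
    # Name one variant on the command line to time it in a fresh process;
    # with no arguments, all variants run in sequence.
    names = [a for a in sys.argv[1:] if a in VARIANTS] or list(VARIANTS)
    lines = make_lines(200000)
    for name in names:
        start = time.perf_counter()
        VARIANTS[name](lines)
        elapsed = time.perf_counter() - start
        print(f"function {name} for {len(lines)} lines took {elapsed:.3f} seconds.")
```

Because the message reports that timings depended on the order of the
calls, running one variant per process (for example with the single
argument "index") is likely the fairer comparison. Absolute numbers on a
current interpreter will differ from the 2.2 figures quoted above.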