[Python-Dev] Drastically improving list.sort() for lists of strings/ints
Chris Angelico
rosuav at gmail.com
Sun Sep 11 12:14:06 EDT 2016
More information about the Python-Dev mailing list
Sun Sep 11 12:14:06 EDT 2016
- Previous message (by thread): [Python-Dev] Drastically improving list.sort() for lists of strings/ints
- Next message (by thread): [Python-Dev] Drastically improving list.sort() for lists of strings/ints
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Sun, Sep 11, 2016 at 5:45 PM, Elliot Gorokhovsky <elgo8537 at colorado.edu> wrote: > I am interested in making a non-trivial improvement to list.sort(), but > before I put in the work, I want to test the waters and see if this is > something the community would accept. Basically, I want to implement radix > sort for lists of strings. So list.sort() would detect if it is sorting a > list of strings (which is one of the more common things you sort in python) > and, if so, use in-place radix sort (see > https://xlinux.nist.gov/dads/HTML/americanFlagSort.html). In-place radix > sort is significantly faster for lexicographic sorting than Timsort (or in > general any comparison-based sort, since radix can beat the nlogn barrier). > If you don't believe that last claim, suppose for the sake of the argument > that it's true (because if I actually implemented this I could supply > benchmarks to prove it). I'd like to see these benchmarks, actually. Sounds interesting. How does it fare on different distributions of data - for instance, strings consisting exclusively of ASCII digits and punctuation eg "01:12:35,726 --> 01:12:36,810", or strings consisting primarily of ASCII but with occasional BMP or astral characters, or strings primarily of Cyrillic text, etc? What if every single string begins with a slash character (eg if you're sorting a bunch of path names)? At what list size does it become visibly faster? Could this be put onto PyPI and then benchmarked as lst.sort() vs flagsort(lst) ? ChrisA
- Previous message (by thread): [Python-Dev] Drastically improving list.sort() for lists of strings/ints
- Next message (by thread): [Python-Dev] Drastically improving list.sort() for lists of strings/ints
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
More information about the Python-Dev mailing list