Comparing strings from the back?

Dwight Hutto dwightdhutto at gmail.com
Mon Sep 3 22:06:44 EDT 2012

Previous message (by thread): Comparing strings from the back?
Next message (by thread): Comparing strings from the back?
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Sep 3, 2012 at 9:54 PM, Roy Smith <roy at panix.com> wrote:

> There's been a bunch of threads lately about string implementations, and
> that got me t
>
> On
> hinking (which is often a dangerous thing).
>
> Let's assume you're testing two strings for equality.  You've already
> done the obvious quick tests (i.e they're the same length), and you're
> down to the O(n) part of comparing every character.
>
> I'm wondering if it might be faster to start at the ends of the strings
> instead of at the beginning?  If the strings are indeed equal, it's the
> same amount of work starting from either end.  But, if it turns out that
> for real-life situations, the ends of strings have more entropy than the
> beginnings, the odds are you'll discover that they're unequal quicker by
> starting at the end.
>
> It doesn't seem un-plausible that this is the case.  For example, most
> of the filenames I work with begin with "/home/roy/".  Most of the
> strings I use as memcache keys have one of a small number of prefixes.
> Most of the strings I use as logger names have common leading
> substrings.  Things like credit card and telephone numbers tend to have
> much more entropy in the trailing digits.
>
> On the other hand, hostnames (and thus email addresses) exhibit the
> opposite pattern.
>
> Anyway, it's just a thought.  Has anybody studied this for real-life
> usage patterns?
>
> I'm also not sure how this work with all the possible UCS/UTF encodings.
> With some of them, you may get the encoding semantics wrong if you don't
> start from the front.
> --
> http://mail.python.org/mailman/listinfo/python-list
>

First include len(string)/2, in order to include starting at the center of
the string, and threading/weaving by 2 processes out.

import timeit

 do the the rest, and see which has the fastest time.
-- 
Best Regards,
David Hutto
*CEO:* *http://www.hitwebdevelopment.com*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20120903/1f82f562/attachment.html>

Previous message (by thread): Comparing strings from the back?
Next message (by thread): Comparing strings from the back?
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

More information about the Python-list mailing list