> I would rather propose to simplify the needle heuristic and only use it
> when the lower byte is non-zero. A properly optimized memchr() (as in
> the glibc / gcc) is definitely faster than our naïve loop.
That would be fine as well. Not sure whether a heuristic is needed in
this case at all: it's probably uncommon to search for a single
character whose lower byte is 0 (most likely you are then searching for
the null character, and not, say, U+0200 LATIN CAPITAL LETTER A WITH
DOUBLE GRAVE).
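
To make the simplified rule concrete, here is a rough sketch of what I
have in mind. It is not the actual patch: find_char16 is a made-up name,
and I am assuming 16-bit code units stored in a uint16_t array. memchr()
is only used when the lower byte is non-zero, and every hit is checked
against the full character, since the byte can also match inside some
other character.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (illustration only): return the index of the first
   occurrence of the 16-bit character ch in s[0..n), or -1 if absent. */
static ptrdiff_t
find_char16(const uint16_t *s, size_t n, uint16_t ch)
{
    unsigned char low = (unsigned char)(ch & 0xff);
    size_t i;

    assert(s != NULL || n == 0);

    if (low != 0) {
        /* Lower byte is non-zero: let memchr() scan the raw bytes. */
        const unsigned char *base = (const unsigned char *)s;
        const unsigned char *end = base + n * sizeof(uint16_t);
        const unsigned char *p = base;

        while (p < end) {
            p = memchr(p, low, (size_t)(end - p));
            if (p == NULL)
                return -1;
            /* Map the byte hit back to the character containing it and
               compare the whole character (works for either byte order). */
            size_t idx = (size_t)(p - base) / sizeof(uint16_t);
            if (s[idx] == ch)
                return (ptrdiff_t)idx;
            /* False positive: resume at the next character boundary. */
            p = base + (idx + 1) * sizeof(uint16_t);
        }
        return -1;
    }

    /* Lower byte is zero (e.g. the null character or U+0200): zero bytes
       are too common for memchr() to help, so use the plain loop. */
    for (i = 0; i < n; i++) {
        if (s[i] == ch)
            return (ptrdiff_t)i;
    }
    return -1;
}
```

With this shape there is nothing left to tune: the only decision is
whether the lower byte is zero.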
In any case, I still think that the heuristic (if any) needs to be
explained better, and needs some justification in the first place.