[Python-Dev] Disabling string interning for null and single-char causes segfaults
Terry Reedy
tjreedy at udel.edu
Sat Mar 2 21:32:02 CET 2013
On 3/2/2013 10:08 AM, Nick Coghlan wrote:
> On Sat, Mar 2, 2013 at 1:24 AM, Stefan Bucur <stefan.bucur at gmail.com> wrote:
>> Hi,
>>
>> I'm working on an automated bug finding tool that I'm trying to apply on the
>> Python interpreter code (version 2.7.3). Because of early prototype
>> limitations, I needed to disable string interning in stringobject.c. More
>> precisely, I modified the PyString_FromStringAndSize and PyString_FromString
>> to no longer check for the null and single-char cases, and create instead a
>> new string every time (I can send the patch if needed).
>>
>> However, after applying this modification, when running "make test" I get a
>> segfault in the test___all__ test case.
>>
>> Before digging deeper into the issue, I wanted to ask here if there are any
>> implicit assumptions about string identity and interning throughout the
>> interpreter implementation. For instance, are two single-char strings having
>> the same content supposed to be identical objects?
>>
>> I'm assuming that it's either this, or some refcount bug in the interpreter
>> that manifests only when certain strings are no longer interned and thus
>> have a higher chance to get low refcount values.
>
> In theory, interning is supposed to be a pure optimisation, but it
> wouldn't surprise me if there are cases that assume the described
> strings are always interned (especially the null string case). Our
> test suite would never detect such bugs, as we never disable the
> interning.

Since disabling it required patching functions rather than flipping a
configuration switch, it literally seems not to be a supported option.
If so, I would not consider it a bug for CPython to rely on the
assumption of interning in order to run faster, and I don't think
CPython should be slowed down if that is what removing the assumption
would take. (This is all assuming that the problem is not just a ref
count bug.)

Stefan's question was about 2.7. I am just curious: does 3.3 still
intern (some) unicode chars? Did the 256 interned bytes of 2.x carry
over to 3.x?

> Whether or not we're interested in fixing such bugs would depend on
> the size of the patches needed to address them. From our point of
> view, such bugs are purely theoretical (as the assumption is always
> valid in an unpatched CPython build), so if the problem is too hard to
> diagnose or fix, we're more likely to declare that interning of at
> least those kinds of string values is required for correctness when
> creating modified versions of CPython.

--
Terry Jan Reedy
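
For anyone who wants to poke at the identity question from the Python side, here is a minimal sketch (Python 3 syntax, not taken from Stefan's patch). The function name identity_report is made up for illustration. It builds equal strings at runtime and reports whether they turn out to be the same object. Apart from the sys.intern() case, the results are implementation details of whichever interpreter build runs the code, so no particular output should be assumed.

```python
import sys


def identity_report():
    """Report whether equal strings built at runtime are the same object.

    Hypothetical helper for illustration only. Except for the
    sys.intern() entry, the answers depend on the interpreter build.
    """
    empty_a = "".join([])          # empty string built at runtime
    empty_b = "abc"[:0]            # another runtime empty string
    char_a = chr(97)               # "a" built from a code point
    char_b = "banana"[1]           # "a" obtained by indexing
    long_a = "".join(["inter", "ning"])
    long_b = "".join(["intern", "ing"])

    # The pairs compare equal by value regardless of identity.
    assert empty_a == empty_b and char_a == char_b and long_a == long_b

    return {
        "empty": empty_a is empty_b,
        "single_char": char_a is char_b,
        "longer": long_a is long_b,
        # sys.intern() returns the canonical copy, so this entry is
        # expected to be True on any conforming Python 3.
        "longer_after_intern": sys.intern(long_a) is sys.intern(long_b),
    }


if __name__ == "__main__":
    for name, same in identity_report().items():
        print(f"{name}: same object = {same}")
```

Running this on a stock build and on a build with the interning patched out should show which identities the patch changes; code that compares such strings with "is" instead of "==" is the kind of place where a hidden assumption could live.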