zipfile.Path regression
Bug report
Bug description:
#122906 introduced a regression for archive members whose leading directory name looks like a Windows drive letter (for example "d:"), reproduced here on Linux:
    >>> import io, zipfile
    >>> zf = zipfile.ZipFile(io.BytesIO(), "w")
    >>> zf.writestr("d:/foo", "bar")
    >>> zf.extractall("a")
    >>> open("a/d:/foo").read()
    'bar'
    >>> p = zipfile.Path(zf)
    >>> x = p / "d" / "foo"
    >>> y = p / "d:" / "foo"
    >>> list(p.iterdir())  # before: [Path(None, 'd:/')]
    [Path(None, 'd/')]
    >>> p.root.namelist()  # before: ['d:/foo', 'd:/']
    ['d/foo', 'd/']
    >>> x.exists()  # before: False
    True
    >>> y.exists()  # before: True
    False
    >>> zf.extractall("b")  # before: worked like above
    KeyError: "There is no item named 'd/foo' in the archive"
    >>> x.read_text()  # before: FileNotFoundError
    KeyError: "There is no item named 'd/foo' in the archive"
    >>> y.read_text()  # before: worked
    FileNotFoundError: ...
This happens because _sanitize() treats any leading path component that looks like a drive letter as a drive and removes its colon, regardless of operating system:
    bare = re.sub('^([A-Z]):', r'\1', name, flags=re.IGNORECASE)
By contrast, _extract_member() uses os.path.splitdrive(), which is a no-op on Linux:
    arcname = os.path.splitdrive(arcname)[1]
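The mismatch between the two code paths can be seen in isolation with a small sketch. It assumes nothing beyond the standard library; strip_drive_colon is a hypothetical helper that only wraps the substitution quoted above and is not part of zipfile. posixpath and ntpath are called directly so that the output is the same on any platform.

```python
import ntpath
import posixpath
import re


def strip_drive_colon(name):
    """Hypothetical helper: applies the substitution quoted from _sanitize()."""
    return re.sub('^([A-Z]):', r'\1', name, flags=re.IGNORECASE)


name = "d:/foo"

# The regex removes the colon no matter which platform runs it.
sanitized = strip_drive_colon(name)

# posixpath is what os.path resolves to on Linux: no drive is split off,
# so the member name passes through unchanged.
posix_rest = posixpath.splitdrive(name)[1]

# ntpath is what os.path resolves to on Windows: "d:" is split off as a drive.
nt_rest = ntpath.splitdrive(name)[1]

print(sanitized)   # d/foo
print(posix_rest)  # d:/foo
print(nt_rest)     # /foo
```

On Linux the name list therefore ends up holding "d/foo" while extraction still expects "d:/foo", which appears consistent with the KeyError shown in the session above.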
CPython versions tested on:
3.12
Operating systems tested on:
Linux
Linked PRs
- gh-123270: Replaced SanitizedNames with a more surgical fix. #123354
- [3.13] gh-123270: Replaced SanitizedNames with a more surgical fix. (GH-123354) #123410
- [3.12] gh-123270: Replaced SanitizedNames with a more surgical fix. (GH-123354) #123411
- [3.11] gh-123270: Replaced SanitizedNames with a more surgical fix. (GH-123354) #123425
- [3.10] gh-123270: Replaced SanitizedNames with a more surgical fix. (GH-123354) #123426
- [3.9] gh-123270: Replaced SanitizedNames with a more surgical fix. (GH-123354) #123432
- [3.8] gh-123270: Replaced SanitizedNames with a more surgical fix. (GH-123354) #123433