Issue32698
Created on 2018-01-28 18:37 by Delgan, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| test.py | Delgan, 2018-01-28 18:37 | |||
| Messages (3) | |||
|---|---|---|---|
| msg310978 - (view) | Author: Delgan (Delgan) * | Date: 2018-01-28 18:37 | |
Hello.
The following code produces an improperly compressed "test.txt.gzip" file:
import gzip
import shutil

input_path = "test.txt"
output_path = input_path + ".gzip"

with open(input_path, 'w') as file:
    file.write("abc" * 10)

with gzip.open(output_path, 'wb') as f_out:
    with open(input_path, 'rb') as f_in:
        shutil.copyfileobj(f_in, f_out)
Although the content can be read correctly using `gzip.open(output_path, 'rb')`, the file cannot be opened correctly with software like 7-Zip or WinRAR.
If I open the "test.txt.gzip" file in such a tool, it appears to contain another "test.txt.gzip" file. If I change the code to use the ".gz" extension and then open "test.txt.gz", it contains the expected "test.txt" file.
The contained "test.txt.gzip" is actually byte-for-byte identical to "test.txt"; only the filename differs, which confuses tools like 7-Zip.
The bug is not present with the compression functions from the "bz2" and "lzma" modules. With those I can use a custom extension and the file can still be compressed and decompressed without issue.
As to why I need to use an extension different from ".gz": I would like to compress an arbitrary ".tar" file given as input to ".tgz". I want users to be able to open the file in their favorite archiver and see that it contains a ".tar" file, rather than being puzzled that it seems to contain the same ".tgz" file.
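The filename that archivers display comes from the optional FNAME field of the gzip header. The following is a minimal sketch, not part of the original report, that reads that field back for both extensions. The helper name `stored_name` is hypothetical, and the behavior it demonstrates (only a trailing ".gz" being stripped from the recorded name) is an observation about CPython's `gzip` module rather than a documented guarantee.

```python
import gzip
import os
import tempfile


def stored_name(path):
    """Return the FNAME field of a gzip header, or None if it is absent."""
    with open(path, "rb") as f:
        header = f.read(10)  # fixed-size part of the gzip header
        if header[:2] != b"\x1f\x8b":
            raise ValueError("not a gzip file")
        flags = header[3]
        if flags & 0x04:  # FEXTRA: skip the optional extra field
            xlen = int.from_bytes(f.read(2), "little")
            f.read(xlen)
        if not flags & 0x08:  # FNAME flag not set
            return None
        name = bytearray()
        while True:  # FNAME is a zero-terminated byte string
            byte = f.read(1)
            if byte in (b"", b"\x00"):
                break
            name += byte
        return name.decode("latin-1")


with tempfile.TemporaryDirectory() as tmp:
    names = {}
    for ext in (".gz", ".gzip"):
        path = os.path.join(tmp, "test.txt" + ext)
        with gzip.open(path, "wb") as f_out:
            f_out.write(b"abc" * 10)
        names[ext] = stored_name(path)

print(names)
```

If the assumption holds, the ".gz" file records "test.txt" while the ".gzip" file records "test.txt.gzip", which matches what 7-Zip shows.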
|
|||
| msg311018 - (view) | Author: Martin Panter (martin.panter) * | Date: 2018-01-28 22:27 | |
According to the documentation, you can use the lower-level GzipFile constructor’s “filename” argument:
>>> with open(output_path, 'wb') as f_out, \
...         gzip.GzipFile(fileobj=f_out, mode='wb', filename=input_path) as f_out, \
...         open(input_path, 'rb') as f_in:
...     shutil.copyfileobj(f_in, f_out)
...
>>> import os
>>> os.system("7z l test.txt.gzip")
[. . .]
   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2018-01-28 22:23:16 .....           30           34  test.txt
------------------- ----- ------------ ------------  ------------------------
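The same workaround can be sketched for the ".tar" to ".tgz" case described in the first message. This is an illustrative example only: the helper `gzip_with_name` and the file names "archive.tar" and "archive.tgz" are assumptions, not names from this issue. It passes the original base name through `GzipFile`'s `filename` argument, then reads the header back to check what was recorded.

```python
import gzip
import os
import shutil
import tarfile
import tempfile


def gzip_with_name(src_path, dst_path):
    """Compress src_path into dst_path, storing src_path's base name in the header."""
    with open(src_path, "rb") as f_in, \
            open(dst_path, "wb") as raw_out, \
            gzip.GzipFile(filename=os.path.basename(src_path),
                          mode="wb", fileobj=raw_out) as gz_out:
        shutil.copyfileobj(f_in, gz_out)


with tempfile.TemporaryDirectory() as tmp:
    # Build a small tar archive to stand in for the user's input file.
    member = os.path.join(tmp, "test.txt")
    with open(member, "w") as f:
        f.write("abc" * 10)
    tar_path = os.path.join(tmp, "archive.tar")
    with tarfile.open(tar_path, "w") as tar:
        tar.add(member, arcname="test.txt")

    tgz_path = os.path.join(tmp, "archive.tgz")
    gzip_with_name(tar_path, tgz_path)

    # Read the recorded name back. This assumes the header has no FEXTRA
    # field, so FNAME starts right after the 10 fixed header bytes.
    with open(tgz_path, "rb") as f:
        data = f.read()
    has_fname = bool(data[3] & 0x08)
    stored = data[10:data.index(b"\x00", 10)].decode("latin-1")

    # The result is still a regular gzip-compressed tar archive.
    with tarfile.open(tgz_path, "r:gz") as tar:
        members = tar.getnames()

print(has_fname, stored, members)
```

Using `os.path.basename` keeps directory components out of the header; an archiver that honors FNAME should then list "archive.tar" inside "archive.tgz".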
|
|||
| msg311160 - (view) | Author: Delgan (Delgan) * | Date: 2018-01-29 19:33 | |
Thanks @martin.panter for your response. I will close this issue as "not a bug", since there is a workaround and the current behavior can be deduced by carefully reading the entire documentation. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:57 | admin | set | github: 76879 |
| 2018-01-29 19:33:55 | Delgan | set | status: open -> closed resolution: not a bug messages: + msg311160 stage: resolved |
| 2018-01-28 22:27:15 | martin.panter | set | nosy: + martin.panter messages: + msg311018 |
| 2018-01-28 18:37:33 | Delgan | create | |
