Issue8672
Created on 2010-05-09 22:44 by matthew.brett, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | | |
| mat.bin | matthew.brett, 2010-05-09 22:44 | binary zlib-compressed data causing decompression error | ||
| zlib-8672.patch | pitrou, 2010-05-10 23:00 | |||
| Messages (9) | |||
|---|---|---|---|
| msg105420 - (view) | Author: Matthew Brett (matthew.brett) | Date: 2010-05-09 22:44 | |
I have a valid zlib-compressed string, attached here as 'mat.bin' (1.7M), that causes an error in zlib.decompress:
>>> import zlib
>>> data = open('mat.bin', 'rb').read()
>>> out = zlib.decompress(data)
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
error: Error -5 while decompressing data
I know these data are valid, because I get the string I was expecting with:
>>> dc_obj = zlib.decompressobj()
>>> out = dc_obj.decompress(data)
As expected, there is no remaining data after this read:
>>> assert dc_obj.flush() == ''
>>>
I believe that the behavior of zlib.decompress(data) and zlib.decompressobj().decompress(data) should be equivalent, and that the error for zlib.decompress(data) is therefore the symptom of a bug.
|
|||
| msg105470 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-10 22:11 | |
After a bit of debugging, it seems your data is not actually a complete zlib stream (*). What did you generate it with?
(*) in technical terms, the zlib never returns Z_STREAM_END when decompressing your data. The decompressobj ignores it, but the top-level decompress() function considers it an error. |
|||
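The difference described in the message above can be reproduced without the original file. The following is a minimal sketch, assuming Python 3.3 or later (for the `eof` attribute on decompression objects). The payload and the choice to cut off the 4-byte Adler-32 trailer are illustrative assumptions and are not derived from mat.bin.

```python
import zlib

# Illustrative payload; unrelated to the attached mat.bin.
payload = b"example payload " * 64
complete = zlib.compress(payload)

# Drop the 4-byte Adler-32 trailer so zlib never reaches Z_STREAM_END.
truncated = complete[:-4]

# The one-shot function treats the missing stream end as an error.
try:
    zlib.decompress(truncated)
    one_shot_failed = False
except zlib.error as exc:
    one_shot_failed = True
    message = str(exc)

# A decompression object returns what it could decode and raises nothing.
dco = zlib.decompressobj()
recovered = dco.decompress(truncated) + dco.flush()

print(one_shot_failed)       # whether zlib.decompress raised
print(recovered == payload)  # whether the object recovered the payload
print(dco.eof)               # False when the end of the stream was never seen
```

Checking `dco.eof` after decompression is one way for a caller to tell a properly terminated stream from one that merely stopped.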
| msg105474 - (view) | Author: Matthew Brett (matthew.brett) | Date: 2010-05-10 22:30 | |
Hi,
> Antoine Pitrou <pitrou@free.fr> added the comment:
>
> After a bit of debugging, it seems your data is not actually a complete zlib stream (*). What did you generate it with?
>
> (*) in technical terms, the zlib never returns Z_STREAM_END when decompressing your data. The decompressobj ignores it, but the top-level decompress() function considers it an error.
Thanks for the debugging. The stream comes from within a matlab 'mat' file. I maintain the scipy matlab file readers; the variables within these files are zlib compressed streams. Is there (should there be) a safe and maintained way to allow me to read a stream that does not return Z_STREAM_END? |
|||
| msg105475 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-10 22:36 | |
> Thanks for the debugging. The stream comes from within a matlab 'mat' file. I maintain the scipy matlab file readers; the variables within these files are zlib compressed streams.
So this would be a Matlab issue, right?
> Is there (should there be) a safe and maintained way to allow me to read a stream that does not return Z_STREAM_END?
Decompressor objects allow you to do that, but I cannot tell you how "maintained" it is. If it has to be maintained, we could add a unit test for it so that regressions get detected. It would be nice if you could provide a very short zlib stream reproducing the issue. |
|||
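For readers who need to accept such streams, the decompressor-object approach mentioned above could be wrapped roughly as follows. This is only a sketch under stated assumptions: `decompress_lenient` and its `chunk_size` parameter are hypothetical names, not part of the zlib module, and the `eof` attribute requires Python 3.3 or later. Corrupt data (as opposed to a missing stream end) would still raise `zlib.error`.

```python
import zlib


def decompress_lenient(data, chunk_size=64 * 1024):
    """Decompress a zlib stream, tolerating a missing end-of-stream marker.

    Hypothetical helper: returns (payload, complete), where `complete`
    is True only if zlib reported the end of the stream.
    """
    dco = zlib.decompressobj()
    parts = []
    # Feed the input in chunks so large buffers are not copied at once.
    for start in range(0, len(data), chunk_size):
        parts.append(dco.decompress(data[start:start + chunk_size]))
        if dco.eof:
            # Stream ended; anything after it is left in dco.unused_data.
            break
    parts.append(dco.flush())
    return b"".join(parts), dco.eof


# Illustrative data: a complete stream and one missing its 4-byte trailer.
payload = b"variable contents " * 1000
stream = zlib.compress(payload)

print(decompress_lenient(stream) == (payload, True))
print(decompress_lenient(stream[:-4]) == (payload, False))
```

Returning the `complete` flag alongside the data leaves the decision to the caller: a file reader might log a warning for an unterminated stream while still using the recovered bytes.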
| msg105477 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-10 22:39 | |
I also think we should improve the zlib module's error messages. I've added a patch in issue8681 for that. With that patch, the message you would have encountered is "Error -5 while decompressing data: incomplete or truncated stream", which is considerably more informative. |
|||
| msg105478 - (view) | Author: Matthew Brett (matthew.brett) | Date: 2010-05-10 22:48 | |
>> Thanks for the debugging. The stream comes from within a matlab 'mat' file. I maintain the scipy matlab file readers; the variables within these files are zlib compressed streams.
>
> So this would be a Matlab issue, right?
Yes, except scipy and numpy aim in part to be an open-source replacement for matlab, so we very much want to be able to read their files.
>> Is there (should there be) a safe and maintained way to allow me to read a stream that does not return Z_STREAM_END?
>
> Decompressor objects allow you to do that, but I cannot tell you how "maintained" it is. If it has to be maintained, we could add a unit test for it so that regressions get detected. It would be nice if you could provide a very short zlib stream reproducing the issue
This is the only .mat file stream I have yet come across that causes the error. Is it possible to knock a portion off the end of a valid stream to reproduce the problem? |
|||
| msg105480 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-10 23:00 | |
Ok, it turned out to be quite easy indeed. Here is a patch adding a test. |
|||
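The contents of zlib-8672.patch are not reproduced in this thread. As a hedged sketch of what a regression test along these lines could look like, the following uses a hypothetical class name, hypothetical method names, and an assumed 4-byte truncation (the Adler-32 trailer); it should not be read as the committed test.

```python
import unittest
import zlib


class IncompleteStreamTest(unittest.TestCase):
    """Hypothetical regression test; not the contents of zlib-8672.patch."""

    def setUp(self):
        self.payload = b"foo" * 100
        # Cut off the Adler-32 trailer so the stream never ends properly.
        self.truncated = zlib.compress(self.payload)[:-4]

    def test_one_shot_rejects_incomplete_stream(self):
        # zlib.decompress() requires a complete stream.
        with self.assertRaises(zlib.error):
            zlib.decompress(self.truncated)

    def test_decompressobj_accepts_incomplete_stream(self):
        # Decompressor objects return the data decoded so far.
        dco = zlib.decompressobj()
        data = dco.decompress(self.truncated) + dco.flush()
        self.assertEqual(data, self.payload)


# Run the two tests directly, without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(IncompleteStreamTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Pinning both behaviours in one test case means a future change that makes either function stricter or more lenient would be caught as a regression.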
| msg105544 - (view) | Author: Gregory P. Smith (gregory.p.smith) | Date: 2010-05-11 21:00 | |
patch looks good. |
|||
| msg105558 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-11 23:39 | |
The patch was committed in r81094 (2.7), r81095 (2.6), r81096 (3.2) and r81097 (3.1). Thank you! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:00 | admin | set | github: 52918 |
| 2010-05-11 23:39:18 | pitrou | set | status: open -> closed resolution: fixed messages: + msg105558 stage: patch review -> resolved |
| 2010-05-11 21:00:23 | gregory.p.smith | set | messages: + msg105544 |
| 2010-05-10 23:00:44 | pitrou | set | nosy: gregory.p.smith, pitrou, matthew.brett components: + Tests, - Library (Lib) stage: needs patch -> patch review |
| 2010-05-10 23:00:32 | pitrou | set | files: + zlib-8672.patch keywords: + patch messages: + msg105480 |
| 2010-05-10 22:48:39 | matthew.brett | set | messages: + msg105478 |
| 2010-05-10 22:39:50 | pitrou | set | nosy: + gregory.p.smith messages: + msg105477 |
| 2010-05-10 22:36:06 | pitrou | set | messages: + msg105475 |
| 2010-05-10 22:30:56 | matthew.brett | set | messages: + msg105474 |
| 2010-05-10 22:11:01 | pitrou | set | nosy: + pitrou messages: + msg105470 |
| 2010-05-09 22:49:10 | pitrou | set | stage: needs patch components: + Library (Lib), - IO versions: + Python 2.7, Python 3.2 |
| 2010-05-09 22:44:04 | matthew.brett | create | |
