This is a follow-on from Issue 15955, which added low-level support for limiting the amount of data returned by the LZMA and bzip2 decompressors. The high-level LZMAFile and BZ2File reader APIs need to make use of the new low-level max_length parameter.
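For context, here is a rough sketch of what the low-level max_length parameter allows, assuming the LZMADecompressor.decompress(data, max_length) signature and the needs_input/eof attributes from Issue 15955. The helper name decompress_in_chunks is hypothetical and used only for illustration.

```python
import lzma

def decompress_in_chunks(compressed, limit):
    """Hypothetical helper: decompress a complete stream, at most `limit` bytes per call."""
    decomp = lzma.LZMADecompressor()
    # Hand over all the compressed input at once, but cap the output size.
    chunks = [decomp.decompress(compressed, max_length=limit)]
    while not decomp.eof:
        if decomp.needs_input:
            # All input was already supplied, so the stream must be truncated.
            raise EOFError("compressed data ended before the end-of-stream marker")
        # Output is still pending inside the decompressor; pass no new input.
        chunks.append(decomp.decompress(b"", max_length=limit))
    return chunks

payload = b"\x00" * 1_000_000          # highly compressible test data
chunks = decompress_in_chunks(lzma.compress(payload), 4096)
```

No single call returns more than 4096 bytes here, even though the whole compressed input is tiny and expands to about 1 MB. That bound is what the high-level readers currently lack.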
I am starting off with a patch for LZMAFile, based on a patch I posted to Issue 15955. I split out a _RawReader class that does the actual decompress() calls, and then wrapped that in a BufferedReader. LZMAFile then just delegates its read methods to the BufferedReader. This avoids needing any special code to implement buffering, readline(), etc. It does involve some API changes, though:
* LZMAFile now uses BufferedReader.peek(). The current implementation seems appropriate, but I am not comfortable with the current specification in the documentation, which says it is allowed to not return any useful data. See Issue 5811.
* read() now accepts size=None, because BufferedReader does. I had to change a test case for this.
* BufferedReader.seek() raises a different exception for an invalid "whence".
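The overall shape of the design can be sketched roughly as follows. This is a simplified illustration, not the code in the patch: SketchRawReader is a hypothetical stand-in for _RawReader, and it handles only a single LZMA stream and omits seeking.

```python
import io
import lzma

class SketchRawReader(io.RawIOBase):
    """Hypothetical stand-in for _RawReader: does only the decompress() calls."""

    def __init__(self, fp):
        self._fp = fp
        self._decomp = lzma.LZMADecompressor()

    def readable(self):
        return True

    def readinto(self, buf):
        with memoryview(buf) as view, view.cast("B") as byte_view:
            size = len(byte_view)
            if size == 0:
                return 0
            while not self._decomp.eof:
                if self._decomp.needs_input:
                    raw = self._fp.read(8192)
                    if not raw:
                        raise EOFError("compressed file ended before the "
                                       "end-of-stream marker")
                else:
                    raw = b""  # output is still pending in the decompressor
                # max_length bounds how much data a single call can produce.
                data = self._decomp.decompress(raw, max_length=size)
                if data:
                    byte_view[:len(data)] = data
                    return len(data)
            return 0  # end of stream

# BufferedReader supplies buffering, readline(), peek(), etc. for free.
payload = b"line one\nline two\n" * 1000
reader = io.BufferedReader(SketchRawReader(io.BytesIO(lzma.compress(payload))))
first = reader.readline()
peeked = reader.peek(1)    # does not advance the read position
rest = reader.read(None)   # size=None is accepted, as noted in the list above
```

The raw reader never needs to know about line boundaries or read-ahead. It only has to fill the buffer it is given, and max_length ensures it never produces more than that.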
Once the LZMAFile patch sees a bit of review and looks like it might be acceptable, I plan to change the BZ2File class in a similar manner.