bpo-30228: TextIOWrapper uses abs_pos, not tell() by vstinner · Pull Request #1385 · python/cpython
Conversation
The TextIOWrapper constructor now gets directly the abs_pos attribute
of BufferedWriter and BufferedRandom instead of calling the tell()
method to avoid one lseek() syscall on open(fname, "w") and
open(fname, "w+").
Move the buffered structure to _iomodule.h and rename it to
_PyIO_buffered. Add also "pythread.h" to _iomodule.h, needed by
_PyIO_buffered lock.
The TextIOWrapper constructor now gets directly the abs_pos attribute of BufferedWriter and BufferedRandom instead of calling the tell()
This change means that TextIOWrapper becomes inconsistent if the file descriptor is moved direcly using os.lseek()... but BufferedReader/BufferedWriter don't detect neither when the file descriptor is moved directly, no? I mean, abs_pos cached attribute already has the bug, no?
@pitrou: Would you mind to review this one? Does it look like an acceptable optimization?
@serhiy-storchaka: Same questions. Would you mind to review this one? Does it look like an acceptable optimization?
@Haypo: this looks like an acceptable optimization, but the question is whether it brings any significant speedup.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if it's not seekable?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only go into this path if self->seekable is set.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about BufferedReader?
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we only go into this path if self->encoder is set. self->encoder is only set if the buffer is writable.
@pitrou: I replied to your comments. So, what do you think? Is my PR safe?
As I said above : this looks like an acceptable optimization, but the question is whether it brings any significant speedup :-)
Of course I'm not against making optimizations in buffered I/O. I also know that buffering can be tricky (being responsible for the data corruption issue in Python 3.2 made me cautious about this!). But if this can increase performance significantly then ok (and you'll bear the responsability of any regression ;-)).
The TextIOWrapper constructor now gets directly the private abs_pos attribute of BufferedWriter and BufferedRandom instead of calling the tell() method to avoid one lseek() syscall on open(fname, "w") and open(fname, "w+"). Move the buffered structure to _iomodule.h and rename it to _PyIO_buffered. Add also "pythread.h" to _iomodule.h, needed by _PyIO_buffered lock.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters