Warning about "for line in file:"
Aldo Cortesi
aldo at nullcube.com
Sat Feb 16 08:44:44 EST 2002
More information about the Python-list mailing list
Thus spake Brian Kelley (bkelley at wi.mit.edu):

> Neil Schemenauer wrote:
>
> >Russell E. Owen wrote:
> >
> >>The readline or xreadline file methods work fine, of course.
> >
> >Why "of course"?  iter(file) does the same thing as file.xreadlines().
> >Have you tested xreadlines?
> >
> >  Neil
>
> I had the same problem with xreadlines but "for line in
> file" is MUCH less explicit and leads to errors like this.
>
> file = open(...)
>
> count = 0
> for line in file:
>     if count > 10: break
>     print line
>     count = count + 1
>
> for line in file:
>     print line
>
> Doesn't work like I would expect.  This is essentially
> doing the following:
>
> file = open(...)
>
> count = 0
> for line in file.xreadlines():
>     if count > 10: break
>     print line
>     count = count + 1
>
> for line in file.xreadlines():
>     print line
>
> So what is REALLY happening is that you are creating two
> separate iterators in the above examples.  Writing "for
> line in file" instead of "for line in file.xreadlines()"
> simply hides and confuses this.
>
> The problem with spawning multiple iterators is that there
> is a read cache going on behind the scenes and
> file.xreadlines() doesn't rewind the file to the starting
> point.

Actually this has nothing to do with iterators, or a "read
cache". iter(file) creates a line iterator that does the same
thing as file.readline() every time .next() is called, until it
reaches the end of the file. But file.readline(), just like any
other file read, starts reading at the _current seek position_
of the file.

For instance, say we have a file with one digit per line, like
this:

    1
    2
    3
    4
    5

On your machine it may differ, but on my machine this file is
exactly 10 characters long: each digit is followed by a line
feed.

If we now do:

    file = open("file")
    print file.tell()
    for i in file:
        pass
    print file.tell()

We see that the file position started at character 0, and ended
at character 10. Another attempt to read from the file will
produce nothing.
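For readers on a current Python, the same experiment can be sketched roughly as below. This is an illustration and not part of the original message: it assumes Python 3, writes the five-digit sample to a temporary file (the path and the variable names are made up for the example), and opens the file in binary mode so that tell() reports plain byte offsets.

```python
import os
import tempfile

# Hypothetical sample data: one digit per line, 10 bytes in total.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"1\n2\n3\n4\n5\n")

with open(path, "rb") as f:
    start = f.tell()                    # offset before anything is read
    first_pass = [line for line in f]   # iterate to the end of the file
    end = f.tell()                      # offset after the first loop
    second_pass = [line for line in f]  # starts at the current offset (EOF)

os.remove(path)

print(start, end)
print(len(first_pass), len(second_pass))
```

Under those assumptions the first loop sees all five lines and leaves the offset at 10, so the second loop over the same file object has nothing left to read.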
However, if we now do:

    file.seek(6)

then...

    for i in file:
        print i,

... we get:

    4
    5

In a nutshell, a file read will start at the file offset, which
can be found by calling file.tell() and set by calling
file.seek(). This is the case whether you use iterators,
xreadlines(), readlines(), or just plain read()...

Cheers,

Aldo

-- 
Aldo Cortesi
aldo at nullcube.com
www.nullcube.com