efficient 'tail' implementation
Mike Meyer
mwm at mired.org
Thu Dec 8 02:09:58 EST 2005
More information about the Python-list mailing list
s99999999s2003 at yahoo.com writes:
> I have a file which is very large eg over 200Mb, and i am going to use
> python to code a "tail" command to get the last few lines of the file.
> What is a good algorithm for this type of task in python for very big
> files? Initially, i thought of reading everything into an array from
> the file and just get the last few elements (lines) but since it's a
> very big file, don't think is efficient.

Well, 200MB isn't all that big these days. But it's easy to code:

    # untested code
    input = open(filename)
    tail = input.readlines()[-tailcount:]   # last tailcount lines
    input.close()

and you're done. However, it will go through a lot of memory. Fastest
is probably working through it backwards, but that may take multiple
tries to get everything you want:

    # untested code
    input = open(filename)
    blocksize = tailcount * expected_line_length
    tail = []
    while len(tail) <= tailcount:    # one extra element: the first may
        input.seek(-blocksize, 2)    # be a partial line. 2 = from EOF.
        tail = input.read().split('\n')
        blocksize *= 2
    input.close()
    tail = tail[-tailcount:]

It would probably be more efficient to read blocks backwards and paste
them together, but I'm not going to get into that.

<mike
--
Mike Meyer <mwm at mired.org>  http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.
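[Editor's note: the block-backwards approach Mike declines to spell out can be sketched roughly as follows. This is not Mike's code; `tail_lines` and its `blocksize` parameter are illustrative names, and the file is opened in binary mode so that seeking from arbitrary positions is safe.]

    import os

    def tail_lines(filename, tailcount, blocksize=8192):
        """Return the last tailcount lines of filename, reading
        fixed-size blocks from the end and pasting them together."""
        with open(filename, 'rb') as f:
            f.seek(0, os.SEEK_END)
            pos = f.tell()
            data = b''
            # Keep prepending earlier blocks until we have seen more
            # newlines than lines requested (the first fragment may be
            # a partial line) or we reach the start of the file.
            while pos > 0 and data.count(b'\n') <= tailcount:
                step = min(blocksize, pos)
                pos -= step
                f.seek(pos)
                data = f.read(step) + data
            lines = data.split(b'\n')
            # A trailing newline leaves an empty final element; drop it.
            if lines and lines[-1] == b'':
                lines.pop()
            return [line.decode() for line in lines[-tailcount:]]

This reads only as much of the file as it needs, never re-reads a byte, and uses memory proportional to the tail rather than the whole file.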