Large File Parsing
Paul Rubin
Mon Jun 16 00:13:51 EDT 2003
More information about the Python-list mailing list
- Previous message (by thread): Large File Parsing
- Next message (by thread): Large File Parsing
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Robert S Shaffer <r.shaffer9 at verizon.net> writes:

> I have up to a 3 million record file to parse, remove duplicates and
> sort by size then numeric value. Is this the best way to do this in
> Python? The key is the first column and the ,xx needs removed.

Your script is reasonable if you have enough memory to run it over your
input files. If not, the simplest approach is probably to filter

    1234567,12
    123456789012,12

into

    10,1234567,12
    15,123456789012,12

where the leading number you prepend is the length of the line. Then
sort with the Unix sort utility (which does an external sort if the
input is big enough to need it), then filter again to remove the
prepended length.
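A minimal sketch of the two filters, under some assumptions not in the
post: it strips the ",xx" suffix before decorating (so duplicates by
key collapse under `sort -u`), and it zero-pads the prepended length so
a plain lexicographic sort orders by size then value (the unpadded
"10,"/"15," prefixes shown above would need a numeric sort on the first
field, e.g. `sort -t, -k1,1n -k2,2n`, instead):

```python
def decorate(lines):
    """Drop the trailing ",xx", keep only the key (first column),
    and prepend the key's length, zero-padded to four digits so a
    plain text sort orders short keys before long ones."""
    for line in lines:
        key = line.strip().split(",")[0]
        yield "%04d,%s\n" % (len(key), key)

def undecorate(lines):
    """Remove the prepended length field, leaving just the key."""
    for line in lines:
        yield line.split(",", 1)[1]
```

Run the first filter over the input, pipe through `sort -u` (which both
sorts externally and removes duplicate decorated lines), then run the
second filter over the result.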