File reading using delimiters
Noah
noah at noah.org
Mon Jun 9 20:32:38 EDT 2003
kylotan at hotmail.com (Kylotan) wrote in message
news:<153fa67.0306090752.218d23b1 at posting.google.com>...
> All the examples of reading files in Python seem to concern reading a
> line at a time. But this is not much good to me as I want to be able
> to read up to arbitrary delimiters without worrying about how many
> lines I'm spanning. With my rudimentary Python knowledge I'm having to
> read in multiple lines, concatenate them, search for the delimiter,
> split the result if necessary, and carry forward whatever was after
> the delimiter to the next operation. Is there a better way of reading
> until a certain character is encountered, and no more?

My Pexpect module is good for this type of scanning.
    http://pexpect.sourceforge.net/
Your code might then look like the following examples. Pexpect can match
single characters, arbitrary strings, regular expressions, or lists of
any of these.

Note that Pexpect works on file descriptors. It doesn't operate directly
on file-like objects, so it wouldn't work on a StringIO object (maybe I'll
add that feature to future versions). But true file objects have a file
descriptor, so this should do what you want.

Store this test data in a file called "my_happy_file":

    This is data for the first chunk
    FIRST_DELIMITER
    This is now data for the second chunk.
    Notice that this chunk can span multiple lines.
    The delimiter can also be specified as a regular expression.
    The delimiter does not need to be on a separate line. SECOND_DELIMITER

--- First example -------------------------------------------------------------
import pexpect

fin = file ('my_happy_file', 'r')
reader = pexpect.spawn (fin.fileno()) # Uses the file descriptor of fin.

reader.expect ('FIRST_DELIMITER')
first_chunk = reader.before # Everything before the expected delimiter.

reader.expect ('SEC.*_DELIMITER') # The delimiter can be a regular expression.
second_chunk = reader.before

print first_chunk
print second_chunk
-------------------------------------------------------------------------------

Note that since you can look for a regular expression with subgroups, you
could also match all your fields with one regular expression:

--- Second example ------------------------------------------------------------
import pexpect

fin = file ('my_happy_file', 'r')
reader = pexpect.spawn (fin.fileno())

# Each parenthesized group captures one chunk.
reader.expect ('(.*)FIRST_DELIMITER(.*)SECOND_DELIMITER')
print reader.match.group(1)
print reader.match.group(2)
-------------------------------------------------------------------------------

Yours,
Noah
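
For comparison, the buffering approach described in the quoted question (read some text, search for the delimiter, carry forward whatever follows it) can be wrapped in a small generator that needs no extra module. This is only a minimal sketch in current Python 3, not part of Pexpect: the name read_until and its chunk_size parameter are made up for illustration, and it handles a single fixed delimiter string rather than the regular expressions Pexpect accepts.

```python
import io


def read_until(stream, delimiter, chunk_size=4096):
    """Yield pieces of text from stream, split on a fixed delimiter string.

    Hypothetical helper for illustration. The delimiter itself is dropped,
    and any text left after the last delimiter is yielded as a final piece.
    """
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # End of file.
        buffer += chunk
        # Hand back every complete piece currently held in the buffer.
        # Searching the whole buffer also catches a delimiter that was
        # split across two reads.
        while True:
            index = buffer.find(delimiter)
            if index < 0:
                break
            yield buffer[:index]
            # Carry forward whatever came after the delimiter.
            buffer = buffer[index + len(delimiter):]
    if buffer:
        yield buffer


# Works on any object with a read() method, including StringIO.
sample = io.StringIO("first chunk\nDELIM second\nchunk DELIM tail")
for piece in read_until(sample, "DELIM", chunk_size=4):
    print(repr(piece))
```

Because it only calls read(), the same function should work on a file opened in text mode, for example read_until(open('my_happy_file'), 'FIRST_DELIMITER'). The buffer grows until a delimiter is found, so a file with no delimiter at all ends up entirely in memory.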