Please help... with re
Alex Martelli
alex at magenta.com
Wed Jul 26 17:43:43 EDT 2000
"Olivier Dagenais" <olivierS.dagenaisP at canadaA.comM> wrote in message news:bzHf5.48352$1h3.670995 at news20.bellglobal.com... [snip] > A - stream your input character by character > B - when you encounter a space, add all "buffered" characters to the list > C - if you encounter a quote, ignore rule B until you hit another quote > D - if you hit a backslash, ignore rule C for the next character > E - once you run out of characters, add all "buffered" characters to the > list Nice, clean approach. Who knows about performance, since the re engine is coded in C while this FSM would be coded in Python, but worth giving it a try, I think. Here's a rather straightforward coding of it -- it can no doubt be coded more elegantly by making the FSM explicit; this version relies far too much on making its checks in a specific order, and on 'continue' statements to avoid too-deep nesting... still, here comes, coded off-the-cuff: def splitaline(line): result=[] curtok=[] insidequote=0 literalnext=0 for c in line: if literalnext: curtok.append(c) literalnext=0 continue if c=='\\': literalnext=1 continue if insidequote: if c=='"': result.append(string.join(curtok,'')) curtok=[] insidequote=0 else: curtok.append(c) continue if c=='"': insidequote=1 elif c==' ': result.append(string.join(curtok,'')) curtok=[] else: curtok.append(c) if len(curtok): result.append(string.join(curtok,'')) return result > I made an horrid 68 lines monster to split a string to a list of substrings Well, at least this halves it:-). > based on following example: > > This is an "example of a \"splitted\" text " by my monster. > > results to this list: > > [ 'This' , 'is' , 'an' , 'example of a "splitted" text ' , 'by' , 'my' , > 'monster' ] > > But the stuff is too slow to parse the lines of giant log files. Alex