What is Python?
Donn Cave
donn at u.washington.edu
Wed Sep 20 15:54:29 EDT 2000
Quoth Andrew Kuchling <akuchlin at mems-exchange.org>:
[ ... re regular expressions considered harmful ]
| Because people too often apply them to inappropriate tasks; the last
| example I can recall was someone in c.l.p who was trying to use
| regexes to filter out files with 'SCCS' in the path and .java at the
| end.  The regex to do this is not easy to write and not clear.  Part
| of the problem is that, like Prolog, you really need to understand the
| underlying implementation to write regexes properly.  Making regexes
| purely declarative might fix this, but even there .* behaves
| counterintuitively.  An easy way of parsing text has not yet been
| found, I think.

Amen, sort of!  I have said it before, the easiest text parsing I
ever saw was the "PARSE" statement in REXX.  A language I don't know
well, but that's the point, you don't have to know anything to use
this "parse by example" system.

Now it would probably only make a REXX programmer cry, but Aaron
Watters wrote a tparse module that works kind of the same way and
has some extra features.  Cf
ftp://ftp.python.org/pub/www.python.org/ftp/python/contrib-09-Dec-1999/DataStructures/tparsing.py

For an example, I wrote the following up in a few minutes.  It
analyzes a syslog log file, and reports successful and unsuccessful
Kerberos authentication attempts and reasons for failure.

I probably made it more complicated than illustrative by wrapping
the parse templates in my own class; tparse raises a ValueError on
no match, which is the right thing for it to do but made for an
awkward loop full of try/except blocks, and that's why the wrapper.
The clearer way to call the PARSE function is like

    ((x1, x2, value, x4), chars) = template.PARSE(data)

(if the template is like */*<*>* and you want the <*> part.)
	Donn Cave, donn at u.washington.edu
----------------------------------------
import sys
from tparsing import Template

class T:
    # Wraps a tparse Template so that a failed match returns None
    # instead of raising ValueError.  "av" names the attribute to set
    # for each matched field, in order; None skips that field.
    def __init__(self, template, av = None):
        self.template = Template(template, '*')
        self.av = av
    def parse(self, data):
        try:
            result, chars = self.template.PARSE(data)
            i = 0
            if self.av:
                for a in self.av:
                    if a is None:
                        pass
                    else:
                        setattr(self, a, result[i])
                    i = i + 1
            return result
        except ValueError:
            return None

success = T('* authtime *, *@*', (None, None, 'user'))
fail = T('* PREAUTH_FAILED: *@*', (None, 'user'))
nopa = T('* NEEDED_PREAUTH: *@*', (None, 'user'))
paverify = T('* preauth (*) verify failure: *\n', (None, 'patype', 'error'))

# Most recent preauth verify failure, reported with the next
# PREAUTH_FAILED line.
palog = None

while 1:
    s = sys.stdin.readline()
    if not s:
        break
    if success.parse(s):
        print repr(success.user), 'OK'
    elif fail.parse(s):
        if palog:
            print repr(fail.user), palog
            palog = None
        else:
            print repr(fail.user), '?'
    elif nopa.parse(s):
        print repr(nopa.user), '(needed preauth)'
    elif paverify.parse(s):
        palog = '%s (%s)' % (paverify.error, paverify.patype)
    else:
        print "here's an odd one:", s,
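
For readers without tparsing.py at hand, the "parse by example" idea in this message can be approximated in a few lines. This is only a rough sketch in present-day Python: parse_template is a hypothetical name, it is not how tparse is implemented, and it covers only the plain '*' wildcard (none of tparse's extra features). It mimics the behaviour described in the message: a tuple of matched fields plus a character count on success, ValueError on no match. The sample log line is invented for illustration.

```python
def parse_template(template, data, wildcard="*"):
    """Match data against a wildcard template.

    Hypothetical stand-in for the idea behind tparse's PARSE, not its
    actual implementation.  Returns (fields, chars_consumed) and raises
    ValueError when the data does not fit the template.
    """
    literals = template.split(wildcard)
    # Text before the first wildcard must match at the start of data.
    if not data.startswith(literals[0]):
        raise ValueError("no match")
    pos = len(literals[0])
    fields = []
    last = len(literals) - 1
    for idx in range(1, len(literals)):
        lit = literals[idx]
        if lit == "":
            if idx == last:
                # A trailing wildcard takes the rest of the data.
                fields.append(data[pos:])
                pos = len(data)
            else:
                # Two adjacent wildcards: the first one matches nothing.
                fields.append("")
            continue
        # Each wildcard matches up to the next occurrence of the literal.
        end = data.find(lit, pos)
        if end < 0:
            raise ValueError("no match")
        fields.append(data[pos:end])
        pos = end + len(lit)
    return tuple(fields), pos


# The */*<*>* case from the message, keeping only the <*> part.
(x1, x2, value, x4), chars = parse_template("*/*<*>*", "a/b<c>d")

# An invented line in the shape the 'success' template expects.
line = "host kdc: authtime 1, alice@EXAMPLE.ORG\n"
fields, _ = parse_template("* authtime *, *@*", line)
user = fields[2]
```

Each wildcard here stops at the first occurrence of the literal text that follows it, so there is no backtracking to reason about; that is also why it cannot express everything a regex can.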