mangled attempt at using htmllib
Ari Davidow
ari_deja at ivritype.com
Tue Oct 17 14:36:44 EDT 2000
More information about the Python-list mailing list
- Previous message (by thread): mangled attempt at using htmllib
- Next message (by thread): pyGTK -- why can't I center a window?
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Wow! I got sick for a few days and missed this very, very useful tutorial.

As it happens, my goals were slightly different than was apparent from the code:

>> 200 OK <a href="urlstatusgo.html?col=test&url= / http%3A//www.foobar.com/archive/091400.html"> / http://www.foobar.com/archive/091400.html</a>
>
>Well, I think your first misapprehension is that you appear to be expecting HTTP back
>from the urllib readlines() call, when in fact the HTTP is stripped off, and what *you*
>see is just the HTML!

I knew that this particular page would yield such lines. The idea was to evaluate each such line and grab the URL between the anchor_bgn and anchor_end; in the example shown, a simple

    http://www.foobar.com/archive/091400.html

This might have been done more simply with a regular expression, e.g. (with line being the line under evaluation),

    myUrl = re.search(r'<a href.*?>(.*?)</a>', line).group(1)

because, as I seem to be discovering, the "handle_data" stuff in my parser class

>> def handle_data(self, data):
>>     self.c_data=self.c_data+data

doesn't refer to the data inside the anchor tag, which is what I wanted, but to something else (or my current modules aren't asking for the right thing the right way), because printing the contents of self.c_data gives me "none" as a result.

Anyway, just getting straight on the idiosyncrasies of htmllib, and being reminded that cutting and pasting Python code almost ALWAYS requires attention to spaces (tabs convert oddly, and the interpreter on my machine sees them as different, regardless of what they look like), has moved me forward in very nice, useful ways.

Thank you!

ari

--
Ari Davidow
ari at ivritype.com

Sent via Deja.com http://www.deja.com/
Before you buy.
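
For readers following along today, here is a minimal sketch of the two approaches discussed in the message: pulling the anchor text out with a regular expression, and collecting it with a parser whose handle_data only records text while inside an <a> tag. It is not the poster's code. It assumes Python 3, where htmllib no longer exists, so html.parser.HTMLParser stands in for it. SAMPLE, ANCHOR_RE, AnchorTextParser and the two helper functions are hypothetical names, and the sample line is a cleaned-up version of the one quoted above.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample line, modeled on the status-page output quoted above.
SAMPLE = ('200 OK <a href="urlstatusgo.html?col=test&url='
          'http%3A//www.foobar.com/archive/091400.html">'
          'http://www.foobar.com/archive/091400.html</a>')

# Approach 1: a regular expression that captures the text between
# the opening <a href=...> tag and the closing </a>.
ANCHOR_RE = re.compile(r'<a\s+href[^>]*>(.*?)</a>', re.IGNORECASE | re.DOTALL)


def anchor_text_regex(line):
    """Return the text of the first anchor in line, or None if there is none."""
    match = ANCHOR_RE.search(line)
    return match.group(1).strip() if match else None


# Approach 2: a parser subclass. handle_data is called for ALL text in the
# document, so a flag tracks whether we are currently inside an <a> tag.
class AnchorTextParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.current = []   # text pieces of the anchor being read
        self.anchors = []   # finished anchor texts, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
            self.current = []

    def handle_data(self, data):
        if self.in_anchor:
            self.current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_anchor:
            self.anchors.append("".join(self.current).strip())
            self.in_anchor = False


def anchor_text_parser(html):
    """Return a list with the text of every anchor found in html."""
    parser = AnchorTextParser()
    parser.feed(html)
    parser.close()
    return parser.anchors
```

Calling anchor_text_regex(SAMPLE) returns the anchor text as a string, while anchor_text_parser(SAMPLE) returns a list with one entry per anchor. The regular expression is shorter for one-line input like this; the parser is likely the safer choice when anchors span several lines or contain nested markup.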