Web-crawling
John J. Lee
jjl at pobox.com
Sat Oct 4 12:26:31 EDT 2003
More information about the Python-list mailing list
"John Bradbury" <john_bradbury at ___cableinet.co.uk> writes: > "Rene Pijlman" <reply.in.the.newsgroup at my.address.is.invalid> wrote in > message news:bretnvcng69nqpoeug71jon4obs0moe63f at 4ax.com... > > John Bradbury: > > >I am trying to develop a special putpose crawler using htmllib & urllib. > > >How do you tell the server application that you are a modern browser > > >and can handle frames? [...] > > server would care, but you could mimic the User-agent header sent by a [...] > I don't know what is causing the problem, but the site I am accessing is > sending out forms for a browser that has a low resolution and does not > support frames. Excuse my ignorance, but where do you set up the > User-agent header you suggested. For urllib2 (well, almost): http://wwwsearch.sourceforge.net/ClientCookie/doc.html#headers John