WWW/urllib.urlretrieve problems
Oleg Broytmann
phd at emerald.netskate.ru
Wed Jul 14 10:03:28 EDT 1999
More information about the Python-list mailing list
- Previous message (by thread): WWW/urllib.urlretrieve problems
- Next message (by thread): Building Python static lib in WIN32
Hello!

I ran a URL checker based on urllib, and it reported some errors. I
investigated what was going on, and I need some advice.

The first problem URL is http://www.expert.ru/ - it just times out. When I
pointed Netscape at the address, the page appeared fine. Lynx and telnet
timed out too! I don't understand it. What does Netscape Communicator do
that is so good?

The second problem is with http://w3.one.net/~alward/. Netscape and lynx
showed the page; urllib.urlretrieve() and telnet returned error 403 -
Forbidden:

---------- Session ----------
phd at emerald 204 >>> t w3.one.net 80
Trying 206.112.192.125...
Connected to w3.one.net.
Escape character is '^]'.
GET /~alward/ HTTP/1.0
Host: w3.one.net

HTTP/1.1 403 Forbidden
Date: Wed, 14 Jul 1999 13:48:03 GMT
Server: Apache/OneNet-W3 (LoadBal/2.1-s1) mod_perl/1.18 PHP/3.0.6
Connection: close
Content-Type: text/html

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML><HEAD>
<TITLE>403 Forbidden</TITLE>
</HEAD><BODY>
<H1>Forbidden</H1>
You don't have permission to access /~alward/ on this server.<P>
<P>Additionally, a 403 Forbidden error was encountered while trying to use
an ErrorDocument to handle the request.
</BODY></HTML>
Connection closed by foreign host.
---------- /Session ----------

It seems telnet and urllib need some additional HTTP/1.1 headers, but what
headers?

The third problem is on http://www.tucows.com/. Netscape and lynx showed
the page; urllib raised an exception:

---------- Session ----------
('http error', -1, '<html>\012', None)
Traceback (innermost last):
  File "./test.py", line 12, in ?
    fname, headers = urllib.urlretrieve(url)
  File "/usr/local/lib/python1.5/urllib.py", line 66, in urlretrieve
    return _urlopener.retrieve(url, filename, reporthook)
  File "/usr/local/lib/python1.5/urllib.py", line 184, in retrieve
    fp = self.open(url)
  File "/usr/local/lib/python1.5/urllib.py", line 157, in open
    return getattr(self, name)(url)
  File "/usr/local/lib/python1.5/urllib.py", line 272, in open_http
    return self.http_error(url, fp, errcode, errmsg, headers)
  File "/usr/local/lib/python1.5/urllib.py", line 289, in http_error
    return self.http_error_default(url, fp, errcode, errmsg, headers)
  File "/usr/local/lib/python1.5/urllib.py", line 295, in http_error_default
    raise IOError, ('http error', errcode, errmsg, headers)
IOError: ('http error', -1, '<html>\012', None)
---------- /Session ----------

The telnet session showed that the server didn't return any HTTP headers -
it just sent HTML. Should urllib test for this, and how should it behave?
I am not sure...

Thanks in advance to anyone who is willing to discuss this.

Oleg.
----
 Oleg Broytmann    Netskate/Inter.Net.Ru    phd at emerald.netskate.ru
           Programmers don't die, they just GOSUB without RETURN.
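[Editor's note on the first problem: Python 1.5's urllib offered no way to
bound a hung connection, so a dead or firewalled host simply stalled the
checker. A minimal sketch in modern Python 3 (urllib.request; the 10-second
value is an illustrative choice, not from the post) that makes the checker
fail fast with an error instead of hanging:]

```python
import socket
import urllib.request

# Give every new socket a deadline, so a stalled host such as
# www.expert.ru produces a timeout error the checker can report.
socket.setdefaulttimeout(10.0)  # seconds; the value is an assumption

def check(url):
    """Return the HTTP status, or the exception if the fetch failed."""
    try:
        with urllib.request.urlopen(url) as resp:
            return resp.status
    except OSError as exc:  # covers timeouts, URLError and HTTPError
        return exc
```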
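[Editor's note on the second problem: when Netscape gets the page but
urllib and telnet get 403, a common cause is the server filtering on the
User-Agent header. A sketch of that workaround in modern urllib.request -
the browser-style agent string below is illustrative, not from the
original post:]

```python
import urllib.request

# Attach a browser-like User-Agent; urllib's default identifies
# itself as a script, which some servers reject with 403.
req = urllib.request.Request(
    "http://w3.one.net/~alward/",
    headers={"User-Agent": "Mozilla/4.6 [en] (X11; U; Linux)"},
)
# Passing req to urllib.request.urlopen() would send the request
# with this header; here we only build it.
```

In the 1.5-era urllib the analogous knob was, if I recall correctly, the
headers added by urllib.URLopener (its addheader() method and version
attribute), which default to a "Python-urllib" agent string.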
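[Editor's note on the third problem: the errcode -1 with errmsg '<html>\012'
suggests the status parser received the first line of the body, because the
server sent no status line or headers at all - an HTTP/0.9-style reply. A
hypothetical helper (not part of urllib) sketching how a tolerant client
could detect this and treat the whole payload as the body:]

```python
def split_response(raw):
    """Split a raw server reply into (status, body).

    A conforming reply starts with a status line like
    b"HTTP/1.1 403 Forbidden". If that prefix is missing, assume an
    HTTP/0.9-style reply: no status, no headers, body only.
    """
    if not raw.startswith(b"HTTP/"):
        return None, raw  # headerless reply: everything is body
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n", 1)[0]
    return int(status_line.split()[1]), body
```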