Issue 522898
Created on 2002-02-26 10:40 by cmalamas, last changed 2022-04-10 16:05 by admin. This issue is now closed.
**Messages (2)**

**msg9423** - Author: Costas Malamas (cmalamas) - Date: 2002-02-26 10:40

The robotparser module incorrectly handles empty paths in the Allow/Disallow directives. According to http://www.robotstxt.org/wc/norobots-rfc.html, the following rule should be a global *allow*:

```
User-agent: *
Disallow:
```

My reading of the RFC is that an empty path is always a global allow (for both Allow and Disallow directives), so that the syntax is backwards compatible -- there was no Allow directive in the original syntax.

Suggested fix: `robotparser.RuleLine.applies_to()` becomes:

```python
def applies_to(self, filename):
    if not self.path:
        self.allowance = 1
    return self.path == "*" or re.match(self.path, filename)
```

**msg9424** - Author: Martin v. Löwis (loewis) - Date: 2002-02-28 15:32

This is fixed in robotparser.py 1.11.
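The reported behavior can be checked against `urllib.robotparser`, the Python 3 successor of the `robotparser` module, which includes this fix. A minimal sketch; the user agent name and URL below are illustrative, not from the report:

```python
import urllib.robotparser

# Parse a robots.txt with an empty Disallow path. Per the RFC draft
# cited in the report, this should be a global allow for all agents.
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow:",  # empty path -> global allow
])

# With the fix applied, every path is fetchable.
print(rp.can_fetch("ExampleBot", "http://example.com/any/path"))  # -> True
```

Before the fix, the empty path fell through the rule matching and could be treated as a disallow, which is why the reporter proposes forcing `allowance = 1` for an empty path.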
**History**

| Date | User | Action | Args |
|---|---|---|---|
| 2022-04-10 16:05:02 | admin | set | github: 36161 |
| 2002-02-26 10:40:51 | cmalamas | create | |
