After spending a fair amount of time looking at the patches and RFC 2732, I tend to agree with the patch provided by tlocke. It covers the behavior for parsing an IPv6 URL of the form '[' hostname ']'. RFC 2732 is very short and just says that the hostname of an IPv6 URL should not include the '[' and ']' characters. The patch does just that, which is fine.
If hard pressed on detecting invalid IPv6, I would add an extra check:
+ if "[" in netloc and "]" in netloc:
+     return netloc.split("]")[0][1:].lower()
+ elif "[" in netloc or "]" in netloc:
+     raise ValueError("Invalid IPv6 URL")
This should take care of invalid IPv6 URLs as discussed in this bug.
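For illustration, here is a minimal, self-contained sketch of how such a check might sit inside a hostname lookup. The function name extract_hostname and the userinfo/port handling around the check are assumptions made for the example; this is not the actual patch.

```python
def extract_hostname(netloc):
    """Hypothetical helper: return the lower-cased host part of a netloc.

    Only the two bracket branches mirror the check proposed in this
    comment; the rest is assumed scaffolding for the example.
    """
    # Drop any "user:password@" prefix (assumed handling).
    netloc = netloc.rpartition("@")[2]

    if "[" in netloc and "]" in netloc:
        # Bracketed IPv6 literal: keep what lies between '[' and ']'.
        return netloc.split("]")[0][1:].lower()
    elif "[" in netloc or "]" in netloc:
        # Only one of the two brackets is present.
        raise ValueError("Invalid IPv6 URL")

    # No brackets: strip an optional ":port" suffix (assumed handling).
    host = netloc.partition(":")[0].lower()
    return host or None


if __name__ == "__main__":
    print(extract_hostname("[FEDC:BA98:7654:3210:FEDC:BA98:7654:3210]:80"))
    print(extract_hostname("www.Example.com:8080"))
    try:
        extract_hostname("[::1")
    except ValueError as exc:
        print("rejected:", exc)
```

Note that this only detects a missing bracket. It does not check the order of the brackets or validate the address between them.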
Any comments on this?
Also, regarding the urlparse header docs (this was long pending on my side, sorry), here is a patch against the current version for review. When we address this bug, I shall include RFC 2732 in the list as well.