Issue 20270: urllib.parse doesn't work with empty port
According to RFC 3986, the port subcomponent is defined as zero or more decimal digits delimited from the host by a single colon. That is, 'python.org:' is a valid (but not normalized) form. An empty port is equivalent to an absent port.
>>> import urllib.parse
>>> p = urllib.parse.urlparse('http://python.org:')
>>> p.hostname
'python.org'
>>> p.port # should return None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython-3.3/Lib/urllib/parse.py", line 156, in port
    port = int(port, 10)
ValueError: invalid literal for int() with base 10: ''
>>> urllib.parse.splitport('python.org:') # should return ('python.org', None)
('python.org:', None)
>>> urllib.parse.splitnport('python.org:') # should return ('python.org', -1)
('python.org', None)
>>> urllib.parse.splitnport('python.org:', 80) # should return ('python.org', 80)
('python.org', None)
The proposed patch fixes this. It also adds tests for urllib.parse.splitport().
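For illustration only, the expected behavior can be sketched as a standalone function. This is not the code from the patch: split_host_port and _PORT_RE are hypothetical names, and the sketch assumes that an empty port should simply fall back to the caller's default, as an absent port does.

```python
import re

# Hypothetical pattern: everything up to the last colon is the host,
# followed by zero or more ASCII digits (the RFC 3986 port grammar).
_PORT_RE = re.compile(r'(.*):([0-9]*)', re.DOTALL)


def split_host_port(netloc, default=None):
    """Split 'host:port' into (host, port).

    Hypothetical helper, not part of urllib.parse. An empty port
    ('python.org:') is treated the same as an absent port and yields
    the default value.
    """
    match = _PORT_RE.fullmatch(netloc)
    if match is None:
        # No trailing ':digits' part, e.g. 'python.org' or '[::1]'.
        return netloc, default
    host, port = match.groups()
    # An empty digit string means the port was left empty.
    return host, (int(port) if port else default)


print(split_host_port('python.org:'))
print(split_host_port('python.org:', 80))
print(split_host_port('python.org:8080'))
```

In this sketch the trailing colon is stripped from the host in every case, and the default is only used when no digits follow the colon or when there is no port part at all.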