Issue42627
Created on 2020-12-12 20:27 by benrg, last changed 2021-02-23 12:07 by corona10.
| Messages (2) | |||
|---|---|---|---|
| msg382921 - (view) | Author: (benrg) | Date: 2020-12-12 20:27 | |
If `HKCU\Software\Microsoft\Windows\CurrentVersion\Internet Settings\ProxyServer` contains the string `http=host:123;https=host:456;ftp=host:789`, then getproxies_registry() should return
{'http': 'http://host:123', 'https': 'http://host:456', 'ftp': 'http://host:789'}
for consistency with WinInet and Chromium, but it actually returns
{'http': 'http://host:123', 'https': 'https://host:456', 'ftp': 'ftp://host:789'}
This bug has existed for a very long time (since Python 2.0.1 if not earlier), but it was exposed recently when urllib3 added support for HTTPS-in-HTTPS proxies in version 1.26. Before that, an `https` prefix on the HTTPS proxy URL was silently treated as `http`, accidentally resulting in the correct behavior.
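The difference between the two results quoted above can be reproduced without a Windows registry. The following is only a simplified model of the two prefixing rules, not the actual `urllib.request` source; the function name `parse_per_protocol` and its `default_scheme` parameter are made up for illustration.

```python
import re

# Matches an explicit scheme prefix such as "http://" or "socks5://".
_SCHEME_RE = re.compile(r'(?:[^/:]+)://')


def parse_per_protocol(proxy_server, default_scheme=None):
    """Parse 'http=host:123;https=host:456' into a {protocol: url} dict.

    Hypothetical helper for illustration only.
    default_scheme=None models the behavior reported in this issue
    (each bare address is prefixed with its own protocol name).
    default_scheme='http' models the WinInet/Chromium interpretation
    described in the report (each bare address is prefixed with http://).
    """
    proxies = {}
    for entry in proxy_server.split(';'):
        protocol, address = entry.split('=', 1)
        if not _SCHEME_RE.match(address):
            scheme = default_scheme or protocol
            address = '{}://{}'.format(scheme, address)
        proxies[protocol] = address
    return proxies


value = 'http=host:123;https=host:456;ftp=host:789'
reported = parse_per_protocol(value)
expected = parse_per_protocol(value, default_scheme='http')
print(reported)
print(expected)
```

Here `reported` carries the `https://` and `ftp://` prefixes that urllib3 1.26 and later would interpret literally, while `expected` uses `http://` for every entry.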
There are additional bugs in the treatment of single-proxy strings (the case when the string contains no `=` character).
The Chromium code for parsing the ProxyServer string can be found here: https://source.chromium.org/chromium/chromium/src/+/refs/tags/89.0.4353.1:net/proxy_resolution/proxy_config.cc;l=86
Below is my attempt at modifying the code from `getproxies_registry` to approximately match Chromium's behavior. I could turn this into a patch, but I'd like feedback on the corner cases first.
    if '=' not in proxyServer and ';' not in proxyServer:
        # Use one setting for all protocols.
        # Chromium treats this as a separate category, and some software
        # uses the ALL_PROXY environment variable for a similar purpose,
        # so arguably this should be 'all={}'.format(proxyServer),
        # but this is more backward compatible.
        proxyServer = 'http={0};https={0};ftp={0}'.format(proxyServer)
    for p in proxyServer.split(';'):
        # Chromium and WinInet are inconsistent in their treatment of
        # invalid strings with the wrong number of = characters. It
        # probably doesn't matter.
        protocol, addresses = p.split('=', 1)
        protocol = protocol.strip()
        # Chromium supports more than one proxy per protocol. I don't
        # know how many clients support the same, but handling it is at
        # least no worse than leaving the commas uninterpreted.
        for address in addresses.split(','):
            if protocol in {'http', 'https', 'ftp', 'socks'}:
                # See if address has a type:// prefix
                if not re.match('(?:[^/:]+)://', address):
                    if protocol == 'socks':
                        # Chromium notes that the correct protocol here
                        # is SOCKS4, but "socks://" is interpreted
                        # as SOCKS5 elsewhere. I don't know whether
                        # prepending socks4:// here would break code.
                        address = 'socks://' + address
                    else:
                        address = 'http://' + address
            # A string like 'http=foo;http=bar' will produce a
            # comma-separated list, while previously 'bar' would
            # override 'foo'. That could potentially break something.
            if protocol not in proxies:
                proxies[protocol] = address
            else:
                proxies[protocol] += ',' + address
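To try the corner cases mentioned in the comments, the snippet above can be wrapped in a standalone function. This is only a sketch: the function name `parse_proxy_server` and the locally created `proxies` dict are assumptions (in `getproxies_registry` the dict already exists and the string comes from the registry), and the parsing logic is otherwise unchanged.

```python
import re


def parse_proxy_server(proxy_server):
    """Hypothetical wrapper around the proposed parsing logic."""
    proxies = {}
    if '=' not in proxy_server and ';' not in proxy_server:
        # Single-proxy string: apply it to http, https and ftp.
        proxy_server = 'http={0};https={0};ftp={0}'.format(proxy_server)
    for p in proxy_server.split(';'):
        # Raises ValueError if an entry has no '=' (e.g. a trailing ';').
        protocol, addresses = p.split('=', 1)
        protocol = protocol.strip()
        for address in addresses.split(','):
            if protocol in {'http', 'https', 'ftp', 'socks'}:
                # Only add a scheme when the address has no type:// prefix.
                if not re.match('(?:[^/:]+)://', address):
                    if protocol == 'socks':
                        address = 'socks://' + address
                    else:
                        address = 'http://' + address
            # Repeated protocols accumulate into a comma-separated list.
            if protocol not in proxies:
                proxies[protocol] = address
            else:
                proxies[protocol] += ',' + address
    return proxies


single = parse_proxy_server('host:8080')
repeated = parse_proxy_server('http=foo;http=bar')
multiple = parse_proxy_server('http=a:1,b:2')
socks = parse_proxy_server('socks=host:1080')
print(single, repeated, multiple, socks, sep='\n')
```

With this sketch, `'http=foo;http=bar'` and `'http=a:1,b:2'` both yield a comma-separated value for `http`, which is the backward-compatibility concern raised in the last comment of the snippet.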
|
|||
| msg387468 - (view) | Author: (kotori) | Date: 2021-02-21 17:49 | |
I came across this issue as well. I checked Microsoft's documentation, and it seems `InternetGetProxyInfo` in WinInet is deprecated, while `WinHttpGetIEProxyConfigForCurrentUser` in WinHTTP returns exactly the string that is stored in the registry. https://docs.microsoft.com/en-US/troubleshoot/windows-client/networking/configure-client-proxy-server-settings-by-registry-file According to the same documentation, a proxy server entry can have an "http://" prefix, so I guess it could also support an "https://" prefix if a user sets an HTTPS proxy. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2021-02-23 12:07:33 | corona10 | set | nosy: + corona10 |
| 2021-02-22 19:50:17 | kotori | set | components: + Library (Lib) |
| 2021-02-22 19:49:53 | kotori | set | components: + Windows, - Library (Lib) |
| 2021-02-21 17:49:09 | kotori | set | nosy: + kotori; messages: + msg387468; components: - Windows |
| 2020-12-12 20:27:52 | benrg | create | |