[3.4] bpo-30500: urllib: Simplify splithost by calling into urlparse. (#1849) by vstinner · Pull Request #2291 · python/cpython

@vstinner

The current regex-based splitting produces a wrong result. For example::

  http://abc#@def

Web browsers parse that URL as ``http://abc/#@def``, that is, the host
is ``abc``, the path is ``/``, and the fragment is ``#@def``.

(cherry picked from commit 90e01e5)
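
The mis-parse described above can be reproduced with a short sketch. The regex here is an illustrative pattern that ends the authority only at ``/`` or ``?``; it is an assumption made for demonstration and is not quoted from the CPython source. The names ``naive_host``, ``naive_authority`` and ``naive_hostname`` are likewise hypothetical. ``urllib.parse.urlsplit`` is used as the reference parser, and it reports the fragment without the leading ``#``.

```python
import re
from urllib.parse import urlsplit

URL = "http://abc#@def"

# Illustrative pattern only (an assumption, not copied from CPython):
# the authority is cut at "/" or "?", but not at "#".
naive_host = re.compile(r"//([^/?]*)(.*)", re.DOTALL)

# Drop the scheme, leaving "//abc#@def".
rest = URL.split(":", 1)[1]
naive_authority = naive_host.match(rest).group(1)

# Treat everything after the last "@" as the host, as a
# userinfo-stripping step would.
naive_hostname = naive_authority.rpartition("@")[2]

# Reference parse: "#" ends the authority, so the host is "abc".
parts = urlsplit(URL)

print("naive host:   ", naive_hostname)
print("urlsplit host:", parts.hostname)
print("urlsplit frag:", parts.fragment)
```

With a pattern like this, the text before ``@`` is mistaken for user information and ``def`` is taken as the host, whereas ``urlsplit`` stops the authority at ``#`` and yields ``abc``.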

@vstinner

I tested manually that "./python -m test test_urlparse" passes. Sadly, 3.4 has no pre-commit CI yet.

@larryhastings

I accepted a PR from Serhiy and now there's a conflict from Misc/NEWS. Do you mind changing it to NEWS.d?

@larryhastings

I'll accept this PR once you fix the conflicts.

@vstinner

Victor: "I tested manually that './python -m test test_urlparse' passes. Sadly, 3.4 has no pre-commit CI yet."

I proposed PR #2475 to add CIs.

Larry: "I accepted a PR from Serhiy and now there's a conflict from Misc/NEWS. Do you mind changing it to NEWS.d?"

Sure, I created a NEWS.d entry and rebased my change.

@larryhastings

I'm willing to consider PR 2475 for 3.4, but we can discuss it over there.
