bpo-39040: Fix parsing of email mime headers with whitespace between encoded-words. by maxking · Pull Request #17620 · python/cpython

@maxking

…ed-words.

In certain malformed content-disposition headers, parameter values are quoted
and split as encoded words on two lines with extra whitespaces. This fixes the
issue by removing the extra whitespace between the two encoded words.

bitdancer

bitdancer

bitdancer

bitdancer

bitdancer

@maxking maxking changed the title bpo-39040: Fix parsing of email headers with whitespace between encoded-words. bpo-39040: Fix parsing of email mime headers with whitespace between encoded-words.

Dec 17, 2019

@maxking

bitdancer

@maxking

@maxking

bitdancer

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request

May 29, 2020
…encoded-words. (pythongh-17620)

* bpo-39040: Fix parsing of email headers with encoded-words inside a quoted string.

It is fairly common to find malformed mime headers (especially content-disposition
headers) where the parameter values, instead of being encoded to RFC
standards, are "encoded" by doing RFC 2047 "encoded word" encoding, and
then enclosing the whole thing in quotes.  The processing of these malformed
headers was incorrectly leaving the spaces between encoded words in the decoded
text (whitespace between adjacent encoded words is supposed to be stripped on
decoding).  This changeset fixes the encoded word processing inside quoted strings
(bare-quoted-string) to do correct RFC 2047 decoding by stripping that
whitespace.
(cherry picked from commit 21017ed)

Co-authored-by: Abhilash Raj <maxking@users.noreply.github.com>

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request

May 29, 2020
…encoded-words. (pythongh-17620)

* bpo-39040: Fix parsing of email headers with encoded-words inside a quoted string.

It is fairly common to find malformed mime headers (especially content-disposition
headers) where the parameter values, instead of being encoded to RFC
standards, are "encoded" by doing RFC 2047 "encoded word" encoding, and
then enclosing the whole thing in quotes.  The processing of these malformed
headers was incorrectly leaving the spaces between encoded words in the decoded
text (whitespace between adjacent encoded words is supposed to be stripped on
decoding).  This changeset fixes the encoded word processing inside quoted strings
(bare-quoted-string) to do correct RFC 2047 decoding by stripping that
whitespace.
(cherry picked from commit 21017ed)

Co-authored-by: Abhilash Raj <maxking@users.noreply.github.com>

miss-islington added a commit that referenced this pull request

May 29, 2020
…encoded-words. (gh-17620)

* bpo-39040: Fix parsing of email headers with encoded-words inside a quoted string.

It is fairly common to find malformed mime headers (especially content-disposition
headers) where the parameter values, instead of being encoded to RFC
standards, are "encoded" by doing RFC 2047 "encoded word" encoding, and
then enclosing the whole thing in quotes.  The processing of these malformed
headers was incorrectly leaving the spaces between encoded words in the decoded
text (whitespace between adjacent encoded words is supposed to be stripped on
decoding).  This changeset fixes the encoded word processing inside quoted strings
(bare-quoted-string) to do correct RFC 2047 decoding by stripping that
whitespace.
(cherry picked from commit 21017ed)

Co-authored-by: Abhilash Raj <maxking@users.noreply.github.com>

miss-islington added a commit that referenced this pull request

May 29, 2020
…encoded-words. (gh-17620)

* bpo-39040: Fix parsing of email headers with encoded-words inside a quoted string.

It is fairly common to find malformed mime headers (especially content-disposition
headers) where the parameter values, instead of being encoded to RFC
standards, are "encoded" by doing RFC 2047 "encoded word" encoding, and
then enclosing the whole thing in quotes.  The processing of these malformed
headers was incorrectly leaving the spaces between encoded words in the decoded
text (whitespace between adjacent encoded words is supposed to be stripped on
decoding).  This changeset fixes the encoded word processing inside quoted strings
(bare-quoted-string) to do correct RFC 2047 decoding by stripping that
whitespace.
(cherry picked from commit 21017ed)

Co-authored-by: Abhilash Raj <maxking@users.noreply.github.com>

miss-islington added a commit that referenced this pull request

May 29, 2020
…encoded-words. (gh-17620)

* bpo-39040: Fix parsing of email headers with encoded-words inside a quoted string.

It is fairly common to find malformed mime headers (especially content-disposition
headers) where the parameter values, instead of being encoded to RFC
standards, are "encoded" by doing RFC 2047 "encoded word" encoding, and
then enclosing the whole thing in quotes.  The processing of these malformed
headers was incorrectly leaving the spaces between encoded words in the decoded
text (whitespace between adjacent encoded words is supposed to be stripped on
decoding).  This changeset fixes the encoded word processing inside quoted strings
(bare-quoted-string) to do correct RFC 2047 decoding by stripping that
whitespace.
(cherry picked from commit 21017ed)

Co-authored-by: Abhilash Raj <maxking@users.noreply.github.com>

CuriousLearner added a commit to CuriousLearner/cpython that referenced this pull request

May 30, 2020
* 'master' of github.com:python/cpython: (497 commits)
  bpo-40061: Fix a possible refleak in _asynciomodule.c (pythonGH-19748)
  bpo-40798: Generate a different message for already removed elements (pythonGH-20483)
  closes bpo-29017: Update the bindings for Qt information with PySide2 (pythonGH-20149)
  bpo-39885: Make IDLE context menu cut and copy work again (pythonGH-18951)
  bpo-29882: Add an efficient popcount method for integers (python#771)
  Further de-linting of zoneinfo module (python#20499)
  bpo-40780: Fix failure of _Py_dg_dtoa to remove trailing zeros (pythonGH-20435)
  Indicate that abs() method accept argument that implement __abs__(), just like call() method in the docs (pythonGH-20509)
  bpo-39040: Fix parsing of email mime headers with whitespace between encoded-words. (pythongh-17620)
  bpo-40784: Fix sqlite3 deterministic test (pythonGH-20448)
  bpo-30064: Properly skip unstable loop.sock_connect() racing test (pythonGH-20494)
  Note the output ordering of combinatoric functions (pythonGH-19732)
  bpo-40474: Updated coverage.yml to better report coverage stats (python#19851)
  bpo-40806: Clarify that itertools.product immediately consumes its inpt (pythonGH-20492)
  bpo-1294959: Try to clarify the meaning of platlibdir (pythonGH-20332)
  bpo-37878: PyThreadState_DeleteCurrent() was not removed (pythonGH-20489)
  bpo-40777: Initialize PyDateTime_IsoCalendarDateType.tp_base at run-time (pythonGH-20493)
  bpo-40755: Add missing multiset operations to Counter() (pythonGH-20339)
  bpo-25920: Remove socket.getaddrinfo() lock on macOS (pythonGH-20177)
  bpo-40275: Fix test.support.threading_helper (pythonGH-20488)
  ...