bpo-29979: Rewrite cgi.parse_multipart to make it consistent with FieldStorage by PierreQuentel · Pull Request #991

bpo-29979: Rewrite cgi.parse_multipart to make it consistent with FieldStorage by PierreQuentel · Pull Request #991 · python/cpython

Pierre Quentel and others added 3 commits

April 14, 2017 13:52

Remove trailing whitespaces

Remove trailing whitespace

twm added a commit to twisted/twisted that referenced this pull request

Dec 16, 2019

bpo-29979's PR python/cpython#991 changed
cgi.parse_multipart() to require a 'CONTENT-LENGTH' member in its
`pdict` parameter (which is documented as "dictionary containing other
parameters of the content type header"). The PR seems to have some
issues:

* The change seems to conflate `pdict` with an environment dict,
  probably because the test passes a dict called `env` as `pdict`.
* 'CONTENT-LENGTH' isn't a customary parameter of the Content-Type
  header. `pdict` is intended to pass the multipart boundary [1].
* In a CGI environment the length of the request body is in the
  CONTENT_LENGTH variable, anyway [2]. Typo?
* FieldStorage doesn't require a Content-Length header anyway! It's
  potionally used to enforce cgi.maxlen, which limits the requests size
  (because this is the layer for that, right?).

#1012 deals with bpo-29979's
backward-incompatible change by passing the HTTP `Content-Length` header
as the 'CONTENT-LENGTH' member of `pdict` when the header exists.

The problem is that in the Twisted context, unlike the CGI context
(where the web server is responsible for removing any transfer encodings),
that header won't necessarily exist, for example when a chunked transfer
encoding is used by the client.

This commit does the simplest possible thing: synthesize the content
length based on what was actually received.

[1]: https://tools.ietf.org/html/rfc7578#section-4.1
[2]: https://tools.ietf.org/html/rfc3875#section-4.1.2

twm added a commit to twisted/twisted that referenced this pull request

Dec 16, 2019

bpo-29979's PR python/cpython#991 changed
cgi.parse_multipart() to require a 'CONTENT-LENGTH' member in its
`pdict` parameter (which is documented as "dictionary containing other
parameters of the content type header"). The PR seems to have some
issues:

* The change seems to conflate `pdict` with an environment dict,
  probably because the test passes a dict called `env` as `pdict`.
* 'CONTENT-LENGTH' isn't a customary parameter of the Content-Type
  header. `pdict` is intended to pass the multipart boundary [1].
* In a CGI environment the length of the request body is in the
  CONTENT_LENGTH variable, anyway [2]. Typo?
* FieldStorage doesn't require a Content-Length header anyway! It's
  potionally used to enforce cgi.maxlen, which limits the requests size
  (because this is the layer for that, right?).

#1012 dealt with bpo-29979's
backward-incompatible change by passing the HTTP `Content-Length` header
as the 'CONTENT-LENGTH' member of `pdict` when the header exists. When
the header isn't given, Request skips parsing the request content.

The problem is that in the Twisted context, unlike the CGI context
(where the web server is responsible for removing any transfer encodings),
that header won't necessarily exist. For example, when a chunked transfer
encoding is used by the client.

This commit does the simplest possible thing: synthesize the content
length based on what was actually received.

[1]: https://tools.ietf.org/html/rfc7578#section-4.1
[2]: https://tools.ietf.org/html/rfc3875#section-4.1.2

rodrigc pushed a commit to twisted/twisted that referenced this pull request

Mar 28, 2020

bpo-29979's PR python/cpython#991 changed
cgi.parse_multipart() to require a 'CONTENT-LENGTH' member in its
`pdict` parameter (which is documented as "dictionary containing other
parameters of the content type header"). The PR seems to have some
issues:

* The change seems to conflate `pdict` with an environment dict,
  probably because the test passes a dict called `env` as `pdict`.
* 'CONTENT-LENGTH' isn't a customary parameter of the Content-Type
  header. `pdict` is intended to pass the multipart boundary [1].
* In a CGI environment the length of the request body is in the
  CONTENT_LENGTH variable, anyway [2]. Typo?
* FieldStorage doesn't require a Content-Length header anyway! It's
  potionally used to enforce cgi.maxlen, which limits the requests size
  (because this is the layer for that, right?).

#1012 dealt with bpo-29979's
backward-incompatible change by passing the HTTP `Content-Length` header
as the 'CONTENT-LENGTH' member of `pdict` when the header exists. When
the header isn't given, Request skips parsing the request content.

The problem is that in the Twisted context, unlike the CGI context
(where the web server is responsible for removing any transfer encodings),
that header won't necessarily exist. For example, when a chunked transfer
encoding is used by the client.

This commit does the simplest possible thing: synthesize the content
length based on what was actually received.

[1]: https://tools.ietf.org/html/rfc7578#section-4.1
[2]: https://tools.ietf.org/html/rfc3875#section-4.1.2