Issue36180
Created on 2019-03-04 11:03 by enrico, last changed 2022-04-11 14:59 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| broken.zip | enrico, 2019-03-04 11:03 | |
| Messages (2) | |||
|---|---|---|---|
| msg337091 - (view) | Author: Enrico Zini (enrico) | Date: 2019-03-04 11:03 | |
This simple code:
```
import mailbox
mbox = mailbox.mbox("broken.mbox")
for msg in mbox:
    msg.get_payload()
```
Fails rather unexpectedly:
```
$ python3 broken.py
Traceback (most recent call last):
  File "broken.py", line 5, in <module>
    msg.get_payload()
  File "/usr/lib/python3.7/email/message.py", line 267, in get_payload
    payload = bpayload.decode(self.get_param('charset', 'ascii'), 'replace')
TypeError: decode() argument 1 must be str, not tuple
```
(I'm attaching a zip with code and mailbox)
I would have expected either the parameters after text/plain to be ignored when they don't make sense, or the Content-Type header to be ignored entirely.
I have to process a large mailbox archive, and the commit below is my current workaround. It forces me to skip email content that would otherwise be reasonably accessible:
https://salsa.debian.org/nm-team/echelon/commit/617ce935a31f6256257ffb24e11a5666306406c3
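
One way to approximate the behaviour expected above is to avoid the text-returning get_payload() path and decode the raw bytes with a charset that has been validated first. The following is a minimal sketch, not the workaround from the linked commit: guess_charset and decode_text are hypothetical helper names, the character filter is only a heuristic, and falling back to utf-8 is an assumption that may not suit every archive.

```python
import codecs
import email


def guess_charset(msg, default="utf-8"):
    """Return a codec name that Python can look up, or `default`."""
    value = msg.get_param("charset")
    # RFC 2231 extended parameters come back as (charset, language, value).
    if isinstance(value, tuple):
        value = value[2]
    if not isinstance(value, str):
        return default
    # Heuristic (assumption): keep only characters that can appear in a codec name.
    cleaned = "".join(
        ch for ch in value if ch.isascii() and (ch.isalnum() or ch in "-_.")
    )
    try:
        return codecs.lookup(cleaned).name
    except LookupError:
        return default


def decode_text(msg, default="utf-8"):
    """Decode the body of a non-multipart message without trusting its charset."""
    raw = msg.get_payload(decode=True)  # bytes, or None for multipart containers
    if raw is None:
        return None
    return raw.decode(guess_charset(msg, default), "replace")
```

Inside the loop from the report this would be called as decode_text(msg) instead of msg.get_payload(). Note that get_payload(decode=True) also undoes base64 and quoted-printable transfer encodings, which a plain get_payload() does not.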
|
|||
| msg340187 - (view) | Author: Karthikeyan Singaravelan (xtreak) * | Date: 2019-04-14 06:40 | |
A simplified reproducer is below. The tuple is returned from here https://github.com/python/cpython/blob/830b43d03cc47a27a22a50d777f23c8e60820867/Lib/email/message.py#L93 and perhaps is an untested code path? The charset gets a tuple value of ('utf-8��', '', '"utf-8Â\xa0"') .
```
import mailbox
import tempfile

broken_message = """
From list@murphy.debian.org Wed Sep 24 01:22:15 2003
Date: Wed, 24 Sep 2003 07:05:50 +0200
From: Test test <test@example.or>
To: debian-devel-french@lists.debian.org
Subject: Re: Test
Mime-Version: 1.0
Content-Type: text/plain; charset*=utf-8\xa0''utf-8%C2%A0

trés intéressé
"""

with tempfile.NamedTemporaryFile() as f:
    f.write(broken_message.encode())
    f.seek(0)
    msg = mailbox.mbox(f.name)
    for m in msg:
        print(m.get_payload())
```
```
$ ../cpython/python.exe bpo36180.py
Traceback (most recent call last):
  File "bpo36180.py", line 21, in <module>
    print(m.get_payload())
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/email/message.py", line 267, in get_payload
    payload = bpayload.decode(self.get_param('charset', 'ascii'), 'replace')
TypeError: decode() argument 1 must be str, not tuple
sys:1: ResourceWarning: unclosed file <_io.BufferedRandom name='/var/folders/2b/mhgtnnpx4z943t4cc9yvw4qw0000gn/T/tmp4ddavb6g'>
```
|
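
A 3-tuple is how get_param() reports any RFC 2231 extended parameter (name*=charset'language'value), including well-formed ones. The sketch below uses a valid header, not the broken one from this issue, to show the shape of the return value and how email.utils.collapse_rfc2231_value reduces it to a string. The header and variable names are made up for illustration.

```python
import email
from email.utils import collapse_rfc2231_value

# A well-formed RFC 2231 extended parameter: charset'language'value.
raw = b"Content-Type: text/plain; charset*=utf-8''utf-8\n\nbody\n"
msg = email.message_from_bytes(raw)

param = msg.get_param("charset")
# Extended parameters come back as (charset, language, value) rather than
# a plain str, so the result cannot be passed to bytes.decode() directly.
is_tuple = isinstance(param, tuple)

# collapse_rfc2231_value() accepts either form and returns a single str.
charset = collapse_rfc2231_value(param)
print(is_tuple, charset)
```

For the malformed header in this issue the collapsed value is still not guaranteed to name a real codec, so a caller would need to validate it before decoding.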
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:59:11 | admin | set | github: 80361 |
| 2019-04-14 06:40:20 | xtreak | set | nosy: + xtreak; messages: + msg340187 |
| 2019-03-04 11:53:02 | SilentGhost | set | versions: + Python 3.7, Python 3.8; nosy: + barry, r.david.murray; components: + email |
| 2019-03-04 11:04:54 | mapreri | set | nosy: + mapreri |
| 2019-03-04 11:03:37 | enrico | create | |
