Message 303612 - Python tracker

Author methane
Recipients ezio.melotti, methane, mrabarnett
Date 2017-10-03.12:58:01
Message-id <1507035482.04.0.213398074469.issue31677@psf.upfronthosting.co.za>
In-reply-to
Content
email.header has this pattern:

https://github.com/python/cpython/blob/85c0b8941f0c8ef3ed787c9d504712c6ad3eb5d3/Lib/email/header.py#L34-L43

# Match encoded-word strings in the form =?charset?q?Hello_World?=
ecre = re.compile(r'''
  =\?                   # literal =?
  (?P<charset>[^?]*?)   # non-greedy up to the next ? is the charset
  \?                    # literal ?
  (?P<encoding>[qb])    # either a "q" or a "b", case insensitive
  \?                    # literal ?
  (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
  \?=                   # literal ?=
  ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)
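
For illustration, here is a minimal sketch of what the pattern captures. It uses a local copy of the regex rather than importing it from email.header; the names ECRE_COPY and parse_encoded_word are hypothetical and exist only for this example.

```python
import re

# Local copy of the pattern quoted above (hypothetical name, not part of
# the email package).
ECRE_COPY = re.compile(r'''
  =\?                   # literal =?
  (?P<charset>[^?]*?)   # non-greedy up to the next ? is the charset
  \?                    # literal ?
  (?P<encoding>[qb])    # either a "q" or a "b", case insensitive
  \?                    # literal ?
  (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
  \?=                   # literal ?=
  ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)


def parse_encoded_word(text):
    """Return the named groups of the first encoded word in text, or None."""
    match = ECRE_COPY.search(text)
    return match.groupdict() if match else None


print(parse_encoded_word("=?utf-8?q?Hello_World?="))
# {'charset': 'utf-8', 'encoding': 'q', 'encoded': 'Hello_World'}

# re.IGNORECASE is what lets an upper-case "Q" or "B" match [qb].
print(parse_encoded_word("=?utf-8?Q?Hello_World?=")["encoding"])
# Q
```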


Since only 's' and 'i' have additional non-ASCII characters that match them case-insensitively, and the only letters in this pattern are 'q' and 'b', this is not a real bug.
But using re.ASCII would be safer.
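
The difference can be sketched as follows, assuming Python 3.7 or later. In Unicode mode, which is the default for str patterns, re.IGNORECASE lets 's' match U+017F (LATIN SMALL LETTER LONG S), while adding re.ASCII restricts case-insensitive matching to ASCII letters. The helper name matches is hypothetical.

```python
import re

LONG_S = "\u017f"  # LATIN SMALL LETTER LONG S, a non-ASCII case variant of "s"


def matches(pattern, text, flags):
    """Return True if the whole of text matches pattern under flags."""
    return re.fullmatch(pattern, text, flags) is not None


# Unicode mode: the non-ASCII long s is treated as a case variant of "s".
print(matches("s", LONG_S, re.IGNORECASE))              # True

# ASCII mode: only "s" and "S" match.
print(matches("s", LONG_S, re.IGNORECASE | re.ASCII))   # False

# The character class used in the pattern behaves the same in both modes
# for ASCII input.
print(matches("[qb]", "Q", re.IGNORECASE | re.ASCII))   # True
```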

Additionally, email.utils has had the same pattern for 10 years, and it is not used anywhere.
It should be removed.
History
Date                 User     Action  Args
2017-10-03 12:58:02  methane  set     recipients: + methane, ezio.melotti, mrabarnett
2017-10-03 12:58:02  methane  set     messageid: <1507035482.04.0.213398074469.issue31677@psf.upfronthosting.co.za>
2017-10-03 12:58:02  methane  link    issue31677 messages
2017-10-03 12:58:01  methane  create