mirror of
https://github.com/python/cpython.git
synced 2025-08-03 00:23:06 +00:00
SF bug #1582282; decode_header() incorrectly splits non-conformant RFC
2047-like headers where there is no whitespace between encoded words. This fix changes the matching regexp to include a trailing lookahead assertion that the closing ?= must be followed by whitespace, newline, or end-of-string. This also changes the regexp to add the MULTILINE flag.
parent 47c52a8b60
commit dcd24ae501
3 changed files with 26 additions and 1 deletions
@@ -39,7 +39,8 @@ ecre = re.compile(r'''
   \?                    # literal ?
   (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
   \?=                   # literal ?=
-  ''', re.VERBOSE | re.IGNORECASE)
+  (?=[ \t]|$)           # whitespace or the end of the string
+  ''', re.VERBOSE | re.IGNORECASE | re.MULTILINE)
 
 # Field name regexp, including trailing colon, but not separating whitespace,
 # according to RFC 2822.  Character range is from tilde to exclamation mark.
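The effect of the two regexp changes can be checked in isolation. The following is a minimal sketch, not the module's actual code: the hunk shows only the tail of the pattern, so the leading `=?charset?encoding?` part is an assumption reconstructed from the RFC 2047 encoded-word layout, and the names `old_ecre`, `new_ecre`, `lookahead_only`, `encoded_words` and the sample headers are hypothetical and used only for illustration.

```python
import re

# Assumption: only the last three pattern lines appear in the hunk above.
# The leading "=?charset?encoding?" part is reconstructed from the
# RFC 2047 encoded-word layout and may differ from the real module.
_BASE = r'''
  =\?                   # literal =?
  (?P<charset>[^?]*?)   # non-greedy up to the next ? is the charset
  \?                    # literal ?
  (?P<encoding>[qb])    # either a "q" or a "b", case insensitive
  \?                    # literal ?
  (?P<encoded>.*?)      # non-greedy up to the next ?= is the encoded string
  \?=                   # literal ?=
'''
_LOOKAHEAD = r'''
  (?=[ \t]|$)           # whitespace or the end of the string
'''

# Pattern before the fix: no lookahead, no MULTILINE.
old_ecre = re.compile(_BASE, re.VERBOSE | re.IGNORECASE)
# Pattern after the fix: trailing lookahead plus MULTILINE.
new_ecre = re.compile(_BASE + _LOOKAHEAD,
                      re.VERBOSE | re.IGNORECASE | re.MULTILINE)
# Hypothetical variant, only to show why MULTILINE is needed as well.
lookahead_only = re.compile(_BASE + _LOOKAHEAD, re.VERBOSE | re.IGNORECASE)


def encoded_words(pattern, header):
    """Return every substring of header that pattern treats as an encoded word."""
    return [m.group(0) for m in pattern.finditer(header)]


# Illustrative headers (hypothetical sample data).
glued = 'Sm=?ISO-8859-1?B?9g==?=rg=?ISO-8859-1?B?5Q==?=sbord'
spaced = 'Sm =?ISO-8859-1?B?9g==?= rg =?ISO-8859-1?B?5Q==?= sbord'
folded = '=?ISO-8859-1?Q?a?=\n =?ISO-8859-1?Q?b?='

print(encoded_words(old_ecre, glued))        # two matches: the header is split
print(encoded_words(new_ecre, glued))        # []: left alone as plain text
print(encoded_words(new_ecre, spaced))       # two matches, as intended
print(encoded_words(lookahead_only, folded)) # only the word on the last line
print(encoded_words(new_ecre, folded))       # both words of the folded header
```

Without MULTILINE, `$` matches only at the very end of the string (or just before a final newline), so an encoded word that ends a folded line would fail the new lookahead. With the flag, `$` also matches before each embedded newline, which is how the "newline" case in the commit message is covered.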