gh-121650: Encode newlines in headers, and verify headers are sound (GH-122233)

## Encode header parts that contain newlines

Per RFC 2047:

> [...] these encoding schemes allow the
> encoding of arbitrary octet values, mail readers that implement this
> decoding should also ensure that display of the decoded data on the
> recipient's terminal will not cause unwanted side-effects

It seems that the "encoded-word" scheme is a valid way to include
a newline character in a header value, just as we already allow
undecodable bytes or control characters.
Such values do need to be properly encoded when serialized to text, though.
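
As a rough illustration of the idea (not the code path this commit changes), the sketch below uses the standard library's `email.quoprimime.header_encode` to turn a value containing a newline into an RFC 2047 encoded-word, then decodes it again with `email.header.decode_header`. The sample value and the choice of the "Q" encoding with a `utf-8` charset label are assumptions made for the example.

```python
from email import quoprimime
from email.header import decode_header

# Sample header value with an embedded newline (illustrative only).
raw_value = b"a\nb"

# Q-encode the bytes as a single RFC 2047 encoded-word.
# The newline becomes the escape "=0A", so no literal line break
# appears in the serialized text.
encoded = quoprimime.header_encode(raw_value, charset="utf-8")

# Decoding recovers the original bytes, including the newline.
decoded = decode_header(encoded)

print(encoded)
print(decoded)
```

The serialized form stays on one line, while a reader that decodes the encoded-word still sees the original newline, which is why the RFC warns about display side-effects.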


## Verify that email headers are well-formed

Verification should fail for custom fold() implementations that aren't
careful about newlines.
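
A minimal sketch of what such a check could look like, assuming a folded header must end with the line separator and that any line break inside it must be followed by a space or tab (a continuation line). The names `check_folded_header` and `_BARE_NEWLINE` are hypothetical, and the sketch raises `ValueError` rather than the `HeaderWriteError` class added in this commit so that it runs on interpreters without the change.

```python
import re

# Hypothetical pattern: a line break NOT followed by a space or tab,
# i.e. one that would start a new header line instead of continuing
# the current one.
_BARE_NEWLINE = re.compile(r"\r\n[^ \t]|\r[^ \n\t]|\n[^ \t]")


def check_folded_header(folded, linesep="\n"):
    """Rough check that `folded` looks like one folded header line.

    Raises ValueError if the text is not terminated by `linesep` or
    if it contains a line break that is not a continuation.
    """
    if not folded.endswith(linesep):
        raise ValueError("folded header does not end with a line separator")
    # Ignore the final separator, then look for bare line breaks.
    body = folded[: -len(linesep)]
    if _BARE_NEWLINE.search(body):
        raise ValueError("folded header contains an embedded header line")
    return folded


def is_well_formed(folded):
    """Convenience wrapper returning True/False instead of raising."""
    try:
        check_folded_header(folded)
    except ValueError:
        return False
    return True
```

For example, `is_well_formed("Subject: hi\nBcc: evil@example.com\n")` is rejected because the second line does not start with whitespace, whereas a properly folded value such as `"Subject: a very\n long value\n"` passes.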


Co-authored-by: Bas Bloemsaat <bas@bloemsaat.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Petr Viktorin 2024-07-31 00:19:48 +02:00 committed by GitHub
parent 5912487938
commit 0976339818
10 changed files with 160 additions and 4 deletions


@@ -29,6 +29,10 @@ class CharsetError(MessageError):
     """An illegal charset was given."""
 
 
+class HeaderWriteError(MessageError):
+    """Error while writing headers."""
+
+
 # These are parsing defects which the parser was able to work around.
 class MessageDefect(ValueError):
     """Base class for a message defect."""