Merge #14291: if a header has non-ascii unicode, default to CTE using utf-8

In Python2, if a unicode string was assigned as the value of a header,
email would automatically CTE encode it using the UTF8 charset.
This capability was lost in the Python3 translation, and this patch
restores it.

Patch by Ali Ikinci, assisted by R. David Murray.

I also added a fix for the mailbox test that was depending (with a comment
that it was a bad idea to so depend) on non-ASCII causing message_from_string
to raise an error.  It now uses support.patch to induce an error during
message serialization.
This commit is contained in:
R David Murray 2012-03-14 03:03:27 -04:00
commit e2922835b0
5 changed files with 33 additions and 7 deletions

View file

@ -283,7 +283,12 @@ class Header:
# character set, otherwise an early error is thrown.
output_charset = charset.output_codec or 'us-ascii'
if output_charset != _charset.UNKNOWN8BIT:
s.encode(output_charset, errors)
try:
s.encode(output_charset, errors)
except UnicodeEncodeError:
if output_charset!='us-ascii':
raise
charset = UTF8
self._chunks.append((s, charset))
def encode(self, splitchars=';, \t', maxlinelen=None, linesep='\n'):