[3.11] gh-94606: Fix error when message with Unicode surrogate not surrogateescaped string (GH-94641) (GH-112972)

(cherry picked from commit 27a5fd8cb8)

Co-authored-by: Sidney Markowitz <sidney@sidney.com>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Miss Islington (bot) 2023-12-11 17:47:25 +01:00 committed by GitHub
parent a37e1473da
commit 5aec2d2452
4 changed files with 49 additions and 16 deletions

@@ -49,10 +49,10 @@ specialsre = re.compile(r'[][\\()<>@,:;".]')
 escapesre = re.compile(r'[\\"]')

 def _has_surrogates(s):
-    """Return True if s contains surrogate-escaped binary data."""
+    """Return True if s may contain surrogate-escaped binary data."""
     # This check is based on the fact that unless there are surrogates, utf8
     # (Python's default encoding) can encode any string. This is the fastest
-    # way to check for surrogates, see issue 11454 for timings.
+    # way to check for surrogates, see bpo-11454 (moved to gh-55663) for timings.
     try:
         s.encode()
         return False
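
The docstring change from "contains" to "may contain" reflects that a failed UTF-8 encode only proves the string holds some surrogate code point, not that the surrogate came from the surrogateescape error handler. The following is a minimal sketch of that distinction, not the code from this commit: `has_surrogates_sketch` and `round_trips_as_bytes` are hypothetical names, and the `except` branch is an assumption, since the hunk above ends before it.

```python
def has_surrogates_sketch(s):
    """Return True if s may contain surrogate-escaped binary data.

    Hypothetical stand-in for the helper in the hunk above; the except
    branch is assumed because the hunk is cut off after `return False`.
    """
    try:
        s.encode()  # strict UTF-8 rejects every surrogate code point
        return False
    except UnicodeEncodeError:
        return True


def round_trips_as_bytes(s):
    """Return True if s can be turned back into bytes via surrogateescape."""
    try:
        s.encode("utf-8", "surrogateescape")
        return True
    except UnicodeEncodeError:
        return False


# A non-UTF-8 byte decoded with surrogateescape becomes U+DCE9.
escaped = b"caf\xe9".decode("utf-8", "surrogateescape")

# A lone high surrogate that was never produced by surrogateescape.
lone = "\ud83d"

# Both strings fail the strict encode, so the check flags both.
print(has_surrogates_sketch(escaped), has_surrogates_sketch(lone))

# Only the first one can actually be restored to its original bytes.
print(round_trips_as_bytes(escaped), round_trips_as_bytes(lone))
```

Under these assumptions, a caller that treats a positive result as "this is definitely surrogate-escaped bytes" and re-encodes with surrogateescape would raise on the second string, which is the kind of input the commit title describes.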