Issue #3665: \u and \U escapes are now supported in unicode regular expressions.

Patch by Serhiy Storchaka.
Antoine Pitrou 2012-06-23 13:29:19 +02:00
parent c9aa8425c4
commit 463badf06c
4 changed files with 144 additions and 34 deletions


@@ -414,17 +414,24 @@ Most of the standard escapes supported by Python string literals are also
accepted by the regular expression parser::
    \a \b \f \n
-   \r \t \v \x
-   \\
+   \r \t \u \U
+   \v \x \\
(Note that ``\b`` is used to represent word boundaries, and means "backspace"
only inside character classes.)
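The two meanings of ``\b`` noted above can be checked with a minimal sketch. The sample strings and variable names are illustrative assumptions, not part of the patch:

```python
import re

# Outside a character class, \b is a zero-width word-boundary assertion.
whole_word = re.search(r"\bcat\b", "a cat sat")
inside_word = re.search(r"\bcat\b", "concatenate")  # no boundary around "cat"

# Inside a character class, \b stands for the backspace character (U+0008).
backspace = re.search(r"[\b]", "abc\x08def")

print(whole_word is not None, inside_word is None, backspace is not None)
```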
+``'\u'`` and ``'\U'`` escape sequences are only recognized in Unicode
+patterns. In bytes patterns they are not treated specially.
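A small sketch of the difference between Unicode and bytes patterns follows. The helper ``bytes_pattern_behaviour`` is a hypothetical name introduced here. The bytes case is deliberately not pinned to one outcome: this change leaves ``\u`` unspecial in bytes patterns, while later interpreters may reject unknown letter escapes outright.

```python
import re

# Raw strings keep the backslash, so the regex parser (not the string
# literal parser) is what decodes \u and \U here.
e_acute = re.match(r"\u00e9$", "\xe9")             # U+00E9
grinning = re.match(r"\U0001F600$", chr(0x1F600))  # U+1F600, 8 hex digits

def bytes_pattern_behaviour(pattern):
    """Hypothetical helper: report whether a bytes pattern compiles."""
    try:
        re.compile(pattern)
    except re.error:
        return "rejected"
    return "compiled"

# Either way, \u is never decoded as a code point in a bytes pattern.
bytes_result = bytes_pattern_behaviour(rb"\u00e9")
print(e_acute is not None, grinning is not None, bytes_result)
```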
Octal escapes are included in a limited form. If the first digit is a 0, or if
there are three octal digits, it is considered an octal escape. Otherwise, it is
a group reference. As for string literals, octal escapes are always at most
three digits in length.
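The octal-versus-group-reference rule can be illustrated with a short sketch. The patterns and variable names below are assumptions chosen for the example:

```python
import re

# Leading 0: always an octal escape, here the character U+0001.
leading_zero = re.match(r"\01$", "\x01")

# Three octal digits: an octal escape; 0o101 == 65 == ord("A").
three_digits = re.match(r"\101$", "A")

# Otherwise the digits refer to a group: \1 repeats what group 1 matched.
group_ref = re.match(r"(a)\1$", "aa")
not_octal = re.match(r"(a)\1$", "a\x01")  # \1 is not the character U+0001

print(leading_zero is not None, three_digits is not None,
      group_ref is not None, not_octal is None)
```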
+.. versionchanged:: 3.3
+   The ``'\u'`` and ``'\U'`` escape sequences have been added.
.. _contents-of-module-re: