mirror of
https://github.com/python/cpython.git
synced 2025-08-04 17:08:35 +00:00
gh-91760: More strict rules for numerical group references and group names in RE (GH-91792)
Only a sequence of ASCII digits is now accepted as a numerical reference. Group names in bytes patterns and replacement strings can now contain only ASCII letters, digits, and underscores.
parent 7b024e3a3f
commit a84a56d80f
5 changed files with 62 additions and 91 deletions
@@ -395,7 +395,8 @@ The special characters are:
 ``(?P<name>...)``
 Similar to regular parentheses, but the substring matched by the group is
 accessible via the symbolic group name *name*. Group names must be valid
-Python identifiers, and each group name must be defined only once within a
+Python identifiers, and in bytes patterns they must contain only characters
+in the ASCII range. Each group name must be defined only once within a
 regular expression. A symbolic group is also a numbered group, just as if
 the group were not named.

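As an illustration of the paragraph in this hunk, here is a minimal sketch of a bytes pattern whose group names use only ASCII characters. The pattern, the sample input, and the variable names are made up for this example and do not come from the commit.

```python
import re

# Hypothetical pattern: both group names are ASCII-only identifiers,
# which is what the new wording requires for bytes patterns.
m = re.search(rb"(?P<year>\d{4})-(?P<month>\d{2})", b"released 2022-04")

year = m.group("year")    # access by symbolic name
month = m["month"]        # Match objects also support subscripting
first = m.group(1)        # a named group is also a numbered group
```

The last line shows the closing sentence of the paragraph: the named group `year` can still be reached as group 1.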
@@ -417,8 +418,9 @@ The special characters are:
 | | * ``\1`` |
 +---------------------------------------+----------------------------------+

-.. deprecated:: 3.11
-Group names containing non-ASCII characters in bytes patterns.
+.. versionchanged:: 3.12
+In bytes patterns group names must contain only characters in
+the ASCII range.

 .. index:: single: (?P=; in regular expressions

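The effect of this note depends on the interpreter version, so the sketch below only probes the behaviour instead of asserting it. The helper `accepts_cleanly` is a hypothetical name introduced here; it treats both an `re.error` and a `DeprecationWarning` as a rejection, on the assumption that 3.11 warns and 3.12 raises, as the two directives in this hunk describe.

```python
import re
import warnings


def accepts_cleanly(pattern):
    """Return True if `pattern` compiles with no error and no DeprecationWarning."""
    with warnings.catch_warnings():
        # Turn the deprecation warning into an exception so it can be caught.
        warnings.simplefilter("error", DeprecationWarning)
        try:
            re.compile(pattern)
        except (re.error, DeprecationWarning):
            return False
    return True


# ASCII-only group name in a bytes pattern.
ascii_ok = accepts_cleanly(rb"(?P<word>\w+)")

# Group name containing the non-ASCII byte 0xB5 in a bytes pattern.
# The result is version-dependent, so it is computed but not assumed here.
non_ascii_ok = accepts_cleanly(b"(?P<\xb5>\\w+)")
```

Running this on different interpreters is one way to see where the deprecation turns into an error.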
@@ -489,8 +491,8 @@ The special characters are:
 will match with ``'<user@host.com>'`` as well as ``'user@host.com'``, but
 not with ``'<user@host.com'`` nor ``'user@host.com>'``.

-.. deprecated:: 3.11
-Group *id* containing anything except ASCII digits.
+.. versionchanged:: 3.12
+Group *id* can only contain ASCII digits.


 The special sequences consist of ``'\'`` and a character from the list below.
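The context lines of this hunk list four inputs for a conditional pattern that refers to a group by an ASCII-digit *id*. The pattern itself is not visible in the hunk, so the one below is an assumption: it is the email example from the `re` documentation, and the rest is a small sketch that checks the four inputs.

```python
import re

# Group 1 is the optional "<". The conditional (?(1)>|$) requires a closing
# ">" only when group 1 took part in the match, otherwise end of string.
pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

candidates = [
    "<user@host.com>",
    "user@host.com",
    "<user@host.com",
    "user@host.com>",
]

# Map each candidate to whether the pattern matches at the start of it.
results = {text: pattern.match(text) is not None for text in candidates}
```

Here the *id* is the plain ASCII digit `1`, the only form the changed note still allows.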
@@ -1001,9 +1003,10 @@ form.
 Empty matches for the pattern are replaced when adjacent to a previous
 non-empty match.

-.. deprecated:: 3.11
-Group *id* containing anything except ASCII digits.
-Group names containing non-ASCII characters in bytes replacement strings.
+.. versionchanged:: 3.12
+Group *id* can only contain ASCII digits.
+In bytes replacement strings group names must contain only characters
+in the ASCII range.


 .. function:: subn(pattern, repl, string, count=0, flags=0)
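For replacement strings, the same rules apply to `\g<id>` and `\g<name>` references. The following is a minimal sketch with ASCII-only ids and names; the patterns and inputs are invented for illustration.

```python
import re

pattern = re.compile(r"(?P<key>\w+)=(?P<value>\w+)")

# Reference groups by name in the replacement string.
swapped_by_name = pattern.sub(r"\g<value>=\g<key>", "a=1 b=2")

# Reference the same groups by numeric id, written with ASCII digits.
swapped_by_id = pattern.sub(r"\g<2>=\g<1>", "a=1 b=2")

# Bytes pattern and bytes replacement string with an ASCII group name.
wrapped = re.sub(rb"(?P<word>\w+)", rb"<\g<word>>", b"hi there")
```

Both `str` calls produce the same result, since a named group is also reachable by its number.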