mirror of
https://github.com/python/cpython.git
synced 2025-08-04 00:48:58 +00:00
bpo-47081: Replace "qualifiers" with "quantifiers" in the re module documentation (GH-32028)
It is a more commonly used term.
parent
4f97d64c83
commit
c6cd3cc93c
5 changed files with 21 additions and 21 deletions
|
@ -230,13 +230,13 @@ while ``+`` requires at least *one* occurrence. To use a similar example,
|
|||
``ca+t`` will match ``'cat'`` (1 ``'a'``), ``'caaat'`` (3 ``'a'``\ s), but won't
|
||||
match ``'ct'``.
|
||||
|
||||
-There are two more repeating qualifiers. The question mark character, ``?``,
+There are two more repeating operators or quantifiers. The question mark character, ``?``,
|
||||
matches either once or zero times; you can think of it as marking something as
|
||||
being optional. For example, ``home-?brew`` matches either ``'homebrew'`` or
|
||||
``'home-brew'``.
|
||||
|
||||
-The most complicated repeated qualifier is ``{m,n}``, where *m* and *n* are
-decimal integers. This qualifier means there must be at least *m* repetitions,
+The most complicated quantifier is ``{m,n}``, where *m* and *n* are
+decimal integers. This quantifier means there must be at least *m* repetitions,
|
||||
and at most *n*. For example, ``a/{1,3}b`` will match ``'a/b'``, ``'a//b'``, and
|
||||
``'a///b'``. It won't match ``'ab'``, which has no slashes, or ``'a////b'``, which
|
||||
has four.
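
The two examples in this hunk can be checked directly. The following is a minimal sketch, not part of the commit; the helper name `fully_matches` and the extra input `'home--brew'` are assumptions added for illustration.

```python
import re

# Patterns taken from the documentation text in this hunk.
OPTIONAL_HYPHEN = re.compile(r"home-?brew")
ONE_TO_THREE_SLASHES = re.compile(r"a/{1,3}b")


def fully_matches(pattern, text):
    """Return True if the whole string matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


# ``?`` makes the hyphen optional: zero or one occurrence.
hyphen_results = {
    s: fully_matches(OPTIONAL_HYPHEN, s)
    for s in ("homebrew", "home-brew", "home--brew")
}

# ``{1,3}`` requires at least one and at most three slashes.
slash_results = {
    s: fully_matches(ONE_TO_THREE_SLASHES, s)
    for s in ("ab", "a/b", "a//b", "a///b", "a////b")
}

print(hyphen_results)
print(slash_results)
```

`fullmatch` is used so that a longer run of slashes cannot succeed by matching only part of the string.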
|
||||
|
@ -245,7 +245,7 @@ You can omit either *m* or *n*; in that case, a reasonable value is assumed for
|
|||
the missing value. Omitting *m* is interpreted as a lower limit of 0, while
|
||||
omitting *n* results in an upper bound of infinity.
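
As a quick illustration of the omitted bounds described here, this sketch (the sample strings are assumptions chosen for the example) shows that ``{,2}`` accepts zero repetitions and that ``{2,}`` has no upper limit.

```python
import re

# Omitting m: lower limit of 0, so the empty string is accepted.
at_most_two = re.compile(r"a{,2}")
# Omitting n: no upper bound on the number of repetitions.
at_least_two = re.compile(r"a{2,}")

empty_ok = at_most_two.fullmatch("") is not None
three_rejected = at_most_two.fullmatch("aaa") is None
one_rejected = at_least_two.fullmatch("a") is None
many_ok = at_least_two.fullmatch("a" * 50) is not None

print(empty_ok, three_rejected, one_rejected, many_ok)
```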
|
||||
|
||||
-Readers of a reductionist bent may notice that the three other qualifiers can
+Readers of a reductionist bent may notice that the three other quantifiers can
|
||||
all be expressed using this notation. ``{0,}`` is the same as ``*``, ``{1,}``
|
||||
is equivalent to ``+``, and ``{0,1}`` is the same as ``?``. It's better to use
|
||||
``*``, ``+``, or ``?`` when you can, simply because they're shorter and easier
|
||||
|
@ -803,7 +803,7 @@ which matches the header's value.
|
|||
Groups are marked by the ``'('``, ``')'`` metacharacters. ``'('`` and ``')'``
|
||||
have much the same meaning as they do in mathematical expressions; they group
|
||||
together the expressions contained inside them, and you can repeat the contents
|
||||
-of a group with a repeating qualifier, such as ``*``, ``+``, ``?``, or
+of a group with a quantifier, such as ``*``, ``+``, ``?``, or
|
||||
``{m,n}``. For example, ``(ab)*`` will match zero or more repetitions of
|
||||
``ab``. ::
|
||||
|
||||
|
@ -1326,7 +1326,7 @@ backtrack character by character until it finds a match for the ``>``. The
|
|||
final match extends from the ``'<'`` in ``'<html>'`` to the ``'>'`` in
|
||||
``'</title>'``, which isn't what you want.
|
||||
|
||||
-In this case, the solution is to use the non-greedy qualifiers ``*?``, ``+?``,
+In this case, the solution is to use the non-greedy quantifiers ``*?``, ``+?``,
|
||||
``??``, or ``{m,n}?``, which match as *little* text as possible. In the above
|
||||
example, the ``'>'`` is tried immediately after the first ``'<'`` matches, and
|
||||
when it fails, the engine advances a character at a time, retrying the ``'>'``
|
||||
|
|
|
@ -87,7 +87,7 @@ Some characters, like ``'|'`` or ``'('``, are special. Special
|
|||
characters either stand for classes of ordinary characters, or affect
|
||||
how the regular expressions around them are interpreted.
|
||||
|
||||
-Repetition qualifiers (``*``, ``+``, ``?``, ``{m,n}``, etc) cannot be
+Repetition operators or quantifiers (``*``, ``+``, ``?``, ``{m,n}``, etc) cannot be
|
||||
directly nested. This avoids ambiguity with the non-greedy modifier suffix
|
||||
``?``, and with other modifiers in other implementations. To apply a second
|
||||
repetition to an inner repetition, parentheses may be used. For example,
|
||||
|
@ -146,10 +146,10 @@ The special characters are:
|
|||
single: ??; in regular expressions
|
||||
|
||||
``*?``, ``+?``, ``??``
|
||||
-The ``'*'``, ``'+'``, and ``'?'`` qualifiers are all :dfn:`greedy`; they match
+The ``'*'``, ``'+'``, and ``'?'`` quantifiers are all :dfn:`greedy`; they match
|
||||
as much text as possible. Sometimes this behaviour isn't desired; if the RE
|
||||
``<.*>`` is matched against ``'<a> b <c>'``, it will match the entire
|
||||
-string, and not just ``'<a>'``. Adding ``?`` after the qualifier makes it
+string, and not just ``'<a>'``. Adding ``?`` after the quantifier makes it
|
||||
perform the match in :dfn:`non-greedy` or :dfn:`minimal` fashion; as *few*
|
||||
characters as possible will be matched. Using the RE ``<.*?>`` will match
|
||||
only ``'<a>'``.
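
The greedy and non-greedy behaviour described in this entry can be reproduced with the same pattern and input string. This is a small sketch for illustration only; the variable names are assumptions.

```python
import re

text = "<a> b <c>"

# Greedy: ``.*`` consumes as much as possible, so the match runs to the last '>'.
greedy = re.match(r"<.*>", text).group()

# Non-greedy: ``.*?`` consumes as little as possible, stopping at the first '>'.
non_greedy = re.match(r"<.*?>", text).group()

print(repr(greedy))
print(repr(non_greedy))
```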
|
||||
|
@ -160,11 +160,11 @@ The special characters are:
|
|||
single: ?+; in regular expressions
|
||||
|
||||
``*+``, ``++``, ``?+``
|
||||
-Like the ``'*'``, ``'+'``, and ``'?'`` qualifiers, those where ``'+'`` is
+Like the ``'*'``, ``'+'``, and ``'?'`` quantifiers, those where ``'+'`` is
|
||||
appended also match as many times as possible.
|
||||
-However, unlike the true greedy qualifiers, these do not allow
+However, unlike the true greedy quantifiers, these do not allow
|
||||
back-tracking when the expression following it fails to match.
|
||||
-These are known as :dfn:`possessive` qualifiers.
+These are known as :dfn:`possessive` quantifiers.
|
||||
For example, ``a*a`` will match ``'aaaa'`` because the ``a*`` will match
|
||||
all 4 ``'a'``s, but, when the final ``'a'`` is encountered, the
|
||||
expression is backtracked so that in the end the ``a*`` ends up matching
|
||||
|
@ -198,7 +198,7 @@ The special characters are:
|
|||
``{m,n}?``
|
||||
Causes the resulting RE to match from *m* to *n* repetitions of the preceding
|
||||
RE, attempting to match as *few* repetitions as possible. This is the
|
||||
-non-greedy version of the previous qualifier. For example, on the
+non-greedy version of the previous quantifier. For example, on the
|
||||
6-character string ``'aaaaaa'``, ``a{3,5}`` will match 5 ``'a'`` characters,
|
||||
while ``a{3,5}?`` will only match 3 characters.
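
The ``'aaaaaa'`` example from this entry, written out as a short sketch (variable names are assumptions for illustration):

```python
import re

text = "aaaaaa"  # the 6-character string from the documentation

# Greedy bounded repetition takes as many as allowed: 5 characters.
greedy = re.match(r"a{3,5}", text).group()

# The non-greedy form stops as soon as the minimum is satisfied: 3 characters.
non_greedy = re.match(r"a{3,5}?", text).group()

print(len(greedy), len(non_greedy))
```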
|
||||
|
||||
|
@ -206,7 +206,7 @@ The special characters are:
|
|||
Causes the resulting RE to match from *m* to *n* repetitions of the
|
||||
preceding RE, attempting to match as many repetitions as possible
|
||||
*without* establishing any backtracking points.
|
||||
-This is the possessive version of the qualifier above.
+This is the possessive version of the quantifier above.
|
||||
For example, on the 6-character string ``'aaaaaa'``, ``a{3,5}+aa``
|
||||
attempt to match 5 ``'a'`` characters, then, requiring 2 more ``'a'``s,
|
||||
will need more characters than available and thus fail, while
|
||||
|
|
|
@ -298,7 +298,7 @@ os
|
|||
re
|
||||
--
|
||||
|
||||
-* Atomic grouping (``(?>...)``) and possessive qualifiers (``*+``, ``++``,
+* Atomic grouping (``(?>...)``) and possessive quantifiers (``*+``, ``++``,
|
||||
``?+``, ``{m,n}+``) are now supported in regular expressions.
|
||||
(Contributed by Jeffrey C. Jacobs and Serhiy Storchaka in :issue:`433030`.)
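
To see what the new syntax changes in practice, the sketch below compares a greedy pattern with its possessive and atomic counterparts on ``'aaaa'``. It is an illustration under assumptions, not code from the commit: the helper names `supports_possessive` and `demo` are hypothetical, and the feature probe exists only because older interpreters reject ``a*+`` with `re.error`.

```python
import re


def supports_possessive():
    """Probe whether this interpreter accepts possessive quantifier syntax."""
    try:
        re.compile(r"a*+")
    except re.error:
        return False
    return True


def demo():
    """Compare greedy, possessive, and atomic matching on 'aaaa'."""
    # Greedy: ``a*`` takes all four, then backtracks one so the final ``a`` matches.
    greedy = re.match(r"a*a", "aaaa")
    # Possessive: ``a*+`` takes all four and never gives any back, so ``a`` fails.
    possessive = re.match(r"a*+a", "aaaa")
    # Atomic group: same effect as the possessive form.
    atomic = re.match(r"(?>a*)a", "aaaa")
    return greedy.group(), possessive, atomic


if supports_possessive():
    print(demo())
else:
    print("possessive quantifiers are not supported by this interpreter")
```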
|
||||
|
||||
|
|
|
@ -2038,9 +2038,9 @@ class ReTests(unittest.TestCase):
|
|||
with self.assertRaisesRegex(TypeError, "got 'type'"):
|
||||
re.search("x*", type)
|
||||
|
||||
-def test_possessive_qualifiers(self):
-"""Test Possessive Qualifiers
-Test qualifiers of the form @+ for some repetition operator @,
+def test_possessive_quantifiers(self):
+"""Test Possessive Quantifiers
+Test quantifiers of the form @+ for some repetition operator @,
|
||||
e.g. x{3,5}+ meaning match from 3 to 5 greadily and proceed
|
||||
without creating a stack frame for rolling the stack back and
|
||||
trying 1 or more fewer matches."""
|
||||
|
@ -2077,7 +2077,7 @@ class ReTests(unittest.TestCase):
|
|||
self.assertIsNone(re.match("^x{}+$", "xxx"))
|
||||
self.assertTrue(re.match("^x{}+$", "x{}"))
|
||||
|
||||
-def test_fullmatch_possessive_qualifiers(self):
+def test_fullmatch_possessive_quantifiers(self):
|
||||
self.assertTrue(re.fullmatch(r'a++', 'a'))
|
||||
self.assertTrue(re.fullmatch(r'a*+', 'a'))
|
||||
self.assertTrue(re.fullmatch(r'a?+', 'a'))
|
||||
|
@ -2096,7 +2096,7 @@ class ReTests(unittest.TestCase):
|
|||
self.assertIsNone(re.fullmatch(r'(?:ab)?+', 'abc'))
|
||||
self.assertIsNone(re.fullmatch(r'(?:ab){1,3}+', 'abc'))
|
||||
|
||||
-def test_findall_possessive_qualifiers(self):
+def test_findall_possessive_quantifiers(self):
|
||||
self.assertEqual(re.findall(r'a++', 'aab'), ['aa'])
|
||||
self.assertEqual(re.findall(r'a*+', 'aab'), ['aa', '', ''])
|
||||
self.assertEqual(re.findall(r'a?+', 'aab'), ['a', 'a', '', ''])
|
||||
|
|
|
@ -1,2 +1,2 @@
|
|||
-Add support of atomic grouping (``(?>...)``) and possessive qualifiers
+Add support of atomic grouping (``(?>...)``) and possessive quantifiers
|
||||
(``*+``, ``++``, ``?+``, ``{m,n}+``) in :mod:`regular expressions <re>`.
|
||||
|
|