[3.13] gh-140797: Forbid capturing groups in re.Scanner lexicon patterns (GH-140944) (GH-140983)

(cherry picked from commit fa9c3eefd4)

Co-authored-by: Abhishek Tiwari <Abhi210@users.noreply.github.com>
Miss Islington (bot) 2025-11-04 12:17:29 +01:00 committed by GitHub
parent e7507967f8
commit ee894d2abb
GPG key ID: B5690EEEBB952194
3 changed files with 24 additions and 1 deletion

@@ -399,9 +399,12 @@ class Scanner:
         s = _parser.State()
         s.flags = flags
         for phrase, action in lexicon:
+            sub_pattern = _parser.parse(phrase, flags)
+            if sub_pattern.state.groups != 1:
+                raise ValueError("Cannot use capturing groups in re.Scanner")
             gid = s.opengroup()
             p.append(_parser.SubPattern(s, [
-                (SUBPATTERN, (gid, 0, 0, _parser.parse(phrase, flags))),
+                (SUBPATTERN, (gid, 0, 0, sub_pattern)),
                 ]))
             s.closegroup(gid, p[-1])
         p = _parser.SubPattern(s, [(BRANCH, (None, p))])
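
Note on the `!= 1` comparison in the new check: the internal parser state counts the implicit group 0, so a phrase with no capturing groups reports `groups == 1`. A rough equivalent using only the public API is sketched below. `Pattern.groups` excludes group 0, so the test becomes `> 0`. The helper name `has_capturing_groups` is hypothetical and not part of this change.

```python
import re

def has_capturing_groups(phrase, flags=0):
    """Return True if the pattern defines any capturing group."""
    # Pattern.groups counts capturing groups only; group 0 is not included.
    return re.compile(phrase, flags).groups > 0

# The same three cases the new test exercises.
for phrase in ("(a)b", "(?P<name>a)", "(?:a)b"):
    print(phrase, has_capturing_groups(phrase))
```

A check like this could be used to audit an existing lexicon before upgrading, since it does not depend on whether the running interpreter already contains the fix.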

@@ -1638,6 +1638,24 @@ class ReTests(unittest.TestCase):
                          (['sum', 'op=', 3, 'op*', 'foo', 'op+', 312.5,
                            'op+', 'bar'], ''))
 
+    def test_bug_gh140797(self):
+        # gh140797: Capturing groups are not allowed in re.Scanner
+
+        msg = r"Cannot use capturing groups in re\.Scanner"
+        # Capturing group throws an error
+        with self.assertRaisesRegex(ValueError, msg):
+            Scanner([("(a)b", None)])
+
+        # Named Group
+        with self.assertRaisesRegex(ValueError, msg):
+            Scanner([("(?P<name>a)", None)])
+
+        # Non-capturing groups should pass normally
+        s = Scanner([("(?:a)b", lambda scanner, token: token)])
+        result, rem = s.scan("ab")
+        self.assertEqual(result,['ab'])
+        self.assertEqual(rem,'')
+
     def test_bug_448951(self):
         # bug 448951 (similar to 429357, but with single char match)
         # (Also test greedy matches.)

@@ -0,0 +1,2 @@
+The undocumented :class:`!re.Scanner` class now forbids regular expressions containing capturing groups in its lexicon patterns. Patterns using capturing groups could
+previously lead to crashes with segmentation fault. Use non-capturing groups (?:...) instead.
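
For callers affected by this change, the migration suggested in the NEWS entry might look like the sketch below. The lexicon is illustrative and not taken from the commit; it uses only non-capturing groups, so it should behave the same with or without the fix.

```python
import re

# A phrase such as r"(\d+)(\.\d+)?" would now be rejected with ValueError.
# The same grammar, grouped without capturing:
scanner = re.Scanner([
    (r"\d+(?:\.\d+)?", lambda s, tok: float(tok)),  # integer or decimal
    (r"(?:\+|-)", lambda s, tok: tok),              # operators
    (r"\s+", None),                                 # skip whitespace
])

tokens, remainder = scanner.scan("1.5 + 2 - 3")
print(tokens, remainder)
```

If an action needs a sub-match, one option is to run a second, separate `re` match on the token text inside the action rather than capturing in the lexicon phrase.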