gh-98740: Fix validation of conditional expressions in RE (GH-98764)

In very rare circumstances, the argument of an opcode in the "then" part
could be mistaken for the JUMP opcode when that part does not actually end
with JUMP. This led to incorrect detection of the final JUMP opcode
and incorrect calculation of the size of the subexpression.
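
The problem can be probed from Python with a loop modelled on the test
added by this commit. This is a minimal sketch, not part of the change
itself: the helper name compile_all_conditionals is illustrative, and
catching RuntimeError is an assumption about how a validation failure
surfaces on an unfixed interpreter.

```python
import re


def compile_all_conditionals():
    """Compile '()(?(1)\\xNN?)' for every byte value NN.

    Returns a list of (value, exception) pairs for patterns that
    failed to compile. An empty list means every pattern compiled.
    """
    failures = []
    for i in range(256):
        # Same pattern shape as the regression test: an empty group,
        # then a conditional whose "then" part is one optional character.
        pattern = r'()(?(1)\x%02x?)' % i
        try:
            re.compile(pattern)
        except (re.error, RuntimeError) as exc:
            failures.append((i, exc))
    return failures


if __name__ == "__main__":
    print(compile_all_conditionals())
```

The regression test in this commit expects every one of these patterns to
compile, so with the fix applied the list should be empty.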

NOTE: The return values of the functions _validate_inner() and
_validate_charset() in Modules/_sre/sre.c have changed.  They now return 0
on success, -1 on failure, and 1 if the last op is JUMP (which is usually a
failure).  Previously they returned 1 on success and 0 on failure.
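
The new convention can be pictured with a small Python model. It only
illustrates the three return values described in the note. The names
VALID, INVALID, ENDS_WITH_JUMP and check_block are hypothetical and do
not mirror the C code in Modules/_sre/sre.c.

```python
VALID = 0            # success
INVALID = -1         # failure
ENDS_WITH_JUMP = 1   # the last op is JUMP


def check_block(result, jump_allowed=False):
    """Interpret a tri-state validator result.

    A trailing JUMP is accepted only where the caller says one is
    allowed; everywhere else it counts as a failure.
    """
    if result == VALID:
        return True
    if result == ENDS_WITH_JUMP:
        return jump_allowed
    return False
```

Under the old convention success was 1 and failure was 0, so a caller
written for the old values would read the new success value 0 as a
failure. Callers have to be updated together with the functions.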
Serhiy Storchaka 2022-11-03 09:23:46 +02:00 committed by GitHub
parent 41bc101dd6
commit e9ac890c02
4 changed files with 40 additions and 27 deletions

@@ -630,6 +630,11 @@ class ReTests(unittest.TestCase):
         self.checkPatternError(r'()(?(2)a)',
                                "invalid group reference 2", 5)
 
+    def test_re_groupref_exists_validation_bug(self):
+        for i in range(256):
+            with self.subTest(code=i):
+                re.compile(r'()(?(1)\x%02x?)' % i)
+
     def test_re_groupref_overflow(self):
         from re._constants import MAXGROUPS
         self.checkTemplateError('()', r'\g<%s>' % MAXGROUPS, 'xx',