Issue #22818: Splitting on a pattern that could match an empty string now
raises a warning.  Patterns that can only match empty strings are now
rejected.
Serhiy Storchaka 2015-02-03 11:04:19 +02:00
parent 32ca3dcb97
commit 83e802796c
6 changed files with 85 additions and 20 deletions

@ -626,17 +626,37 @@ form.
That way, separator components are always found at the same relative
indices within the result list.
Note that *split* will never split a string on an empty pattern match.
For example:
>>> re.split('x*', 'foo')
['foo']
>>> re.split("(?m)^$", "foo\n\nbar\n")
['foo\n\nbar\n']
.. note::
:func:`split` doesn't currently split a string on an empty pattern match.
For example:
>>> re.split('x*', 'axbc')
['a', 'bc']
Even though ``'x*'`` also matches zero ``'x'`` characters before ``'a'``,
between ``'b'`` and ``'c'``, and after ``'c'``, these matches are currently
ignored. The correct behavior (i.e. splitting on empty matches too and
returning ``['', 'a', 'b', 'c', '']``) will be implemented in future
versions of Python, but since this is a backward-incompatible change, a
:exc:`FutureWarning` will be raised in the meantime.
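The "ignore empty matches" rule described above can be approximated in a few lines, which may help when the same result is wanted regardless of interpreter version. This is a minimal sketch, not the actual implementation of :func:`split`: the helper name ``split_ignoring_empty`` is hypothetical, and it handles neither capturing groups nor *maxsplit*.

```python
import re


def split_ignoring_empty(pattern, string, flags=0):
    """Split *string* on non-empty matches of *pattern* only.

    Hypothetical helper for illustration; it is not part of the re module.
    """
    pieces = []
    last = 0
    for match in re.finditer(pattern, string, flags):
        if match.start() == match.end():
            # Zero-width match: skip it, as the note above describes.
            continue
        pieces.append(string[last:match.start()])
        last = match.end()
    pieces.append(string[last:])
    return pieces


print(split_ignoring_empty('x*', 'axbc'))              # ['a', 'bc']
print(split_ignoring_empty('(?m)^$', 'foo\n\nbar\n'))  # ['foo\n\nbar\n']
```

Because only the non-empty match of ``'x'`` is used as a separator, the first call returns ``['a', 'bc']``, and a pattern that can only match empty strings leaves the input in one piece.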
Patterns that can only match empty strings currently never split the
string. Since this doesn't match the expected behavior, a
:exc:`ValueError` will be raised starting from Python 3.5::
>>> re.split("^$", "foo\n\nbar\n", flags=re.M)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  ...
ValueError: split() requires a non-empty pattern match.
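Code that must tolerate this rejection could wrap the call, as in the sketch below. The wrapper name ``split_or_whole`` is hypothetical, and returning the whole string as the fallback is an assumption chosen to mirror the earlier "never split" behavior. Whether the :exc:`ValueError` is actually raised depends on the interpreter version, so the example only checks a property that holds either way.

```python
import re


def split_or_whole(pattern, string, flags=0):
    """Return re.split() output, or [string] if the pattern is rejected.

    Hypothetical wrapper for illustration; it is not part of the re module.
    """
    try:
        return re.split(pattern, string, flags=flags)
    except ValueError:
        # Raised by interpreters that reject patterns which can only
        # match empty strings; fall back to the unsplit input.
        return [string]


text = "foo\n\nbar\n"
parts = split_or_whole("^$", text, flags=re.M)
# The pattern is zero-width, so the pieces always rejoin to the original text.
print("".join(parts) == text)
```

Note that this also swallows any other :exc:`ValueError` raised by the call, so a narrower check may be preferable in production code.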
.. versionchanged:: 3.1
Added the optional flags argument.
.. versionchanged:: 3.5
Splitting on a pattern that could match an empty string now raises
a warning. Patterns that can only match empty strings are now rejected.
.. function:: findall(pattern, string, flags=0)