mirror of https://github.com/python/cpython.git (synced 2025-10-01 12:52:18 +00:00)
[3.6] bpo-32614: Modify re examples to use a raw string to prevent wa… …rning (GH-5265) (GH-5500)
Modify RE examples in documentation to use raw strings to prevent DeprecationWarning.
Add text to REGEX HOWTO to highlight the deprecation. Approved by Serhiy Storchaka.
(cherry picked from commit 66771422d0)
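For illustration only (not part of the commit): a minimal sketch of the warning the commit message refers to. It compiles two one-line sources, one with a regular string literal and one with a raw string literal, and records which warning categories are raised. The helper name `escape_warnings` is hypothetical. The check accepts `SyntaxWarning` as well as `DeprecationWarning`, because interpreters from Python 3.12 onward report the invalid escape under that category.

```python
import warnings


def escape_warnings(source):
    """Compile `source` and return the categories of any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record warnings that are hidden by default
        compile(source, "<example>", "exec")
    return [w.category for w in caught]


# The doubled backslashes below are only needed to embed the sources in this
# script; the compiled source text is shown in each comment.
cooked_source = "pattern = '\\d+'"   # source text: pattern = '\d+'
raw_source = "pattern = r'\\d+'"     # source text: pattern = r'\d+'

cooked_categories = escape_warnings(cooked_source)
raw_categories = escape_warnings(raw_source)

print("regular literal:", [c.__name__ for c in cooked_categories])
print("raw literal:    ", [c.__name__ for c in raw_categories])
```

The raw literal compiles without any warning, which is why the examples in this change gain an `r` prefix.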
parent f61951b10c
commit fbf8e823c0
4 changed files with 26 additions and 8 deletions
@@ -289,6 +289,8 @@ Putting REs in strings keeps the Python language simpler, but has one
 disadvantage which is the topic of the next section.


+.. _the-backslash-plague:
+
 The Backslash Plague
 --------------------

@@ -327,6 +329,13 @@ backslashes are not handled in any special way in a string literal prefixed with
 while ``"\n"`` is a one-character string containing a newline. Regular
 expressions will often be written in Python code using this raw string notation.

+In addition, special escape sequences that are valid in regular expressions,
+but not valid as Python string literals, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`,
+which means the sequences will be invalid if raw string notation or escaping
+the backslashes isn't used.
+
+
 +-------------------+------------------+
 | Regular String    | Raw string       |
 +===================+==================+
@@ -457,12 +466,18 @@ In actual programs, the most common style is to store the
 Two pattern methods return all of the matches for a pattern.
 :meth:`~re.pattern.findall` returns a list of matching strings::

->>> p = re.compile('\d+')
+>>> p = re.compile(r'\d+')
 >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping')
 ['12', '11', '10']

-:meth:`~re.pattern.findall` has to create the entire list before it can be returned as the
-result. The :meth:`~re.pattern.finditer` method returns a sequence of
+The ``r`` prefix, making the literal a raw string literal, is needed in this
+example because escape sequences in a normal "cooked" string literal that are
+not recognized by Python, as opposed to regular expressions, now result in a
+:exc:`DeprecationWarning` and will eventually become a :exc:`SyntaxError`. See
+:ref:`the-backslash-plague`.
+
+:meth:`~re.Pattern.findall` has to create the entire list before it can be returned as the
+result. The :meth:`~re.Pattern.finditer` method returns a sequence of
 :ref:`match object <match-objects>` instances as an :term:`iterator`::

 >>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...')
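For illustration only (not part of the commit): a short sketch of the two spellings the added paragraph alludes to. A raw literal and a regular literal with a doubled backslash produce the same three-character pattern string, so either avoids the warning. The variable names are arbitrary.

```python
import re

raw = r'\d+'       # raw literal: the backslash is kept exactly as written
escaped = '\\d+'   # regular literal: the backslash is escaped by doubling it

# Both literals evaluate to the same pattern: backslash, 'd', '+'.
same_pattern = (raw == escaped)

numbers = re.findall(raw, '12 drummers drumming, 11 pipers piping, 10 lords a-leaping')

print(same_pattern, len(raw), numbers)
```

The raw form is usually preferred for regular expressions because the pattern reads the same in the source as it does to the `re` module.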
@@ -1096,11 +1111,11 @@ following calls::
 The module-level function :func:`re.split` adds the RE to be used as the first
 argument, but is otherwise the same. ::

->>> re.split('[\W]+', 'Words, words, words.')
+>>> re.split(r'[\W]+', 'Words, words, words.')
 ['Words', 'words', 'words', '']
->>> re.split('([\W]+)', 'Words, words, words.')
+>>> re.split(r'([\W]+)', 'Words, words, words.')
 ['Words', ', ', 'words', ', ', 'words', '.', '']
->>> re.split('[\W]+', 'Words, words, words.', 1)
+>>> re.split(r'[\W]+', 'Words, words, words.', 1)
 ['Words', 'words, words.']


@@ -463,7 +463,7 @@ The string in this example has the number 57 written in both Thai and
 Arabic numerals::

 import re
-p = re.compile('\d+')
+p = re.compile(r'\d+')

 s = "Over \u0e55\u0e57 57 flavours"
 m = p.search(s)
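For illustration only (not part of the commit): the hunk above stops at `m = p.search(s)`, so this sketch runs the same pattern to completion. It assumes the default behaviour for `str` patterns, where `\d` matches any Unicode decimal digit, and contrasts it with the `re.ASCII` flag, which restricts `\d` to `[0-9]`. The variable names `unicode_match` and `ascii_match` are arbitrary.

```python
import re

s = "Over \u0e55\u0e57 57 flavours"

# Default (Unicode) matching for str patterns: \d also matches the Thai digits.
unicode_match = re.search(r'\d+', s)

# With re.ASCII, \d only matches the characters 0-9.
ascii_match = re.search(r'\d+', s, re.ASCII)

print(ascii(unicode_match.group()))  # the Thai digits, shown as \u escapes
print(ascii_match.group())           # the ASCII digits
```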
@@ -315,7 +315,7 @@ The special characters are:

 This example looks for a word following a hyphen:

->>> m = re.search('(?<=-)\w+', 'spam-egg')
+>>> m = re.search(r'(?<=-)\w+', 'spam-egg')
 >>> m.group(0)
 'egg'

@@ -0,0 +1,3 @@
+Modify RE examples in documentation to use raw strings to prevent
+:exc:`DeprecationWarning` and add text to REGEX HOWTO to highlight the
+deprecation.