Issue #18960: Fix bugs with Python source code encoding in the second line.

* The first line of Python script could be executed twice when the source
encoding (not equal to 'utf-8') was specified on the second line.

* Now the source encoding declaration on the second line isn't effective if
the first line contains anything except a comment.

* As a consequence, 'python -x' works now again with files with the source
encoding declarations specified on the second file, and can be used again
to make Python batch files on Windows.

* The tokenize module now ignore the source encoding declaration on the second
line if the first line contains anything except a comment.

* IDLE now ignores the source encoding declaration on the second line if the
first line contains anything except a comment.

* 2to3 and the findnocoding.py script now ignore the source encoding
declaration on the second line if the first line contains anything except
a comment.
This commit is contained in:
Serhiy Storchaka 2014-01-09 18:36:09 +02:00
parent 21e7d4cd5e
commit 768c16ce02
7 changed files with 87 additions and 5 deletions

View file

@ -33,6 +33,7 @@ except ImportError:
decl_re = re.compile(rb'^[ \t\f]*#.*coding[:=][ \t]*([-\w.]+)')
blank_re = re.compile(rb'^[ \t\f]*(?:[#\r\n]|$)')
def get_declaration(line):
match = decl_re.match(line)
@ -58,7 +59,8 @@ def needs_declaration(fullpath):
line1 = infile.readline()
line2 = infile.readline()
if get_declaration(line1) or get_declaration(line2):
if (get_declaration(line1) or
blank_re.match(line1) and get_declaration(line2)):
# the file does have an encoding declaration, so trust it
return False