closes bpo-31650: PEP 552 (Deterministic pycs) implementation (#4575)

Python now supports checking bytecode cache up-to-dateness with a hash of the
source contents rather than volatile source metadata. See the PEP for details.

While a fairly straightforward idea, quite a lot of code had to be modified due
to the pervasiveness of pyc implementation details in the codebase. Changes in
this commit include:

- The core changes to importlib to understand how to read, validate, and
  regenerate hash-based pycs.

- Support for generating hash-based pycs in py_compile and compileall.

- Modifications to our siphash implementation to support passing a custom
  key. We then expose it to importlib through _imp.

- Updates to all places in the interpreter, standard library, and tests that
  manually generate or parse pyc files to grok the new format.

- Support in the interpreter command line code for long options like
  --check-hash-based-pycs.

- Tests and documentation for all of the above.
This commit is contained in:
Benjamin Peterson 2017-12-09 10:26:52 -08:00 committed by GitHub
parent 28d8d14013
commit 42aa93b8ff
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
33 changed files with 3364 additions and 2505 deletions

View file

@ -27,7 +27,7 @@ byte-code cache files in the directory containing the source code.
Exception raised when an error occurs while attempting to compile the file.
.. function:: compile(file, cfile=None, dfile=None, doraise=False, optimize=-1)
.. function:: compile(file, cfile=None, dfile=None, doraise=False, optimize=-1, invalidation_mode=PycInvalidationMode.TIMESTAMP)
Compile a source file to byte-code and write out the byte-code cache file.
The source code is loaded from the file named *file*. The byte-code is
@ -53,6 +53,10 @@ byte-code cache files in the directory containing the source code.
:func:`compile` function. The default of ``-1`` selects the optimization
level of the current interpreter.
*invalidation_mode* should be a member of the :class:`PycInvalidationMode`
enum and controls how the generated ``.pyc`` files are invalidated at
runtime.
.. versionchanged:: 3.2
Changed default value of *cfile* to be :PEP:`3147`-compliant. Previous
default was *file* + ``'c'`` (``'o'`` if optimization was enabled).
@ -65,6 +69,41 @@ byte-code cache files in the directory containing the source code.
caveat that :exc:`FileExistsError` is raised if *cfile* is a symlink or
non-regular file.
.. versionchanged:: 3.7
The *invalidation_mode* parameter was added as specified in :pep:`552`.
.. class:: PycInvalidationMode
A enumeration of possible methods the interpreter can use to determine
whether a bytecode file is up to date with a source file. The ``.pyc`` file
indicates the desired invalidation mode in its header. See
:ref:`pyc-invalidation` for more information on how Python invalidates
``.pyc`` files at runtime.
.. versionadded:: 3.7
.. attribute:: TIMESTAMP
The ``.pyc`` file includes the timestamp and size of the source file,
which Python will compare against the metadata of the source file at
runtime to determine if the ``.pyc`` file needs to be regenerated.
.. attribute:: CHECKED_HASH
The ``.pyc`` file includes a hash of the source file content, which Python
will compare against the source at runtime to determine if the ``.pyc``
file needs to be regenerated.
.. attribute:: UNCHECKED_HASH
Like :attr:`CHECKED_HASH`, the ``.pyc`` file includes a hash of the source
file content. However, Python will at runtime assume the ``.pyc`` file is
up to date and not validate the ``.pyc`` against the source file at all.
This option is useful when the ``.pycs`` are kept up to date by some
system external to Python like a build system.
.. function:: main(args=None)