cpython/Python
Eric Snow 79cf20e48d
bpo-21736: Set __file__ on frozen stdlib modules. (gh-28656)
Currently frozen modules do not have __file__ set.  In their spec, origin is set to "frozen" and they are marked as not having a location.  (Similarly, for frozen packages __path__ is set to an empty list.)  However, for frozen stdlib modules we are able to extrapolate __file__ as long as we can determine the stdlib directory at runtime.  (We now do so since gh-28586.)  Having __file__ set is helpful for a number of reasons.  Likewise, having a non-empty __path__ means we can import submodules of a frozen package from the filesystem (e.g. we could partially freeze the encodings module).
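
As a rough illustration of the starting point described above, the sketch below asks FrozenImporter for the spec of a module that is always frozen and reports the fields mentioned in this message. The helper name describe_frozen is hypothetical and not part of this change. The values other than origin may differ between Python versions.

```python
import importlib.machinery


def describe_frozen(name):
    """Return a few spec fields for a frozen module, or None if not frozen.

    Hypothetical helper for illustration only; it is not part of CPython.
    """
    spec = importlib.machinery.FrozenImporter.find_spec(name)
    if spec is None:
        return None
    return {
        "name": spec.name,
        "origin": spec.origin,              # "frozen" for frozen modules
        "has_location": spec.has_location,  # whether origin is a real location
    }


# _frozen_importlib is the frozen copy of importlib._bootstrap.
info = describe_frozen("_frozen_importlib")
print(info)
```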

This change sets __file__ (and adds to __path__) for frozen stdlib modules.  It uses sys._stdlibdir (from gh-28586) and the frozen module alias information (from gh-28655).  All that work is done in FrozenImporter (in Lib/importlib/_bootstrap.py).  Also, if a frozen module is imported before importlib is bootstrapped (during interpreter initialization) then we fix up that module and its spec during the importlib bootstrapping step (i.e. importlib._bootstrap._setup()) to match what gets set by FrozenImporter, including setting the file info (if the stdlib dir is known).  To facilitate this, modules imported using PyImport_ImportFrozenModule() have __origname__ set using the frozen module alias info.  __origname__ is popped off during importlib bootstrap.

(To be clear, even with this change the new code to set __file__ during fixups in importlib._bootstrap._setup() doesn't actually get triggered yet.  This is because sys._stdlibdir hasn't been set yet in interpreter initialization at the point importlib is bootstrapped.  However, we do fix up such modules at that point to otherwise match the result of importing through FrozenImporter, just not the __file__ and __path__ parts.  Doing so will require changes in the order in which things happen during interpreter initialization.  That can be addressed separately.  Once it is, the file-related fixup code from this PR will kick in.)

Here are things this change does not do:

* set __file__ for non-stdlib modules (no way of knowing the parent dir)
* set __file__ if the stdlib dir is not known (nor assume the expense of finding it)
* relatedly, set __file__ if the stdlib is in a zip file
* verify that the filename set to __file__ actually exists (too expensive)
* update __path__ for frozen packages that alias a non-package (since there is no package dir)
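
The extrapolation described above, together with the exclusions in this list, can be sketched roughly as follows. This is a simplified illustration and not the code in FrozenImporter: the function name guess_frozen_location and its parameters are hypothetical, the stdlib directory is passed in explicitly, and alias handling is reduced to "the caller supplies the original module name".

```python
import os.path


def guess_frozen_location(stdlib_dir, origname, is_package):
    """Guess (__file__, __path__ entries) for a frozen stdlib module.

    Hypothetical helper for illustration only.  Mirrors the limits listed
    above: no result when the stdlib dir is unknown, and no check that the
    file actually exists.
    """
    if not stdlib_dir or not origname:
        # Stdlib dir unknown (or no usable name): leave __file__ unset.
        return None, []
    relpath = os.path.join(*origname.split("."))
    if is_package:
        pkgdir = os.path.join(stdlib_dir, relpath)
        # A package gets its directory added to __path__, so submodules
        # can still be imported from the filesystem.
        return os.path.join(pkgdir, "__init__.py"), [pkgdir]
    # A plain module (including an alias to a non-package) has no package dir.
    return os.path.join(stdlib_dir, relpath + ".py"), []


print(guess_frozen_location("/usr/lib/python3", "encodings", True))
print(guess_frozen_location(None, "os", False))
```

Passing the directory as an argument keeps the sketch independent of interpreter state, which also reflects the point made earlier that the value may not be available yet when importlib is bootstrapped.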

Other things this change skips, but we may do later:

* set __file__ on modules imported using PyImport_ImportFrozenModule()
* set co_filename when we unmarshal the frozen code object while importing the module (e.g. in FrozenImporter.exec_module()) -- this would allow tracebacks to show source lines
* implement FrozenImporter.get_filename() and FrozenImporter.get_source()

https://bugs.python.org/issue21736
2021-10-14 15:32:18 -06:00
clinic bpo-21736: Set __file__ on frozen stdlib modules. (gh-28656) 2021-10-14 15:32:18 -06:00
frozen_modules bpo-45020: Drop the frozen .h files from the repo. (gh-28392) 2021-09-16 14:20:52 -06:00
_warnings.c bpo-44590: Lazily allocate frame objects (GH-27077) 2021-07-26 11:22:16 +01:00
adaptive.md bpo-44854: Remove trailing whitespaces (GH-27689) 2021-08-09 21:32:54 +03:00
asdl.c bpo-43244: Remove ast.h, asdl.h, Python-ast.h headers (GH-24933) 2021-03-23 20:47:40 +01:00
ast.c bpo-43897: Reject "_" captures and top-level MatchStar in the AST validator (GH-27432) 2021-07-28 17:24:18 -07:00
ast_opt.c bpo-28307: Tests and fixes for optimization of C-style formatting (GH-26318) 2021-05-23 19:06:48 +03:00
ast_unparse.c bpo-43892: Make match patterns explicit in the AST (GH-25585) 2021-04-28 22:58:44 -07:00
bltinmodule.c pycore_pystate.h no longer redefines PyThreadState_GET() (GH-28921) 2021-10-13 14:09:13 +02:00
bootstrap_hash.c bpo-44611: Use BCryptGenRandom instead of CryptGenRandom on Windows (GH-27168) 2021-07-23 23:04:30 +09:00
ceval.c bpo-45367: Specialize BINARY_MULTIPLY (GH-28727) 2021-10-14 15:56:33 +01:00
ceval_gil.h bpo-43268: Pass interp rather than tstate to internal functions (GH-24580) 2021-02-19 15:10:45 +01:00
codecs.c bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895) 2021-10-12 08:38:19 +02:00
compile.c Fix typos in the Python directory (GH-28767) 2021-10-06 15:55:27 -07:00
condvar.h bpo-44740: Lowercase "internet" and "web" where appropriate. (#27378) 2021-07-27 00:11:55 +02:00
context.c bpo-45439: Move _PyObject_VectorcallTstate() to pycore_call.h (GH-28893) 2021-10-14 21:53:04 +02:00
dtoa.c bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914) 2021-10-13 19:25:53 +02:00
dup2.c
dynamic_annotations.c
dynload_hpux.c bpo-44959: Add fallback to extension modules with '.sl' suffix on HP-UX (GH-27857) 2021-09-08 14:43:00 +02:00
dynload_shlib.c bpo-43895: Remove an unnecessary cache of shared object handles (GH-25487) 2021-07-07 16:26:06 -07:00
dynload_stub.c
dynload_win.c bpo-36346: Make using the legacy Unicode C API optional (GH-21437) 2020-07-10 23:26:06 +03:00
errors.c bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914) 2021-10-13 19:25:53 +02:00
fileutils.c bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914) 2021-10-13 19:25:53 +02:00
formatter_unicode.c bpo-20524: adds better error message for .format() (GH-28310) 2021-09-24 11:18:04 -04:00
frame.c bpo-44990: Change layout of evaluation frames. "Layout B" (GH-27933) 2021-08-25 13:44:20 +01:00
frozen.c bpo-45020: Identify which frozen modules are actually aliases. (gh-28655) 2021-10-05 11:26:37 -06:00
frozenmain.c bpo-44131: Py_FrozenMain() uses PyConfig_SetBytesArgv() (GH-26201) 2021-05-20 12:08:05 +02:00
future.c bpo-38605: Revert making 'from __future__ import annotations' the default (GH-25490) 2021-04-21 12:41:19 +01:00
getargs.c bpo-20291: Fix MSVC warnings in getargs.c (GH-27211) 2021-07-17 14:09:18 +03:00
getcompiler.c closes bpo-43278: remove unnecessary leading '\n' from COMPILER when build with GCC/Clang (GH-24606) 2021-02-25 20:24:21 -08:00
getcopyright.c Bring Python into the new year. (GH-24036) 2021-01-02 00:37:23 +09:00
getopt.c bpo-40527: Fix command line argument parsing (GH-19955) 2020-05-06 22:22:17 +09:00
getplatform.c
getversion.c
hamt.c bpo-29882: Add _Py_popcount32() function (GH-20518) 2020-06-08 16:30:33 +02:00
hashtable.c bpo-41061: Fix incorrect expressions in hashtable (GH-21028) 2020-06-22 00:41:48 -07:00
import.c bpo-21736: Set __file__ on frozen stdlib modules. (gh-28656) 2021-10-14 15:32:18 -06:00
importdl.c Fix format string in _PyImport_LoadDynamicModuleWithSpec() (GH-28863) 2021-10-12 10:20:04 +03:00
importdl.h
initconfig.c bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914) 2021-10-13 19:25:53 +02:00
makeopcodetargets.py bpo-43760: Check for tracing using 'bitwise or' instead of branch in dispatch. (GH-28723) 2021-10-05 11:01:11 +01:00
marshal.c bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895) 2021-10-12 08:38:19 +02:00
modsupport.c bpo-1635741: Add PyModule_AddObjectRef() function (GH-23122) 2020-11-04 13:59:15 +01:00
mysnprintf.c bpo-36020: Require vsnprintf() to build Python (GH-20899) 2020-06-16 00:54:44 +02:00
mystrtoul.c bpo-37752: Delete redundant Py_CHARMASK in normalizestring() (GH-15095) 2019-09-10 17:04:08 +01:00
opcode_targets.h bpo-45367: Specialize BINARY_MULTIPLY (GH-28727) 2021-10-14 15:56:33 +01:00
pathconfig.c bpo-45471: Do not set PyConfig.stdlib_dir in Py_SetPythonHome(). (gh-28954) 2021-10-14 14:48:32 -06:00
preconfig.c bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914) 2021-10-13 19:25:53 +02:00
pyarena.c bpo-43244: Remove the pyarena.h header (GH-25007) 2021-03-24 02:23:01 +01:00
pyctype.c
pyfpe.c
pyhash.c bpo-29410: Change the default hash algorithm to SipHash13. (GH-28752) 2021-10-10 17:29:46 +09:00
pylifecycle.c bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914) 2021-10-13 19:25:53 +02:00
pymath.c bpo-45440: Require math.h isinf() to build (GH-28894) 2021-10-13 23:27:50 +02:00
pystate.c pycore_pystate.h no longer redefines PyThreadState_GET() (GH-28921) 2021-10-13 14:09:13 +02:00
pystrcmp.c bpo-41524: fix pointer bug in PyOS_mystr{n}icmp (GH-21845) 2020-08-27 14:45:25 +09:00
pystrhex.c bpo-45434: pyport.h no longer includes <stdlib.h> (GH-28914) 2021-10-13 19:25:53 +02:00
pystrtod.c bpo-45412: Move _Py_SET_53BIT_PRECISION_START to pycore_pymath.h (GH-28882) 2021-10-11 23:09:40 +02:00
Python-ast.c bpo-11105: Do not crash when compiling recursive ASTs (GH-20594) 2021-06-03 21:01:02 +01:00
Python-tokenize.c bpo-45434: Mark the PyTokenizer C API as private (GH-28924) 2021-10-13 17:22:14 +02:00
pythonrun.c Fix typos in the Python directory (GH-28767) 2021-10-06 15:55:27 -07:00
pytime.c bpo-45412: Remove Py_SET_ERRNO_ON_MATH_ERROR() macro (GH-28820) 2021-10-11 21:00:25 +02:00
README
specialize.c bpo-45367: Specialize BINARY_MULTIPLY (GH-28727) 2021-10-14 15:56:33 +01:00
stdlib_module_names.h bpo-45085: Remove the binhex module (GH-28117) 2021-09-02 12:10:08 +02:00
structmember.c bpo-44655: Include the name of the type in unset __slots__ attribute errors (GH-27199) 2021-07-17 00:34:46 +01:00
suggestions.c bpo-44590: Lazily allocate frame objects (GH-27077) 2021-07-26 11:22:16 +01:00
symtable.c bpo-33346: Allow async comprehensions inside implicit async comprehensions (GH-6766) 2021-07-13 22:27:50 +01:00
sysmodule.c bpo-45439: Move _PyObject_CallNoArgs() to pycore_call.h (GH-28895) 2021-10-12 08:38:19 +02:00
thread.c bpo-44584: Deprecate PYTHONTHREADDEBUG env var (GH-27065) 2021-08-06 13:11:12 +02:00
thread_nt.h bpo-41710: Add private _PyDeadline_Get() function (GH-28674) 2021-10-01 13:29:25 +02:00
thread_pthread.h bpo-41710: Add private _PyDeadline_Get() function (GH-28674) 2021-10-01 13:29:25 +02:00
traceback.c bpo-45434: Mark the PyTokenizer C API as private (GH-28924) 2021-10-13 17:22:14 +02:00
wordcode_helpers.h

Miscellaneous source files for the main Python shared library