.. highlight:: c

.. _extension-modules:

Defining extension modules
--------------------------

A C extension for CPython is a shared library (for example, a ``.so`` file
on Linux, ``.pyd`` DLL on Windows), which is loadable into the Python process
(for example, it is compiled with compatible compiler settings), and which
exports an :ref:`initialization function <extension-export-hook>`.

To be importable by default (that is, by
:py:class:`importlib.machinery.ExtensionFileLoader`),
the shared library must be available on :py:attr:`sys.path`,
and must be named after the module name plus an extension listed in
:py:attr:`importlib.machinery.EXTENSION_SUFFIXES`.

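As an illustration of the naming rule, the following Python sketch lists the
file names that the default loader would accept for a given module.
The module name ``spam`` and the helper ``candidate_filenames`` are
hypothetical; the suffixes themselves vary by platform and build.

```python
import importlib.machinery


def candidate_filenames(module_name):
    """Return the file names an extension called *module_name* may use."""
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


# ``spam`` is a hypothetical module name used only for illustration.
for filename in candidate_filenames("spam"):
    print(filename)
```

The printed list depends on the interpreter that runs the snippet, so it is
a convenient way to check which names a particular build will look for.
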
.. note::

   Building, packaging and distributing extension modules is best done with
   third-party tools, and is out of scope of this document.
   One suitable tool is Setuptools, whose documentation can be found at
   https://setuptools.pypa.io/en/latest/setuptools.html.

Normally, the initialization function returns a module definition initialized
using :c:func:`PyModuleDef_Init`.
This allows splitting the creation process into several phases:

- Before any substantial code is executed, Python can determine which
  capabilities the module supports, and it can adjust the environment or
  refuse loading an incompatible extension.
- By default, Python itself creates the module object -- that is, it does
  the equivalent of :py:meth:`object.__new__` for classes.
  It also sets initial attributes like :attr:`~module.__package__` and
  :attr:`~module.__loader__`.
- Afterwards, the module object is initialized using extension-specific
  code -- the equivalent of :py:meth:`~object.__init__` on classes.

This is called *multi-phase initialization* to distinguish it from the legacy
(but still supported) *single-phase initialization* scheme,
where the initialization function returns a fully constructed module.
See the :ref:`single-phase-initialization section below <single-phase-initialization>`
for details.

.. versionchanged:: 3.5

   Added support for multi-phase initialization (:pep:`489`).


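The split between creating a module object and initializing it can be
observed from Python through ``importlib``.
The sketch below is only an analogy: it uses the pure-Python standard-library
module ``colorsys`` as a stand-in (an assumption made so the snippet runs
anywhere), on the understanding that the import system drives extension
modules using multi-phase initialization through the same two steps.

```python
import importlib.util

# ``colorsys`` is an arbitrary pure-Python stand-in chosen for illustration.
spec = importlib.util.find_spec("colorsys")

# Creation phase: the import machinery builds the module object and sets
# import-related attributes, but no module-specific code has run yet.
module = importlib.util.module_from_spec(spec)
created_name = module.__name__
populated_before = hasattr(module, "rgb_to_hsv")

# Execution phase: module-specific initialization populates the namespace.
spec.loader.exec_module(module)
populated_after = hasattr(module, "rgb_to_hsv")

print(created_name, populated_before, populated_after)
```
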
Multiple module instances
.........................

By default, extension modules are not singletons.
For example, if the :py:attr:`sys.modules` entry is removed and the module
is re-imported, a new module object is created, and typically populated with
fresh method and type objects.
The old module is subject to normal garbage collection.
This mirrors the behavior of pure-Python modules.

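The pure-Python behavior that extension modules mirror can be seen with a
short experiment.
The module ``colorsys`` is an arbitrary pure-Python stand-in chosen for
illustration; an isolated extension module is expected to behave the same way.

```python
import importlib
import sys

# ``colorsys`` is an arbitrary pure-Python stand-in chosen for illustration.
first = importlib.import_module("colorsys")

# Remove the cached entry so that the next import builds a new module object.
del sys.modules["colorsys"]
second = importlib.import_module("colorsys")

print(first is second)                        # False
print(first.rgb_to_hsv is second.rgb_to_hsv)  # False
```

Both module objects stay usable for as long as something refers to them,
which is why per-module state must not be shared between them.
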
Additional module instances may be created in
:ref:`sub-interpreters <sub-interpreter-support>`
or after Python runtime reinitialization
(:c:func:`Py_Finalize` and :c:func:`Py_Initialize`).
In these cases, sharing Python objects between module instances would likely
cause crashes or undefined behavior.

To avoid such issues, each instance of an extension module should
be *isolated*: changes to one instance should not implicitly affect the others,
and all state owned by the module, including references to Python objects,
should be specific to a particular module instance.
See :ref:`isolating-extensions-howto` for more details and a practical guide.

A simpler way to avoid these issues is
:ref:`raising an error on repeated initialization <isolating-extensions-optout>`.

All modules are expected to support
:ref:`sub-interpreters <sub-interpreter-support>`, or otherwise explicitly
signal a lack of support.
This is usually achieved by isolation or blocking repeated initialization,
as above.
A module may also be limited to the main interpreter using
the :c:data:`Py_mod_multiple_interpreters` slot.


.. _extension-export-hook:

Initialization function
.......................

The initialization function defined by an extension module has the
following signature:

.. c:function:: PyObject* PyInit_modulename(void)

Its name should be :samp:`PyInit_{<name>}`, with ``<name>`` replaced by the
name of the module.

For modules with ASCII-only names, this is the only accepted form.
When using :ref:`multi-phase-initialization`, non-ASCII module names
are allowed. In this case, the initialization function name is
:samp:`PyInitU_{<name>}`, with ``<name>`` encoded using Python's
*punycode* encoding with hyphens replaced by underscores. In Python:

.. code-block:: python

    def initfunc_name(name):
        try:
            # ASCII-only module names use the plain ``PyInit_`` prefix.
            suffix = b'_' + name.encode('ascii')
        except UnicodeEncodeError:
            # Other names use ``PyInitU_`` plus the punycode-encoded name.
            suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
        return b'PyInit' + suffix

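As a quick check of this rule, the function can be applied to an ASCII and
a non-ASCII name.
The module names used here are hypothetical examples, and the function is
repeated only so that the snippet is self-contained.

```python
def initfunc_name(name):
    try:
        suffix = b'_' + name.encode('ascii')
    except UnicodeEncodeError:
        suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
    return b'PyInit' + suffix


# Hypothetical module names: one ASCII-only, one containing U+00E9.
ascii_name = initfunc_name('spam')
unicode_name = initfunc_name('caf\u00e9')

print(ascii_name)    # b'PyInit_spam'
print(unicode_name)  # b'PyInitU_caf_dma'
```
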
It is recommended to define the initialization function using a helper macro:

.. c:macro:: PyMODINIT_FUNC

   Declare an extension module initialization function.
   This macro:

   * specifies the :c:expr:`PyObject*` return type,
   * adds any special linkage declarations required by the platform, and
   * for C++, declares the function as ``extern "C"``.

For example, a module called ``spam`` would be defined like this::

   static struct PyModuleDef spam_module = {
       .m_base = PyModuleDef_HEAD_INIT,
       .m_name = "spam",
       ...
   };

   PyMODINIT_FUNC
   PyInit_spam(void)
   {
       return PyModuleDef_Init(&spam_module);
   }

It is possible to export multiple modules from a single shared library by
defining multiple initialization functions. However, importing them requires
using symbolic links or a custom importer, because by default only the
function corresponding to the filename is found.
See the `Multiple modules in one library <https://peps.python.org/pep-0489/#multiple-modules-in-one-library>`__
section in :pep:`489` for details.

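One possible shape for such a custom import step, following the approach
outlined in PEP 489, is sketched below.
The helper names ``make_extension_spec`` and ``load_extension``, as well as
the module name and path in the usage comment, are hypothetical.

```python
import importlib.machinery
import importlib.util


def make_extension_spec(name, path):
    """Build a spec that loads module *name* from the shared library *path*.

    The initialization function is looked up from *name*, regardless of
    what the file itself is called.
    """
    loader = importlib.machinery.ExtensionFileLoader(name, path)
    return importlib.util.spec_from_loader(name, loader)


def load_extension(name, path):
    """Create and execute the extension module described by the spec."""
    spec = make_extension_spec(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Hypothetical usage: import ``eggs`` from a library named after ``spam``.
# eggs = load_extension("eggs", "/path/to/spam.so")
```

Building the spec does not touch the file; the shared library is only opened
when the module is created from the spec.
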
The initialization function is typically the only non-\ ``static``
item defined in the module's C source.


.. _multi-phase-initialization:

Multi-phase initialization
..........................

Normally, the :ref:`initialization function <extension-export-hook>`
(``PyInit_modulename``) returns a :c:type:`PyModuleDef` instance with
non-``NULL`` :c:member:`~PyModuleDef.m_slots`.
Before it is returned, the ``PyModuleDef`` instance must be initialized
using the following function:


.. c:function:: PyObject* PyModuleDef_Init(PyModuleDef *def)

   Ensure a module definition is a properly initialized Python object that
   correctly reports its type and a reference count.

   Return *def* cast to ``PyObject*``, or ``NULL`` if an error occurred.

   Calling this function is required for :ref:`multi-phase-initialization`.
   It should not be used in other contexts.

   Note that Python assumes that ``PyModuleDef`` structures are statically
   allocated.
   This function may return either a new reference or a borrowed one;
   this reference must not be released.

   .. versionadded:: 3.5


.. _single-phase-initialization:

Legacy single-phase initialization
..................................

.. attention::
   Single-phase initialization is a legacy mechanism to initialize extension
   modules, with known drawbacks and design flaws. Extension module authors
   are encouraged to use multi-phase initialization instead.

In single-phase initialization, the
:ref:`initialization function <extension-export-hook>` (``PyInit_modulename``)
should create, populate and return a module object.
This is typically done using :c:func:`PyModule_Create` and functions like
:c:func:`PyModule_AddObjectRef`.

Single-phase initialization differs from the :ref:`default <multi-phase-initialization>`
in the following ways:

* Single-phase modules are, or rather *contain*, “singletons”.

  When the module is first initialized, Python saves the contents of
  the module's ``__dict__`` (that is, typically, the module's functions and
  types).

  For subsequent imports, Python does not call the initialization function
  again.
  Instead, it creates a new module object with a new ``__dict__``, and copies
  the saved contents to it.
  For example, given a single-phase module ``_testsinglephase``
  [#testsinglephase]_ that defines a function ``sum`` and an exception class
  ``error``:

  .. code-block:: python

     >>> import sys
     >>> import _testsinglephase as one
     >>> del sys.modules['_testsinglephase']
     >>> import _testsinglephase as two
     >>> one is two
     False
     >>> one.__dict__ is two.__dict__
     False
     >>> one.sum is two.sum
     True
     >>> one.error is two.error
     True

  The exact behavior should be considered a CPython implementation detail.

* To work around the fact that ``PyInit_modulename`` does not take a *spec*
  argument, some state of the import machinery is saved and applied to the
  first suitable module created during the ``PyInit_modulename`` call.
  Specifically, when a sub-module is imported, this mechanism prepends the
  parent package name to the name of the module.

  A single-phase ``PyInit_modulename`` function should create “its” module
  object as soon as possible, before any other module objects can be created.

* Non-ASCII module names (``PyInitU_modulename``) are not supported.

* Single-phase modules support module lookup functions like
  :c:func:`PyState_FindModule`.

.. [#testsinglephase] ``_testsinglephase`` is an internal module used
   in CPython's self-test suite; your installation may or may not
   include it.