.. highlight:: c

.. _extension-modules:

Defining extension modules
--------------------------

A C extension for CPython is a shared library (for example, a ``.so`` file
on Linux, ``.pyd`` DLL on Windows), which is loadable into the Python process
(for example, it is compiled with compatible compiler settings), and which
exports an :ref:`initialization function <extension-export-hook>`.

To be importable by default (that is, by
:py:class:`importlib.machinery.ExtensionFileLoader`),
the shared library must be available on :py:attr:`sys.path`,
and must be named after the module name plus an extension listed in
:py:attr:`importlib.machinery.EXTENSION_SUFFIXES`.

.. note::

   Building, packaging and distributing extension modules is best done with
   third-party tools, and is out of scope of this document.
   One suitable tool is Setuptools, whose documentation can be found at
   https://setuptools.pypa.io/en/latest/setuptools.html.

Normally, the initialization function returns a module definition
initialized using :c:func:`PyModuleDef_Init`.
This allows splitting the creation process into several phases:

- Before any substantial code is executed, Python can determine which
  capabilities the module supports, and it can adjust the environment or
  refuse loading an incompatible extension.
- By default, Python itself creates the module object -- that is, it does
  the equivalent of :py:meth:`object.__new__` for classes.
  It also sets initial attributes like :attr:`~module.__package__` and
  :attr:`~module.__loader__`.
- Afterwards, the module object is initialized using extension-specific
  code -- the equivalent of :py:meth:`~object.__init__` on classes.

This is called *multi-phase initialization* to distinguish it from the legacy
(but still supported) *single-phase initialization* scheme,
where the initialization function returns a fully constructed module.
See the :ref:`single-phase-initialization section below <single-phase-initialization>`
for details.

.. versionchanged:: 3.5

   Added support for multi-phase initialization (:pep:`489`).
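
The file-naming rule given at the start of this section can be explored from
Python. The following is a minimal sketch, not part of the C API: the helper
name ``candidate_filenames`` is hypothetical, and the sketch assumes that only
the last component of a dotted module name is used for the file name.
It combines a module name with each suffix listed in
:py:attr:`importlib.machinery.EXTENSION_SUFFIXES`:

.. code-block:: python

   import importlib.machinery

   def candidate_filenames(fullname):
       """Return the file names that would match *fullname* (hypothetical helper)."""
       # Assumption: for a sub-module such as ``pkg.spam``, only ``spam``
       # is used to name the shared library.
       stem = fullname.rpartition('.')[2]
       # One candidate per suffix accepted on this platform and build.
       return [stem + suffix
               for suffix in importlib.machinery.EXTENSION_SUFFIXES]

   print(candidate_filenames('spam'))

The printed list depends on the platform and on how Python was built, so no
particular output should be expected.
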

Multiple module instances
.........................

By default, extension modules are not singletons.
For example, if the :py:attr:`sys.modules` entry is removed and the module
is re-imported, a new module object is created, and typically populated with
fresh method and type objects.
The old module is subject to normal garbage collection.
This mirrors the behavior of pure-Python modules.

Additional module instances may be created in
:ref:`sub-interpreters `
or after Python runtime reinitialization
(:c:func:`Py_Finalize` and :c:func:`Py_Initialize`).
In these cases, sharing Python objects between module instances would likely
cause crashes or undefined behavior.

To avoid such issues, each instance of an extension module should
be *isolated*: changes to one instance should not implicitly affect the others,
and all state owned by the module, including references to Python objects,
should be specific to a particular module instance.
See :ref:`isolating-extensions-howto` for more details and a practical guide.

A simpler way to avoid these issues is
:ref:`raising an error on repeated initialization `.

All modules are expected to support
:ref:`sub-interpreters `, or otherwise explicitly
signal a lack of support.
This is usually achieved by isolation or blocking repeated initialization,
as above.
A module may also be limited to the main interpreter using
the :c:data:`Py_mod_multiple_interpreters` slot.


.. _extension-export-hook:

Initialization function
.......................

The initialization function defined by an extension module has the
following signature:

.. c:function:: PyObject* PyInit_modulename(void)

For modules with ASCII-only names, the function must be named
:samp:`PyInit_{<name>}`, with ``<name>`` replaced by the name of the module.
When using :ref:`multi-phase-initialization`, non-ASCII module names
are allowed.
In this case, the initialization function name is :samp:`PyInitU_{<name>}`,
with ``<name>`` encoded using Python's *punycode* encoding with hyphens
replaced by underscores. In Python:

.. code-block:: python

    def initfunc_name(name):
        try:
            # ASCII-only names follow the "PyInit_" prefix unchanged.
            suffix = b'_' + name.encode('ascii')
        except UnicodeEncodeError:
            # Non-ASCII names are punycode-encoded, with hyphens
            # replaced by underscores, after the "PyInitU_" prefix.
            suffix = b'U_' + name.encode('punycode').replace(b'-', b'_')
        return b'PyInit' + suffix

It is recommended to define the initialization function using a helper macro:

.. c:macro:: PyMODINIT_FUNC

   Declare an extension module initialization function.
   This macro:

   * specifies the :c:expr:`PyObject*` return type,
   * adds any special linkage declarations required by the platform, and
   * for C++, declares the function as ``extern "C"``.

For example, a module called ``spam`` would be defined like this::

   static struct PyModuleDef spam_module = {
      .m_base = PyModuleDef_HEAD_INIT,
      .m_name = "spam",
      ...
   };

   PyMODINIT_FUNC
   PyInit_spam(void)
   {
      return PyModuleDef_Init(&spam_module);
   }

It is possible to export multiple modules from a single shared library by
defining multiple initialization functions. However, importing them requires
using symbolic links or a custom importer, because by default only the
function corresponding to the filename is found.
See the `Multiple modules in one library `__
section in :pep:`489` for details.

The initialization function is typically the only non-\ ``static``
item defined in the module's C source.


.. _multi-phase-initialization:

Multi-phase initialization
..........................

Normally, the :ref:`initialization function <extension-export-hook>`
(``PyInit_modulename``) returns a :c:type:`PyModuleDef` instance with
non-``NULL`` :c:member:`~PyModuleDef.m_slots`.
Before it is returned, the ``PyModuleDef`` instance must be initialized
using the following function:

.. c:function:: PyObject* PyModuleDef_Init(PyModuleDef *def)

   Ensure a module definition is a properly initialized Python object that
   correctly reports its type and a reference count.

   Return *def* cast to ``PyObject*``, or ``NULL`` if an error occurred.

   Calling this function is required for :ref:`multi-phase-initialization`.
   It should not be used in other contexts.

   Note that Python assumes that ``PyModuleDef`` structures are statically
   allocated.
   This function may return either a new reference or a borrowed one;
   this reference must not be released.

   .. versionadded:: 3.5


.. _single-phase-initialization:

Legacy single-phase initialization
..................................

.. attention::

   Single-phase initialization is a legacy mechanism to initialize extension
   modules, with known drawbacks and design flaws. Extension module authors
   are encouraged to use multi-phase initialization instead.

In single-phase initialization, the
:ref:`initialization function <extension-export-hook>` (``PyInit_modulename``)
should create, populate and return a module object.
This is typically done using :c:func:`PyModule_Create` and functions like
:c:func:`PyModule_AddObjectRef`.

Single-phase initialization differs from the
:ref:`default <multi-phase-initialization>` in the following ways:

* Single-phase modules are, or rather *contain*, “singletons”.

  When the module is first initialized, Python saves the contents of
  the module's ``__dict__`` (that is, typically, the module's functions and
  types).

  For subsequent imports, Python does not call the initialization function
  again.
  Instead, it creates a new module object with a new ``__dict__``, and copies
  the saved contents to it.
  For example, given a single-phase module ``_testsinglephase``
  [#testsinglephase]_ that defines a function ``sum`` and an exception class
  ``error``:

  .. code-block:: python

     >>> import sys
     >>> import _testsinglephase as one
     >>> del sys.modules['_testsinglephase']
     >>> import _testsinglephase as two
     >>> one is two
     False
     >>> one.__dict__ is two.__dict__
     False
     >>> one.sum is two.sum
     True
     >>> one.error is two.error
     True

  The exact behavior should be considered a CPython implementation detail.

* To work around the fact that ``PyInit_modulename`` does not take a *spec*
  argument, some state of the import machinery is saved and applied to the
  first suitable module created during the ``PyInit_modulename`` call.
  Specifically, when a sub-module is imported, this mechanism prepends the
  parent package name to the name of the module.

  A single-phase ``PyInit_modulename`` function should create “its” module
  object as soon as possible, before any other module objects can be created.

* Non-ASCII module names (``PyInitU_modulename``) are not supported.

* Single-phase modules support module lookup functions like
  :c:func:`PyState_FindModule`.

.. [#testsinglephase] ``_testsinglephase`` is an internal module used
   in CPython's self-test suite; your installation may or may not
   include it.