mirror of
https://github.com/python/cpython.git
synced 2025-09-26 10:19:53 +00:00
gh-96143: Improve perf profiler docs (#96445)
parent 22863df7ca
commit 723ebe76e7
6 changed files with 118 additions and 50 deletions
|
@ -8,10 +8,11 @@ Python support for the Linux ``perf`` profiler
|
||||||
|
|
||||||
:author: Pablo Galindo
|
:author: Pablo Galindo
|
||||||
|
|
||||||
The Linux ``perf`` profiler is a very powerful tool that allows you to profile and
|
`The Linux perf profiler <https://perf.wiki.kernel.org>`_
|
||||||
obtain information about the performance of your application. ``perf`` also has
|
is a very powerful tool that allows you to profile and obtain
|
||||||
a very vibrant ecosystem of tools that aid with the analysis of the data that it
|
information about the performance of your application.
|
||||||
produces.
|
``perf`` also has a very vibrant ecosystem of tools
|
||||||
|
that aid with the analysis of the data that it produces.
|
||||||
|
|
||||||
The main problem with using the ``perf`` profiler with Python applications is that
|
The main problem with using the ``perf`` profiler with Python applications is that
|
||||||
``perf`` only allows to get information about native symbols, this is, the names of
|
``perf`` only allows to get information about native symbols, this is, the names of
|
||||||
|
@ -25,7 +26,7 @@ fly before the execution of every Python function and it will teach ``perf`` the
|
||||||
relationship between this piece of code and the associated Python function using
|
relationship between this piece of code and the associated Python function using
|
||||||
`perf map files`_.
|
`perf map files`_.
|
||||||
|
|
||||||
.. warning::
|
.. note::
|
||||||
|
|
||||||
Support for the ``perf`` profiler is only currently available for Linux on
|
Support for the ``perf`` profiler is only currently available for Linux on
|
||||||
selected architectures. Check the output of the configure build step or
|
selected architectures. Check the output of the configure build step or
|
||||||
|
@ -51,11 +52,11 @@ For example, consider the following script:
|
||||||
if __name__ == "__main__":
|
if __name__ == "__main__":
|
||||||
baz(1000000)
|
baz(1000000)
|
||||||
|
|
||||||
We can run perf to sample CPU stack traces at 9999 Hertz:
|
We can run ``perf`` to sample CPU stack traces at 9999 Hertz::
|
||||||
|
|
||||||
$ perf record -F 9999 -g -o perf.data python my_script.py
|
$ perf record -F 9999 -g -o perf.data python my_script.py
|
||||||
|
|
||||||
Then we can use perf report to analyze the data:
|
Then we can use ``perf`` report to analyze the data:
|
||||||
|
|
||||||
.. code-block:: shell-session
|
.. code-block:: shell-session
|
||||||
|
|
||||||
|
@ -101,7 +102,7 @@ As you can see here, the Python functions are not shown in the output, only ``_P
|
||||||
functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
|
functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
|
||||||
bytecode-evaluating function.
|
bytecode-evaluating function.
|
||||||
|
|
||||||
Instead, if we run the same experiment with perf support activated we get:
|
Instead, if we run the same experiment with ``perf`` support enabled we get:
|
||||||
|
|
||||||
.. code-block:: shell-session
|
.. code-block:: shell-session
|
||||||
|
|
||||||
|
@ -147,52 +148,58 @@ Instead, if we run the same experiment with perf support activated we get:
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Enabling perf profiling mode
|
How to enable ``perf`` profiling support
|
||||||
----------------------------
|
----------------------------------------
|
||||||
|
|
||||||
There are two main ways to activate the perf profiling mode. If you want it to be
|
``perf`` profiling support can either be enabled from the start using
|
||||||
active since the start of the Python interpreter, you can use the ``-Xperf`` option:
|
the environment variable :envvar:`PYTHONPERFSUPPORT` or the
|
||||||
|
:option:`-X perf <-X>` option,
|
||||||
|
or dynamically using :func:`sys.activate_stack_trampoline` and
|
||||||
|
:func:`sys.deactivate_stack_trampoline`.
|
||||||
|
|
||||||
$ python -Xperf my_script.py
|
The :mod:`!sys` functions take precedence over the :option:`!-X` option,
|
||||||
|
the :option:`!-X` option takes precedence over the environment variable.
|
||||||
|
|
||||||
You can also set the :envvar:`PYTHONPERFSUPPORT` to a nonzero value to actiavate perf
|
Example, using the environment variable::
|
||||||
profiling mode globally.
|
|
||||||
|
|
||||||
There is also support for dynamically activating and deactivating the perf
|
$ PYTHONPERFSUPPORT=1
|
||||||
profiling mode by using the APIs in the :mod:`sys` module:
|
$ python script.py
|
||||||
|
$ perf report -g -i perf.data
|
||||||
|
|
||||||
|
Example, using the :option:`!-X` option::
|
||||||
|
|
||||||
|
$ python -X perf script.py
|
||||||
|
$ perf report -g -i perf.data
|
||||||
|
|
||||||
|
Example, using the :mod:`sys` APIs in file :file:`example.py`:
|
||||||
|
|
||||||
.. code-block:: python
|
.. code-block:: python
|
||||||
|
|
||||||
import sys
|
import sys
|
||||||
sys.activate_stack_trampoline("perf")
|
|
||||||
|
|
||||||
# Run some code with Perf profiling active
|
sys.activate_stack_trampoline("perf")
|
||||||
|
do_profiled_stuff()
|
||||||
|
sys.deactivate_stack_trampoline()
|
||||||
|
|
||||||
sys.deactivate_stack_trampoline()
|
non_profiled_stuff()
|
||||||
|
|
||||||
# Perf profiling is not active anymore
|
...then::
|
||||||
|
|
||||||
These APIs can be handy if you want to activate/deactivate profiling mode in
|
$ python ./example.py
|
||||||
response to a signal or other communication mechanism with your process.
|
$ perf report -g -i perf.data
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
Now we can analyze the data with ``perf report``:
|
|
||||||
|
|
||||||
$ perf report -g -i perf.data
|
|
||||||
|
|
||||||
|
|
||||||
How to obtain the best results
|
How to obtain the best results
|
||||||
-------------------------------
|
------------------------------
|
||||||
|
|
||||||
For the best results, Python should be compiled with
|
For the best results, Python should be compiled with
|
||||||
``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
|
``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
|
||||||
profilers to unwind using only the frame pointer and not on DWARF debug
|
profilers to unwind using only the frame pointer and not on DWARF debug
|
||||||
information. This is because as the code that is interposed to allow perf
|
information. This is because as the code that is interposed to allow ``perf``
|
||||||
support is dynamically generated it doesn't have any DWARF debugging information
|
support is dynamically generated it doesn't have any DWARF debugging information
|
||||||
available.
|
available.
|
||||||
|
|
||||||
You can check if you system has been compiled with this flag by running:
|
You can check if your system has been compiled with this flag by running::
|
||||||
|
|
||||||
$ python -m sysconfig | grep 'no-omit-frame-pointer'
|
$ python -m sysconfig | grep 'no-omit-frame-pointer'
|
||||||
|
|
||||||
|
|
|
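A minimal sketch, not part of this commit, of how the dynamic `sys` APIs described in the how-to above might be wrapped so that the trampoline is always switched off again. The helper name `perf_trampoline` and the set of exceptions caught are assumptions for illustration; only `sys.activate_stack_trampoline("perf")` and `sys.deactivate_stack_trampoline()` come from the documentation in this diff.

```python
import contextlib
import sys


@contextlib.contextmanager
def perf_trampoline(backend="perf"):
    """Hypothetical helper: enable the stack trampoline for one block, if possible."""
    activate = getattr(sys, "activate_stack_trampoline", None)
    deactivate = getattr(sys, "deactivate_stack_trampoline", None)
    enabled = False
    if activate is not None and deactivate is not None:
        try:
            activate(backend)
            enabled = True
        except (ValueError, OSError, RuntimeError):
            # Assumption: builds without trampoline support, or a failure to
            # set the trampoline up, surface as one of these exceptions.
            enabled = False
    try:
        # Tell the caller whether profiling support is really on.
        yield enabled
    finally:
        if enabled:
            deactivate()


def do_profiled_stuff():
    # Stand-in workload; any Python code could go here.
    return sum(i * i for i in range(1000))


with perf_trampoline() as enabled:
    result = do_profiled_stuff()

print("trampoline enabled:", enabled)
```

On interpreters older than 3.12, or on platforms without support, the block still runs and `enabled` is simply `False`.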
@ -1555,6 +1555,38 @@ always available.
|
||||||
This function has been added on a provisional basis (see :pep:`411`
|
This function has been added on a provisional basis (see :pep:`411`
|
||||||
for details.) Use it only for debugging purposes.
|
for details.) Use it only for debugging purposes.
|
||||||
|
|
||||||
|
.. function:: activate_stack_trampoline(backend, /)
|
||||||
|
|
||||||
|
Activate the stack profiler trampoline *backend*.
|
||||||
|
The only supported backend is ``"perf"``.
|
||||||
|
|
||||||
|
.. availability:: Linux.
|
||||||
|
|
||||||
|
.. versionadded:: 3.12
|
||||||
|
|
||||||
|
.. seealso::
|
||||||
|
|
||||||
|
* :ref:`perf_profiling`
|
||||||
|
* https://perf.wiki.kernel.org
|
||||||
|
|
||||||
|
.. function:: deactivate_stack_trampoline()
|
||||||
|
|
||||||
|
Deactivate the current stack profiler trampoline backend.
|
||||||
|
|
||||||
|
If no stack profiler is activated, this function has no effect.
|
||||||
|
|
||||||
|
.. availability:: Linux.
|
||||||
|
|
||||||
|
.. versionadded:: 3.12
|
||||||
|
|
||||||
|
.. function:: is_stack_trampoline_active()
|
||||||
|
|
||||||
|
Return ``True`` if a stack profiler trampoline is active.
|
||||||
|
|
||||||
|
.. availability:: Linux.
|
||||||
|
|
||||||
|
.. versionadded:: 3.12
|
||||||
|
|
||||||
.. function:: _enablelegacywindowsfsencoding()
|
.. function:: _enablelegacywindowsfsencoding()
|
||||||
|
|
||||||
Changes the :term:`filesystem encoding and error handler` to 'mbcs' and
|
Changes the :term:`filesystem encoding and error handler` to 'mbcs' and
|
||||||
|
|
|
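A short sketch, not part of this commit, of querying the state exposed by `sys.is_stack_trampoline_active()`. The function name `trampoline_status` and the three status strings are hypothetical; the `getattr` fallback is there because the function was only added in 3.12.

```python
import sys


def trampoline_status():
    """Hypothetical helper: describe the stack trampoline state as a string."""
    is_active = getattr(sys, "is_stack_trampoline_active", None)
    if is_active is None:
        # The interpreter predates 3.12, so the API does not exist.
        return "unavailable"
    return "active" if is_active() else "inactive"


print(trampoline_status())
```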
@ -538,12 +538,11 @@ Miscellaneous options
|
||||||
development (running from the source tree) then the default is "off".
|
development (running from the source tree) then the default is "off".
|
||||||
Note that the "importlib_bootstrap" and "importlib_bootstrap_external"
|
Note that the "importlib_bootstrap" and "importlib_bootstrap_external"
|
||||||
frozen modules are always used, even if this flag is set to "off".
|
frozen modules are always used, even if this flag is set to "off".
|
||||||
* ``-X perf`` to activate compatibility mode with the ``perf`` profiler.
|
* ``-X perf`` enables support for the Linux ``perf`` profiler.
|
||||||
When this option is activated, the Linux ``perf`` profiler will be able to
|
When this option is provided, the ``perf`` profiler will be able to
|
||||||
report Python calls. This option is only available on some platforms and
|
report Python calls. This option is only available on some platforms and
|
||||||
will do nothing if is not supported on the current system. The default value
|
will do nothing if is not supported on the current system. The default value
|
||||||
is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`
|
is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`.
|
||||||
for more information.
|
|
||||||
|
|
||||||
It also allows passing arbitrary values and retrieving them through the
|
It also allows passing arbitrary values and retrieving them through the
|
||||||
:data:`sys._xoptions` dictionary.
|
:data:`sys._xoptions` dictionary.
|
||||||
|
@ -1048,9 +1047,13 @@ conflict.
|
||||||
|
|
||||||
.. envvar:: PYTHONPERFSUPPORT
|
.. envvar:: PYTHONPERFSUPPORT
|
||||||
|
|
||||||
If this variable is set to a nonzero value, it activates compatibility mode
|
If this variable is set to a nonzero value, it enables support for
|
||||||
with the ``perf`` profiler so Python calls can be detected by it. See the
|
the Linux ``perf`` profiler so Python calls can be detected by it.
|
||||||
:ref:`perf_profiling` section for more information.
|
|
||||||
|
If set to ``0``, disable Linux ``perf`` profiler support.
|
||||||
|
|
||||||
|
See also the :option:`-X perf <-X>` command-line option
|
||||||
|
and :ref:`perf_profiling`.
|
||||||
|
|
||||||
.. versionadded:: 3.12
|
.. versionadded:: 3.12
|
||||||
|
|
||||||
|
|
|
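A hedged sketch, not part of this commit, of checking what `PYTHONPERFSUPPORT` does to a freshly started interpreter. The variable only has an effect if it is present in the environment of the Python process itself (exported, or set on the same command line), so the example passes it explicitly to a child process. The helper name `child_trampoline_state` is hypothetical.

```python
import os
import subprocess
import sys

# Code run in the child: report the trampoline state, or None if the
# is_stack_trampoline_active() API does not exist in that interpreter.
CHILD_CODE = (
    "import sys\n"
    "f = getattr(sys, 'is_stack_trampoline_active', None)\n"
    "print(f() if f is not None else None)\n"
)


def child_trampoline_state(value="1"):
    """Hypothetical helper: start a child interpreter with PYTHONPERFSUPPORT set."""
    env = dict(os.environ, PYTHONPERFSUPPORT=value)
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


print("child reports:", child_trampoline_state())
```

The child prints `True` only on builds where the trampoline is supported; elsewhere it prints `False` or `None`, which matches the note that the option does nothing on unsupported systems.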
@ -74,6 +74,15 @@ Important deprecations, removals or restrictions:
|
||||||
New Features
|
New Features
|
||||||
============
|
============
|
||||||
|
|
||||||
|
* Add :ref:`perf_profiling` through the new
|
||||||
|
environment variable :envvar:`PYTHONPERFSUPPORT`,
|
||||||
|
the new command-line option :option:`-X perf <-X>`,
|
||||||
|
as well as the new :func:`sys.activate_stack_trampoline`,
|
||||||
|
:func:`sys.deactivate_stack_trampoline`,
|
||||||
|
and :func:`sys.is_stack_trampoline_active` APIs.
|
||||||
|
(Design by Pablo Galindo. Contributed by Pablo Galindo and Christian Heimes
|
||||||
|
with contributions from Gregory P. Smith [Google] and Mark Shannon
|
||||||
|
in :gh:`96123`.)
|
||||||
|
|
||||||
|
|
||||||
Other Language Changes
|
Other Language Changes
|
||||||
|
@ -194,6 +203,19 @@ tempfile
|
||||||
The :class:`tempfile.NamedTemporaryFile` function has a new optional parameter
|
The :class:`tempfile.NamedTemporaryFile` function has a new optional parameter
|
||||||
*delete_on_close* (Contributed by Evgeny Zorin in :gh:`58451`.)
|
*delete_on_close* (Contributed by Evgeny Zorin in :gh:`58451`.)
|
||||||
|
|
||||||
|
sys
|
||||||
|
---
|
||||||
|
|
||||||
|
* Add :func:`sys.activate_stack_trampoline` and
|
||||||
|
:func:`sys.deactivate_stack_trampoline` for activating and deactivating
|
||||||
|
stack profiler trampolines,
|
||||||
|
and :func:`sys.is_stack_trampoline_active` for querying if stack profiler
|
||||||
|
trampolines are active.
|
||||||
|
(Contributed by Pablo Galindo and Christian Heimes
|
||||||
|
with contributions from Gregory P. Smith [Google] and Mark Shannon
|
||||||
|
in :gh:`96123`.)
|
||||||
|
|
||||||
|
|
||||||
Optimizations
|
Optimizations
|
||||||
=============
|
=============
|
||||||
|
|
||||||
|
|
10
Python/clinic/sysmodule.c.h
generated
|
@ -1231,7 +1231,7 @@ PyDoc_STRVAR(sys_activate_stack_trampoline__doc__,
|
||||||
"activate_stack_trampoline($module, backend, /)\n"
|
"activate_stack_trampoline($module, backend, /)\n"
|
||||||
"--\n"
|
"--\n"
|
||||||
"\n"
|
"\n"
|
||||||
"Activate the perf profiler trampoline.");
|
"Activate stack profiler trampoline *backend*.");
|
||||||
|
|
||||||
#define SYS_ACTIVATE_STACK_TRAMPOLINE_METHODDEF \
|
#define SYS_ACTIVATE_STACK_TRAMPOLINE_METHODDEF \
|
||||||
{"activate_stack_trampoline", (PyCFunction)sys_activate_stack_trampoline, METH_O, sys_activate_stack_trampoline__doc__},
|
{"activate_stack_trampoline", (PyCFunction)sys_activate_stack_trampoline, METH_O, sys_activate_stack_trampoline__doc__},
|
||||||
|
@ -1268,7 +1268,9 @@ PyDoc_STRVAR(sys_deactivate_stack_trampoline__doc__,
|
||||||
"deactivate_stack_trampoline($module, /)\n"
|
"deactivate_stack_trampoline($module, /)\n"
|
||||||
"--\n"
|
"--\n"
|
||||||
"\n"
|
"\n"
|
||||||
"Dectivate the perf profiler trampoline.");
|
"Deactivate the current stack profiler trampoline backend.\n"
|
||||||
|
"\n"
|
||||||
|
"If no stack profiler is activated, this function has no effect.");
|
||||||
|
|
||||||
#define SYS_DEACTIVATE_STACK_TRAMPOLINE_METHODDEF \
|
#define SYS_DEACTIVATE_STACK_TRAMPOLINE_METHODDEF \
|
||||||
{"deactivate_stack_trampoline", (PyCFunction)sys_deactivate_stack_trampoline, METH_NOARGS, sys_deactivate_stack_trampoline__doc__},
|
{"deactivate_stack_trampoline", (PyCFunction)sys_deactivate_stack_trampoline, METH_NOARGS, sys_deactivate_stack_trampoline__doc__},
|
||||||
|
@ -1286,7 +1288,7 @@ PyDoc_STRVAR(sys_is_stack_trampoline_active__doc__,
|
||||||
"is_stack_trampoline_active($module, /)\n"
|
"is_stack_trampoline_active($module, /)\n"
|
||||||
"--\n"
|
"--\n"
|
||||||
"\n"
|
"\n"
|
||||||
"Returns *True* if the perf profiler trampoline is active.");
|
"Return *True* if a stack profiler trampoline is active.");
|
||||||
|
|
||||||
#define SYS_IS_STACK_TRAMPOLINE_ACTIVE_METHODDEF \
|
#define SYS_IS_STACK_TRAMPOLINE_ACTIVE_METHODDEF \
|
||||||
{"is_stack_trampoline_active", (PyCFunction)sys_is_stack_trampoline_active, METH_NOARGS, sys_is_stack_trampoline_active__doc__},
|
{"is_stack_trampoline_active", (PyCFunction)sys_is_stack_trampoline_active, METH_NOARGS, sys_is_stack_trampoline_active__doc__},
|
||||||
|
@ -1343,4 +1345,4 @@ sys_is_stack_trampoline_active(PyObject *module, PyObject *Py_UNUSED(ignored))
|
||||||
#ifndef SYS_GETANDROIDAPILEVEL_METHODDEF
|
#ifndef SYS_GETANDROIDAPILEVEL_METHODDEF
|
||||||
#define SYS_GETANDROIDAPILEVEL_METHODDEF
|
#define SYS_GETANDROIDAPILEVEL_METHODDEF
|
||||||
#endif /* !defined(SYS_GETANDROIDAPILEVEL_METHODDEF) */
|
#endif /* !defined(SYS_GETANDROIDAPILEVEL_METHODDEF) */
|
||||||
/*[clinic end generated code: output=15318cdd96b62b06 input=a9049054013a1b77]*/
|
/*[clinic end generated code: output=2b5e1bc24a3348bd input=a9049054013a1b77]*/
|
||||||
|
|
|
@ -2127,12 +2127,12 @@ sys.activate_stack_trampoline
|
||||||
backend: str
|
backend: str
|
||||||
/
|
/
|
||||||
|
|
||||||
Activate the perf profiler trampoline.
|
Activate stack profiler trampoline *backend*.
|
||||||
[clinic start generated code]*/
|
[clinic start generated code]*/
|
||||||
|
|
||||||
static PyObject *
|
static PyObject *
|
||||||
sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
|
sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
|
||||||
/*[clinic end generated code: output=5783cdeb51874b43 input=b09020e3a17c78c5]*/
|
/*[clinic end generated code: output=5783cdeb51874b43 input=a12df928758a82b4]*/
|
||||||
{
|
{
|
||||||
#ifdef PY_HAVE_PERF_TRAMPOLINE
|
#ifdef PY_HAVE_PERF_TRAMPOLINE
|
||||||
if (strcmp(backend, "perf") == 0) {
|
if (strcmp(backend, "perf") == 0) {
|
||||||
|
@ -2163,12 +2163,14 @@ sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
|
||||||
/*[clinic input]
|
/*[clinic input]
|
||||||
sys.deactivate_stack_trampoline
|
sys.deactivate_stack_trampoline
|
||||||
|
|
||||||
Dectivate the perf profiler trampoline.
|
Deactivate the current stack profiler trampoline backend.
|
||||||
|
|
||||||
|
If no stack profiler is activated, this function has no effect.
|
||||||
[clinic start generated code]*/
|
[clinic start generated code]*/
|
||||||
|
|
||||||
static PyObject *
|
static PyObject *
|
||||||
sys_deactivate_stack_trampoline_impl(PyObject *module)
|
sys_deactivate_stack_trampoline_impl(PyObject *module)
|
||||||
/*[clinic end generated code: output=b50da25465df0ef1 input=491f4fc1ed615736]*/
|
/*[clinic end generated code: output=b50da25465df0ef1 input=9f629a6be9fe7fc8]*/
|
||||||
{
|
{
|
||||||
if (_PyPerfTrampoline_Init(0) < 0) {
|
if (_PyPerfTrampoline_Init(0) < 0) {
|
||||||
return NULL;
|
return NULL;
|
||||||
|
@ -2179,12 +2181,12 @@ sys_deactivate_stack_trampoline_impl(PyObject *module)
|
||||||
/*[clinic input]
|
/*[clinic input]
|
||||||
sys.is_stack_trampoline_active
|
sys.is_stack_trampoline_active
|
||||||
|
|
||||||
Returns *True* if the perf profiler trampoline is active.
|
Return *True* if a stack profiler trampoline is active.
|
||||||
[clinic start generated code]*/
|
[clinic start generated code]*/
|
||||||
|
|
||||||
static PyObject *
|
static PyObject *
|
||||||
sys_is_stack_trampoline_active_impl(PyObject *module)
|
sys_is_stack_trampoline_active_impl(PyObject *module)
|
||||||
/*[clinic end generated code: output=ab2746de0ad9d293 input=061fa5776ac9dd59]*/
|
/*[clinic end generated code: output=ab2746de0ad9d293 input=29616b7bf6a0b703]*/
|
||||||
{
|
{
|
||||||
#ifdef PY_HAVE_PERF_TRAMPOLINE
|
#ifdef PY_HAVE_PERF_TRAMPOLINE
|
||||||
if (_PyIsPerfTrampolineActive()) {
|
if (_PyIsPerfTrampolineActive()) {
|
||||||
|
|