mirror of
https://github.com/python/cpython.git
synced 2025-09-26 10:19:53 +00:00
gh-96143: Improve perf profiler docs (#96445)
parent
22863df7ca
commit
723ebe76e7
6 changed files with 118 additions and 50 deletions
|
@ -8,10 +8,11 @@ Python support for the Linux ``perf`` profiler
|
|||
|
||||
:author: Pablo Galindo
|
||||
|
||||
The Linux ``perf`` profiler is a very powerful tool that allows you to profile and
|
||||
obtain information about the performance of your application. ``perf`` also has
|
||||
a very vibrant ecosystem of tools that aid with the analysis of the data that it
|
||||
produces.
|
||||
`The Linux perf profiler <https://perf.wiki.kernel.org>`_
|
||||
is a very powerful tool that allows you to profile and obtain
|
||||
information about the performance of your application.
|
||||
``perf`` also has a very vibrant ecosystem of tools
|
||||
that aid with the analysis of the data that it produces.
|
||||
|
||||
The main problem with using the ``perf`` profiler with Python applications is that
|
||||
``perf`` only provides information about native symbols, that is, the names of
|
||||
|
@ -25,7 +26,7 @@ fly before the execution of every Python function and it will teach ``perf`` the
|
|||
relationship between this piece of code and the associated Python function using
|
||||
`perf map files`_.
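As a rough illustration of what a perf map file holds: each line conventionally carries a start address and a size, both in hexadecimal, followed by the symbol name that ``perf`` should display. The sketch below parses text in that layout. The helper name ``parse_perf_map`` and the sample entries are hypothetical, and the ``py::`` symbol spelling is only an assumption about how Python entries might look.

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    start: int   # start address of the generated code
    size: int    # length of the code region in bytes
    symbol: str  # name that perf will display for this region


def parse_perf_map(text: str) -> list[MapEntry]:
    """Parse 'START SIZE symbol' lines, with START and SIZE in hexadecimal."""
    entries = []
    for line in text.splitlines():
        parts = line.split(maxsplit=2)  # symbol names may contain spaces
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        start, size, symbol = parts
        entries.append(MapEntry(int(start, 16), int(size, 16), symbol.strip()))
    return entries


# Hypothetical sample content, not taken from a real profiling run.
SAMPLE = """\
7f0a1c000000 2c py::foo:/tmp/my_script.py
7f0a1c00002c 2c py::bar:/tmp/my_script.py
"""

entries = parse_perf_map(SAMPLE)
```

On a real system the text would come from a file such as ``/tmp/perf-<pid>.map`` written while the process runs.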
|
||||
|
||||
.. warning::
|
||||
.. note::
|
||||
|
||||
Support for the ``perf`` profiler is only currently available for Linux on
|
||||
selected architectures. Check the output of the configure build step or
|
||||
|
@ -51,11 +52,11 @@ For example, consider the following script:
|
|||
if __name__ == "__main__":
|
||||
baz(1000000)
|
||||
|
||||
We can run perf to sample CPU stack traces at 9999 Hertz:
|
||||
We can run ``perf`` to sample CPU stack traces at 9999 Hertz::
|
||||
|
||||
$ perf record -F 9999 -g -o perf.data python my_script.py
|
||||
|
||||
Then we can use perf report to analyze the data:
|
||||
Then we can use ``perf report`` to analyze the data:
|
||||
|
||||
.. code-block:: shell-session
|
||||
|
||||
|
@ -101,7 +102,7 @@ As you can see here, the Python functions are not shown in the output, only ``_P
|
|||
functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
|
||||
bytecode-evaluating function.
|
||||
|
||||
Instead, if we run the same experiment with perf support activated we get:
|
||||
Instead, if we run the same experiment with ``perf`` support enabled we get:
|
||||
|
||||
.. code-block:: shell-session
|
||||
|
||||
|
@ -147,52 +148,58 @@ Instead, if we run the same experiment with perf support activated we get:
|
|||
|
||||
|
||||
|
||||
Enabling perf profiling mode
|
||||
----------------------------
|
||||
How to enable ``perf`` profiling support
|
||||
----------------------------------------
|
||||
|
||||
There are two main ways to activate the perf profiling mode. If you want it to be
|
||||
active since the start of the Python interpreter, you can use the ``-Xperf`` option:
|
||||
``perf`` profiling support can either be enabled from the start using
|
||||
the environment variable :envvar:`PYTHONPERFSUPPORT` or the
|
||||
:option:`-X perf <-X>` option,
|
||||
or dynamically using :func:`sys.activate_stack_trampoline` and
|
||||
:func:`sys.deactivate_stack_trampoline`.
|
||||
|
||||
$ python -Xperf my_script.py
|
||||
The :mod:`!sys` functions take precedence over the :option:`!-X` option,
|
||||
the :option:`!-X` option takes precedence over the environment variable.
|
||||
|
||||
You can also set the :envvar:`PYTHONPERFSUPPORT` to a nonzero value to actiavate perf
|
||||
profiling mode globally.
|
||||
Example, using the environment variable::
|
||||
|
||||
There is also support for dynamically activating and deactivating the perf
|
||||
profiling mode by using the APIs in the :mod:`sys` module:
|
||||
$ export PYTHONPERFSUPPORT=1
|
||||
$ perf record -F 9999 -g -o perf.data python script.py
|
||||
$ perf report -g -i perf.data
|
||||
|
||||
Example, using the :option:`!-X` option::
|
||||
|
||||
$ perf record -F 9999 -g -o perf.data python -X perf script.py
|
||||
$ perf report -g -i perf.data
|
||||
|
||||
Example, using the :mod:`sys` APIs in file :file:`example.py`:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
import sys
|
||||
sys.activate_stack_trampoline("perf")
|
||||
import sys
|
||||
|
||||
# Run some code with Perf profiling active
|
||||
sys.activate_stack_trampoline("perf")
|
||||
do_profiled_stuff()
|
||||
sys.deactivate_stack_trampoline()
|
||||
|
||||
sys.deactivate_stack_trampoline()
|
||||
non_profiled_stuff()
|
||||
|
||||
# Perf profiling is not active anymore
|
||||
...then::
|
||||
|
||||
These APIs can be handy if you want to activate/deactivate profiling mode in
|
||||
response to a signal or other communication mechanism with your process.
|
||||
|
||||
|
||||
|
||||
Now we can analyze the data with ``perf report``:
|
||||
|
||||
$ perf report -g -i perf.data
|
||||
$ perf record -F 9999 -g -o perf.data python ./example.py
|
||||
$ perf report -g -i perf.data
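The activate/deactivate pairing from :file:`example.py` can also be wrapped so that deactivation happens even if the profiled code raises. This is a minimal sketch, not part of the commit: the ``stack_trampoline`` context manager is a hypothetical helper, and the exception types it catches are an assumption about how activation fails on builds or platforms without ``perf`` support.

```python
import sys
from contextlib import contextmanager


@contextmanager
def stack_trampoline(backend="perf"):
    """Activate the stack trampoline for a block; yield True if it is active."""
    activate = getattr(sys, "activate_stack_trampoline", None)
    activated = False
    if activate is not None:
        try:
            activate(backend)
            activated = True
        except (ValueError, RuntimeError, OSError):
            # Assumed failure modes on unsupported builds or platforms.
            activated = False
    try:
        yield activated
    finally:
        # Only undo what this helper turned on.
        if activated:
            sys.deactivate_stack_trampoline()


def do_profiled_stuff():
    # Stand-in workload for the code you want perf to attribute.
    return sum(i * i for i in range(1000))


with stack_trampoline() as active:
    result = do_profiled_stuff()
```

Note that the helper always deactivates on exit, so it would also switch off support that was enabled earlier through the environment variable or the command line.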
|
||||
|
||||
|
||||
How to obtain the best results
|
||||
-------------------------------
|
||||
------------------------------
|
||||
|
||||
For the best results, Python should be compiled with
|
||||
``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
|
||||
profilers to unwind using only the frame pointer, without relying on DWARF debug
|
||||
information. This is because as the code that is interposed to allow perf
|
||||
information. This is because the code that is interposed to allow ``perf``
|
||||
support is dynamically generated, so it doesn't have any DWARF debugging information
|
||||
available.
|
||||
|
||||
You can check if you system has been compiled with this flag by running:
|
||||
You can check if your system has been compiled with this flag by running::
|
||||
|
||||
$ python -m sysconfig | grep 'no-omit-frame-pointer'
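The same check can be approximated from inside Python. The sketch below scans the build configuration variables reported by :mod:`sysconfig` for the flag. The helper name ``frame_pointer_settings`` is hypothetical, and an empty result only means that no recorded variable mentions the flag on this build.

```python
import sysconfig


def frame_pointer_settings(needle="no-omit-frame-pointer"):
    """Return the build configuration variables whose value mentions *needle*."""
    return {
        name: value
        for name, value in sysconfig.get_config_vars().items()
        if isinstance(value, str) and needle in value
    }


matches = frame_pointer_settings()
for name in sorted(matches):
    print(f"{name} = {matches[name]}")
if not matches:
    print("no build variable mentions the frame-pointer flags")
```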
|
||||
|
||||
|
|
|
@ -1555,6 +1555,38 @@ always available.
|
|||
This function has been added on a provisional basis (see :pep:`411`
|
||||
for details.) Use it only for debugging purposes.
|
||||
|
||||
.. function:: activate_stack_trampoline(backend, /)
|
||||
|
||||
Activate the stack profiler trampoline *backend*.
|
||||
The only supported backend is ``"perf"``.
|
||||
|
||||
.. availability:: Linux.
|
||||
|
||||
.. versionadded:: 3.12
|
||||
|
||||
.. seealso::
|
||||
|
||||
* :ref:`perf_profiling`
|
||||
* https://perf.wiki.kernel.org
|
||||
|
||||
.. function:: deactivate_stack_trampoline()
|
||||
|
||||
Deactivate the current stack profiler trampoline backend.
|
||||
|
||||
If no stack profiler is activated, this function has no effect.
|
||||
|
||||
.. availability:: Linux.
|
||||
|
||||
.. versionadded:: 3.12
|
||||
|
||||
.. function:: is_stack_trampoline_active()
|
||||
|
||||
Return ``True`` if a stack profiler trampoline is active.
|
||||
|
||||
.. availability:: Linux.
|
||||
|
||||
.. versionadded:: 3.12
|
||||
|
||||
.. function:: _enablelegacywindowsfsencoding()
|
||||
|
||||
Changes the :term:`filesystem encoding and error handler` to 'mbcs' and
|
||||
|
|
|
@ -538,12 +538,11 @@ Miscellaneous options
|
|||
development (running from the source tree) then the default is "off".
|
||||
Note that the "importlib_bootstrap" and "importlib_bootstrap_external"
|
||||
frozen modules are always used, even if this flag is set to "off".
|
||||
* ``-X perf`` to activate compatibility mode with the ``perf`` profiler.
|
||||
When this option is activated, the Linux ``perf`` profiler will be able to
|
||||
* ``-X perf`` enables support for the Linux ``perf`` profiler.
|
||||
When this option is provided, the ``perf`` profiler will be able to
|
||||
report Python calls. This option is only available on some platforms and
|
||||
will do nothing if it is not supported on the current system. The default value
|
||||
is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`
|
||||
for more information.
|
||||
is "off". See also :envvar:`PYTHONPERFSUPPORT` and :ref:`perf_profiling`.
|
||||
|
||||
It also allows passing arbitrary values and retrieving them through the
|
||||
:data:`sys._xoptions` dictionary.
|
||||
|
@ -1048,9 +1047,13 @@ conflict.
|
|||
|
||||
.. envvar:: PYTHONPERFSUPPORT
|
||||
|
||||
If this variable is set to a nonzero value, it activates compatibility mode
|
||||
with the ``perf`` profiler so Python calls can be detected by it. See the
|
||||
:ref:`perf_profiling` section for more information.
|
||||
If this variable is set to a nonzero value, it enables support for
|
||||
the Linux ``perf`` profiler so Python calls can be detected by it.
|
||||
|
||||
If set to ``0``, disable Linux ``perf`` profiler support.
|
||||
|
||||
See also the :option:`-X perf <-X>` command-line option
|
||||
and :ref:`perf_profiling`.
|
||||
|
||||
.. versionadded:: 3.12
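When it is unclear which of the three mechanisms is in effect, it may help to print all of them side by side. This is a small sketch under stated assumptions: ``perf_support_settings`` is a hypothetical helper, and it relies on ``-X perf`` appearing under the key ``"perf"`` in :data:`sys._xoptions`.

```python
import os
import sys


def perf_support_settings():
    """Collect the three ways perf support can be requested or queried."""
    is_active = getattr(sys, "is_stack_trampoline_active", None)
    xoptions = getattr(sys, "_xoptions", {})
    return {
        # Raw value of the environment variable, or None if unset.
        "PYTHONPERFSUPPORT": os.environ.get("PYTHONPERFSUPPORT"),
        # Value recorded for -X perf, or None if the option was not given.
        "-X perf": xoptions.get("perf"),
        # Current runtime state, or None on interpreters without the API.
        "trampoline_active": is_active() if is_active is not None else None,
    }


settings = perf_support_settings()
for key, value in settings.items():
    print(f"{key}: {value!r}")
```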
|
||||
|
||||
|
|
|
@ -74,6 +74,15 @@ Important deprecations, removals or restrictions:
|
|||
New Features
|
||||
============
|
||||
|
||||
* Add :ref:`perf_profiling` through the new
|
||||
environment variable :envvar:`PYTHONPERFSUPPORT`,
|
||||
the new command-line option :option:`-X perf <-X>`,
|
||||
as well as the new :func:`sys.activate_stack_trampoline`,
|
||||
:func:`sys.deactivate_stack_trampoline`,
|
||||
and :func:`sys.is_stack_trampoline_active` APIs.
|
||||
(Design by Pablo Galindo. Contributed by Pablo Galindo and Christian Heimes
|
||||
with contributions from Gregory P. Smith [Google] and Mark Shannon
|
||||
in :gh:`96123`.)
|
||||
|
||||
|
||||
Other Language Changes
|
||||
|
@ -194,6 +203,19 @@ tempfile
|
|||
The :class:`tempfile.NamedTemporaryFile` function has a new optional parameter
|
||||
*delete_on_close* (Contributed by Evgeny Zorin in :gh:`58451`.)
|
||||
|
||||
sys
|
||||
---
|
||||
|
||||
* Add :func:`sys.activate_stack_trampoline` and
|
||||
:func:`sys.deactivate_stack_trampoline` for activating and deactivating
|
||||
stack profiler trampolines,
|
||||
and :func:`sys.is_stack_trampoline_active` for querying if stack profiler
|
||||
trampolines are active.
|
||||
(Contributed by Pablo Galindo and Christian Heimes
|
||||
with contributions from Gregory P. Smith [Google] and Mark Shannon
|
||||
in :gh:`96123`.)
|
||||
|
||||
|
||||
Optimizations
|
||||
=============
|
||||
|
||||
|
|
10
Python/clinic/sysmodule.c.h
generated
|
@ -1231,7 +1231,7 @@ PyDoc_STRVAR(sys_activate_stack_trampoline__doc__,
|
|||
"activate_stack_trampoline($module, backend, /)\n"
|
||||
"--\n"
|
||||
"\n"
|
||||
"Activate the perf profiler trampoline.");
|
||||
"Activate stack profiler trampoline *backend*.");
|
||||
|
||||
#define SYS_ACTIVATE_STACK_TRAMPOLINE_METHODDEF \
|
||||
{"activate_stack_trampoline", (PyCFunction)sys_activate_stack_trampoline, METH_O, sys_activate_stack_trampoline__doc__},
|
||||
|
@ -1268,7 +1268,9 @@ PyDoc_STRVAR(sys_deactivate_stack_trampoline__doc__,
|
|||
"deactivate_stack_trampoline($module, /)\n"
|
||||
"--\n"
|
||||
"\n"
|
||||
"Dectivate the perf profiler trampoline.");
|
||||
"Deactivate the current stack profiler trampoline backend.\n"
|
||||
"\n"
|
||||
"If no stack profiler is activated, this function has no effect.");
|
||||
|
||||
#define SYS_DEACTIVATE_STACK_TRAMPOLINE_METHODDEF \
|
||||
{"deactivate_stack_trampoline", (PyCFunction)sys_deactivate_stack_trampoline, METH_NOARGS, sys_deactivate_stack_trampoline__doc__},
|
||||
|
@ -1286,7 +1288,7 @@ PyDoc_STRVAR(sys_is_stack_trampoline_active__doc__,
|
|||
"is_stack_trampoline_active($module, /)\n"
|
||||
"--\n"
|
||||
"\n"
|
||||
"Returns *True* if the perf profiler trampoline is active.");
|
||||
"Return *True* if a stack profiler trampoline is active.");
|
||||
|
||||
#define SYS_IS_STACK_TRAMPOLINE_ACTIVE_METHODDEF \
|
||||
{"is_stack_trampoline_active", (PyCFunction)sys_is_stack_trampoline_active, METH_NOARGS, sys_is_stack_trampoline_active__doc__},
|
||||
|
@ -1343,4 +1345,4 @@ sys_is_stack_trampoline_active(PyObject *module, PyObject *Py_UNUSED(ignored))
|
|||
#ifndef SYS_GETANDROIDAPILEVEL_METHODDEF
|
||||
#define SYS_GETANDROIDAPILEVEL_METHODDEF
|
||||
#endif /* !defined(SYS_GETANDROIDAPILEVEL_METHODDEF) */
|
||||
/*[clinic end generated code: output=15318cdd96b62b06 input=a9049054013a1b77]*/
|
||||
/*[clinic end generated code: output=2b5e1bc24a3348bd input=a9049054013a1b77]*/
|
||||
|
|
|
@ -2127,12 +2127,12 @@ sys.activate_stack_trampoline
|
|||
backend: str
|
||||
/
|
||||
|
||||
Activate the perf profiler trampoline.
|
||||
Activate stack profiler trampoline *backend*.
|
||||
[clinic start generated code]*/
|
||||
|
||||
static PyObject *
|
||||
sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
|
||||
/*[clinic end generated code: output=5783cdeb51874b43 input=b09020e3a17c78c5]*/
|
||||
/*[clinic end generated code: output=5783cdeb51874b43 input=a12df928758a82b4]*/
|
||||
{
|
||||
#ifdef PY_HAVE_PERF_TRAMPOLINE
|
||||
if (strcmp(backend, "perf") == 0) {
|
||||
|
@ -2163,12 +2163,14 @@ sys_activate_stack_trampoline_impl(PyObject *module, const char *backend)
|
|||
/*[clinic input]
|
||||
sys.deactivate_stack_trampoline
|
||||
|
||||
Dectivate the perf profiler trampoline.
|
||||
Deactivate the current stack profiler trampoline backend.
|
||||
|
||||
If no stack profiler is activated, this function has no effect.
|
||||
[clinic start generated code]*/
|
||||
|
||||
static PyObject *
|
||||
sys_deactivate_stack_trampoline_impl(PyObject *module)
|
||||
/*[clinic end generated code: output=b50da25465df0ef1 input=491f4fc1ed615736]*/
|
||||
/*[clinic end generated code: output=b50da25465df0ef1 input=9f629a6be9fe7fc8]*/
|
||||
{
|
||||
if (_PyPerfTrampoline_Init(0) < 0) {
|
||||
return NULL;
|
||||
|
@ -2179,12 +2181,12 @@ sys_deactivate_stack_trampoline_impl(PyObject *module)
|
|||
/*[clinic input]
|
||||
sys.is_stack_trampoline_active
|
||||
|
||||
Returns *True* if the perf profiler trampoline is active.
|
||||
Return *True* if a stack profiler trampoline is active.
|
||||
[clinic start generated code]*/
|
||||
|
||||
static PyObject *
|
||||
sys_is_stack_trampoline_active_impl(PyObject *module)
|
||||
/*[clinic end generated code: output=ab2746de0ad9d293 input=061fa5776ac9dd59]*/
|
||||
/*[clinic end generated code: output=ab2746de0ad9d293 input=29616b7bf6a0b703]*/
|
||||
{
|
||||
#ifdef PY_HAVE_PERF_TRAMPOLINE
|
||||
if (_PyIsPerfTrampolineActive()) {
|
||||
|
|