mirror of
https://github.com/python/cpython.git
synced 2025-11-02 03:01:58 +00:00
gh-96143: Improve perf profiler docs (#96445)
parent
22863df7ca
commit
723ebe76e7
6 changed files with 118 additions and 50 deletions
@@ -8,10 +8,11 @@ Python support for the Linux ``perf`` profiler

:author: Pablo Galindo

The Linux ``perf`` profiler is a very powerful tool that allows you to profile and
obtain information about the performance of your application. ``perf`` also has
a very vibrant ecosystem of tools that aid with the analysis of the data that it
produces.
`The Linux perf profiler <https://perf.wiki.kernel.org>`_
is a very powerful tool that allows you to profile and obtain
information about the performance of your application.
``perf`` also has a very vibrant ecosystem of tools
that aid with the analysis of the data that it produces.

The main problem with using the ``perf`` profiler with Python applications is that
``perf`` only allows to get information about native symbols, this is, the names of
@@ -25,7 +26,7 @@ fly before the execution of every Python function and it will teach ``perf`` the
relationship between this piece of code and the associated Python function using
`perf map files`_.

.. warning::
.. note::

Support for the ``perf`` profiler is only currently available for Linux on
selected architectures. Check the output of the configure build step or
@@ -51,11 +52,11 @@ For example, consider the following script:
if __name__ == "__main__":
baz(1000000)

We can run perf to sample CPU stack traces at 9999 Hertz:
We can run ``perf`` to sample CPU stack traces at 9999 Hertz::

$ perf record -F 9999 -g -o perf.data python my_script.py

Then we can use perf report to analyze the data:
Then we can use ``perf`` report to analyze the data:

.. code-block:: shell-session
@@ -101,7 +102,7 @@ As you can see here, the Python functions are not shown in the output, only ``_P
functions use the same C function to evaluate bytecode so we cannot know which Python function corresponds to which
bytecode-evaluating function.

Instead, if we run the same experiment with perf support activated we get:
Instead, if we run the same experiment with ``perf`` support enabled we get:

.. code-block:: shell-session
@@ -147,52 +148,58 @@ Instead, if we run the same experiment with perf support activated we get:



Enabling perf profiling mode
----------------------------
How to enable ``perf`` profiling support
----------------------------------------

There are two main ways to activate the perf profiling mode. If you want it to be
active since the start of the Python interpreter, you can use the ``-Xperf`` option:
``perf`` profiling support can either be enabled from the start using
the environment variable :envvar:`PYTHONPERFSUPPORT` or the
:option:`-X perf <-X>` option,
or dynamically using :func:`sys.activate_stack_trampoline` and
:func:`sys.deactivate_stack_trampoline`.

$ python -Xperf my_script.py
The :mod:`!sys` functions take precedence over the :option:`!-X` option,
the :option:`!-X` option takes precedence over the environment variable.

You can also set the :envvar:`PYTHONPERFSUPPORT` to a nonzero value to actiavate perf
profiling mode globally.
Example, using the environment variable::

There is also support for dynamically activating and deactivating the perf
profiling mode by using the APIs in the :mod:`sys` module:
$ PYTHONPERFSUPPORT=1
$ python script.py
$ perf report -g -i perf.data

Example, using the :option:`!-X` option::

$ python -X perf script.py
$ perf report -g -i perf.data

Example, using the :mod:`sys` APIs in file :file:`example.py`:

.. code-block:: python

import sys
sys.activate_stack_trampoline("perf")
import sys

# Run some code with Perf profiling active
sys.activate_stack_trampoline("perf")
do_profiled_stuff()
sys.deactivate_stack_trampoline()

sys.deactivate_stack_trampoline()
non_profiled_stuff()

# Perf profiling is not active anymore
...then::

These APIs can be handy if you want to activate/deactivate profiling mode in
response to a signal or other communication mechanism with your process.
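
As a minimal sketch of the signal-driven use mentioned above, the hypothetical
helper ``toggle_perf_trampoline`` below switches the trampoline on or off each
time it is called. It assumes Python 3.12 or later, where
``sys.activate_stack_trampoline``, ``sys.deactivate_stack_trampoline`` and
``sys.is_stack_trampoline_active`` are available, and it assumes that a
``ValueError`` from activation means the build has no ``perf`` support. The
helper names and the choice of ``SIGUSR1`` are illustrative, not part of the
documented API.

```python
import signal
import sys


def toggle_perf_trampoline(signum=None, frame=None):
    """Flip the perf trampoline state and return the new state.

    The (signum, frame) parameters let this double as a signal handler.
    Returns False when the interpreter has no trampoline support.
    """
    if not hasattr(sys, "activate_stack_trampoline"):
        # Older interpreter: the sys APIs do not exist at all.
        return False
    if sys.is_stack_trampoline_active():
        sys.deactivate_stack_trampoline()
        return False
    try:
        sys.activate_stack_trampoline("perf")
    except ValueError:
        # Assumption: raised when this build or platform lacks perf support.
        return False
    return True


def install_toggle_handler():
    """Hypothetical wiring: toggle on SIGUSR1 (POSIX only, main thread only)."""
    signal.signal(signal.SIGUSR1, toggle_perf_trampoline)
```

After calling ``install_toggle_handler()`` early in the program, sending
``SIGUSR1`` to the process (for example with ``kill -USR1 <pid>``) would turn
the trampoline on, and sending it again would turn it off.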



Now we can analyze the data with ``perf report``:

$ perf report -g -i perf.data
$ python ./example.py
$ perf report -g -i perf.data


How to obtain the best results
-------------------------------
------------------------------

For the best results, Python should be compiled with
``CFLAGS="-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer"`` as this allows
profilers to unwind using only the frame pointer and not on DWARF debug
information. This is because as the code that is interposed to allow perf
information. This is because as the code that is interposed to allow ``perf``
support is dynamically generated it doesn't have any DWARF debugging information
available.

You can check if you system has been compiled with this flag by running:
You can check if your system has been compiled with this flag by running::

$ python -m sysconfig | grep 'no-omit-frame-pointer'
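
The same check can be sketched from inside Python. The helper name
``vars_with_flag`` is hypothetical. It scans the build-time variables returned
by ``sysconfig.get_config_vars()`` for the frame-pointer flag, which
approximates the ``grep`` above but is not an exact equivalent, since
``python -m sysconfig`` also prints paths and other information.

```python
import sysconfig

FLAG = "-fno-omit-frame-pointer"


def vars_with_flag(config_vars, flag=FLAG):
    """Return the sorted names of config variables whose value mentions flag."""
    return sorted(
        name
        for name, value in config_vars.items()
        # Many config values are integers; only string values can hold flags.
        if isinstance(value, str) and flag in value
    )


hits = vars_with_flag(sysconfig.get_config_vars())
if hits:
    print("Frame pointers requested in:", ", ".join(hits))
else:
    print("No build variable mentions", FLAG)
```

An empty result suggests the interpreter was built without the flag, although
a distribution may also inject compiler flags in ways that ``sysconfig`` does
not record.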