gh-129987: Selectively re-enable SLP autovectorization of _PyEval_EvalFrameDefault (#132530)

Only disable SLP autovectorization of `_PyEval_EvalFrameDefault` on newer
GCCs (GCC 10 and later). The optimization bug seems to exist only on GCC 12
and later, while before GCC 9 disabling the optimization has a dramatic
performance impact.
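
For illustration only, a minimal sketch of the version-gated pattern this
change relies on. It is not the CPython source: `sum_pairs` and
`SLP_DISABLED` are hypothetical names, and the empty fallback definition in
the `#else` branch is an assumption about how such a macro is usually
completed.

```c
/* Sketch: attach the attribute only on GCC >= 10 (not clang) on x86-64,
 * mirroring the condition used in this change. Elsewhere the macro is
 * assumed to expand to nothing. */
#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
#  define DONT_SLP_VECTORIZE __attribute__((optimize("no-tree-slp-vectorize")))
#  define SLP_DISABLED 1   /* hypothetical flag, for inspection only */
#else
#  define DONT_SLP_VECTORIZE
#  define SLP_DISABLED 0
#endif

/* Hypothetical stand-in for a hot function; the attribute changes how GCC
 * optimizes this one function, not what it computes. */
DONT_SLP_VECTORIZE static int
sum_pairs(const int *a, const int *b, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        total += a[i] + b[i];
    }
    return total;
}
```

Because the attribute is applied per function, the rest of the translation
unit keeps whatever optimization flags the build passes on the command line.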
T. Wouters 2025-04-15 11:39:32 +02:00 committed by GitHub
parent 0879ebc953
commit c66ffcf8e3

@@ -948,11 +948,15 @@ _PyObjectArray_Free(PyObject **array, PyObject **scratch)
 #include "generated_cases.c.h"
 #endif
-#if (defined(__GNUC__) && !defined(__clang__)) && defined(__x86_64__)
+#if (defined(__GNUC__) && __GNUC__ >= 10 && !defined(__clang__)) && defined(__x86_64__)
 /*
- * gh-129987: The SLP autovectorizer can cause poor code generation for opcode
- * dispatch, negating any benefit we get from vectorization elsewhere in the
- * interpreter loop.
+ * gh-129987: The SLP autovectorizer can cause poor code generation for
+ * opcode dispatch in some GCC versions (observed in GCCs 12 through 15,
+ * probably caused by https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115777),
+ * negating any benefit we get from vectorization elsewhere in the
+ * interpreter loop. Disabling it significantly affected older GCC versions
+ * (prior to GCC 9, 40% performance drop), so we have to selectively disable
+ * it.
  */
 #define DONT_SLP_VECTORIZE __attribute__((optimize ("no-tree-slp-vectorize")))
 #else