mirror of
https://github.com/python/cpython.git
synced 2025-09-26 18:29:57 +00:00
gh-100833: Remove 'volatile' qualifiers in fsum algorithm (#100845)
This PR removes the `volatile` qualifier on various intermediate quantities in the `math.fsum` implementation, and updates the notes preceding the algorithm accordingly (as well as fixing some of the exsting notes). See the linked issue #100833 for discussion.
This commit is contained in:
parent
b139bcd892
commit
87d3bd0e02
2 changed files with 22 additions and 21 deletions
|
@ -0,0 +1 @@
|
||||||
|
Speed up :func:`math.fsum` by removing defensive ``volatile`` qualifiers.
|
|
@ -1358,30 +1358,30 @@ FUNC1(tanh, tanh, 0,
|
||||||
Dickinson's post at <http://bugs.python.org/file10357/msum4.py>.
|
Dickinson's post at <http://bugs.python.org/file10357/msum4.py>.
|
||||||
See those links for more details, proofs and other references.
|
See those links for more details, proofs and other references.
|
||||||
|
|
||||||
Note 1: IEEE 754R floating point semantics are assumed,
|
Note 1: IEEE 754 floating-point semantics with a rounding mode of
|
||||||
but the current implementation does not re-establish special
|
roundTiesToEven are assumed.
|
||||||
value semantics across iterations (i.e. handling -Inf + Inf).
|
|
||||||
|
|
||||||
Note 2: No provision is made for intermediate overflow handling;
|
Note 2: No provision is made for intermediate overflow handling;
|
||||||
therefore, sum([1e+308, 1e-308, 1e+308]) returns 1e+308 while
|
therefore, fsum([1e+308, -1e+308, 1e+308]) returns 1e+308 while
|
||||||
sum([1e+308, 1e+308, 1e-308]) raises an OverflowError due to the
|
fsum([1e+308, 1e+308, -1e+308]) raises an OverflowError due to the
|
||||||
overflow of the first partial sum.
|
overflow of the first partial sum.
|
||||||
|
|
||||||
Note 3: The intermediate values lo, yr, and hi are declared volatile so
|
Note 3: The algorithm has two potential sources of fragility. First, C
|
||||||
aggressive compilers won't algebraically reduce lo to always be exactly 0.0.
|
permits arithmetic operations on `double`s to be performed in an
|
||||||
Also, the volatile declaration forces the values to be stored in memory as
|
intermediate format whose range and precision may be greater than those of
|
||||||
regular doubles instead of extended long precision (80-bit) values. This
|
`double` (see for example C99 §5.2.4.2.2, paragraph 8). This can happen for
|
||||||
prevents double rounding because any addition or subtraction of two doubles
|
example on machines using the now largely historical x87 FPUs. In this case,
|
||||||
can be resolved exactly into double-sized hi and lo values. As long as the
|
`fsum` can produce incorrect results. If `FLT_EVAL_METHOD` is `0` or `1`, or
|
||||||
hi value gets forced into a double before yr and lo are computed, the extra
|
`FLT_EVAL_METHOD` is `2` and `long double` is identical to `double`, then we
|
||||||
bits in downstream extended precision operations (x87 for example) will be
|
should be safe from this source of errors. Second, an aggressively
|
||||||
exactly zero and therefore can be losslessly stored back into a double,
|
optimizing compiler can re-associate operations so that (for example) the
|
||||||
thereby preventing double rounding.
|
statement `yr = hi - x;` is treated as `yr = (x + y) - x` and then
|
||||||
|
re-associated as `yr = y + (x - x)`, giving `y = yr` and `lo = 0.0`. That
|
||||||
|
re-association would be in violation of the C standard, and should not occur
|
||||||
|
except possibly in the presence of unsafe optimizations (e.g., -ffast-math,
|
||||||
|
-fassociative-math). Such optimizations should be avoided for this module.
|
||||||
|
|
||||||
Note 4: A similar implementation is in Modules/cmathmodule.c.
|
Note 4: The signature of math.fsum() differs from builtins.sum()
|
||||||
Be sure to update both when making changes.
|
|
||||||
|
|
||||||
Note 5: The signature of math.fsum() differs from builtins.sum()
|
|
||||||
because the start argument doesn't make sense in the context of
|
because the start argument doesn't make sense in the context of
|
||||||
accurate summation. Since the partials table is collapsed before
|
accurate summation. Since the partials table is collapsed before
|
||||||
returning a result, sum(seq2, start=sum(seq1)) may not equal the
|
returning a result, sum(seq2, start=sum(seq1)) may not equal the
|
||||||
|
@ -1467,7 +1467,7 @@ math_fsum(PyObject *module, PyObject *seq)
|
||||||
Py_ssize_t i, j, n = 0, m = NUM_PARTIALS;
|
Py_ssize_t i, j, n = 0, m = NUM_PARTIALS;
|
||||||
double x, y, t, ps[NUM_PARTIALS], *p = ps;
|
double x, y, t, ps[NUM_PARTIALS], *p = ps;
|
||||||
double xsave, special_sum = 0.0, inf_sum = 0.0;
|
double xsave, special_sum = 0.0, inf_sum = 0.0;
|
||||||
volatile double hi, yr, lo;
|
double hi, yr, lo;
|
||||||
|
|
||||||
iter = PyObject_GetIter(seq);
|
iter = PyObject_GetIter(seq);
|
||||||
if (iter == NULL)
|
if (iter == NULL)
|
||||||
|
|
Loading…
Add table
Add a link
Reference in a new issue