mirror of https://github.com/python/cpython.git
synced 2025-09-26 18:29:57 +00:00
GH-77265: Document NaN handling in statistics functions that sort or count (GH-94676) (#94726)
parent e5c8ad3e15
commit f3212b1ec7
1 changed files with 29 additions and 0 deletions
@@ -35,6 +35,35 @@ and implementation-dependent. If your input data consists of mixed types,
you may be able to use :func:`map` to ensure a consistent result, for
example: ``map(float, input_data)``.

Some datasets use ``NaN`` (not a number) values to represent missing data.
Since NaNs have unusual comparison semantics, they cause surprising or
undefined behaviors in the statistics functions that sort data or that count
occurrences. The functions affected are ``median()``, ``median_low()``,
``median_high()``, ``median_grouped()``, ``mode()``, ``multimode()``, and
``quantiles()``. The ``NaN`` values should be stripped before calling these
functions::

    >>> from statistics import median
    >>> from math import isnan
    >>> from itertools import filterfalse

    >>> data = [20.7, float('NaN'),19.2, 18.3, float('NaN'), 14.4]
    >>> sorted(data)  # This has surprising behavior
    [20.7, nan, 14.4, 18.3, 19.2, nan]
    >>> median(data)  # This result is unexpected
    16.35

    >>> sum(map(isnan, data))  # Number of missing values
    2
    >>> clean = list(filterfalse(isnan, data))  # Strip NaN values
    >>> clean
    [20.7, 19.2, 18.3, 14.4]
    >>> sorted(clean)  # Sorting now works as expected
    [14.4, 18.3, 19.2, 20.7]
    >>> median(clean)  # This result is now well defined
    18.75


Averages and measures of central location
-----------------------------------------

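Reviewer's sketch (not part of the commit): the added paragraph lists several affected functions, but its doctest only exercises ``median()``. The snippet below is one way the same stripping step might be reused for ``quantiles()`` and ``multimode()``. The helper name ``strip_nans`` is hypothetical, and it assumes every value is a real number that ``math.isnan`` accepts.

```python
from itertools import filterfalse
from math import isnan
from statistics import median, multimode, quantiles


def strip_nans(data):
    """Hypothetical helper: return the values of *data* that are not NaN."""
    return list(filterfalse(isnan, data))


nan = float('NaN')
# The "unusual comparison semantics": NaN is not equal to itself, and
# every ordering comparison involving it is False.
comparisons = (nan == nan, nan < 1.0, nan > 1.0)

# Same sample data as the doctest in the diff above.
data = [20.7, float('NaN'), 19.2, 18.3, float('NaN'), 14.4]
clean = strip_nans(data)

clean_median = median(clean)             # 18.75, as in the doctest
clean_quartiles = quantiles(clean, n=4)  # three cut points
clean_modes = multimode(clean)           # all values tie, so all are returned
```

Stripping once and passing ``clean`` to each function keeps the sorting and counting steps inside the statistics module free of NaNs. Whether dropping missing values is appropriate for a given dataset is a separate modelling decision.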