mirror of
https://github.com/python/cpython.git
Various refinements to the NormalDist examples and recipes (GH-12272)
parent
491ef53c15
commit
cc353a0cd9
1 changed files with 26 additions and 23 deletions
|
@@ -510,10 +510,9 @@ of applications in statistics.
|
||||||
|
|
||||||
.. classmethod:: NormalDist.from_samples(data)
|
.. classmethod:: NormalDist.from_samples(data)
|
||||||
|
|
||||||
Class method that makes a normal distribution instance
|
Makes a normal distribution instance computed from sample data. The
|
||||||
from sample data. The *data* can be any :term:`iterable`
|
*data* can be any :term:`iterable` and should consist of values that
|
||||||
and should consist of values that can be converted to type
|
can be converted to type :class:`float`.
|
||||||
:class:`float`.
|
|
||||||
|
|
||||||
If *data* does not contain at least two elements, raises
|
If *data* does not contain at least two elements, raises
|
||||||
:exc:`StatisticsError` because it takes at least one point to estimate
|
:exc:`StatisticsError` because it takes at least one point to estimate
|
||||||
|
@@ -536,11 +535,10 @@ of applications in statistics.
|
||||||
the given value *x*. Mathematically, it is the ratio ``P(x <= X <
|
the given value *x*. Mathematically, it is the ratio ``P(x <= X <
|
||||||
x+dx) / dx``.
|
x+dx) / dx``.
|
||||||
|
|
||||||
Note the relative likelihood of *x* can be greater than `1.0`. The
|
The relative likelihood is computed as the probability of a sample
|
||||||
probability for a specific point on a continuous distribution is `0.0`,
|
occurring in a narrow range divided by the width of the range (hence
|
||||||
so the :func:`pdf` is used instead. It gives the probability of a
|
the word "density"). Since the likelihood is relative to other points,
|
||||||
sample occurring in a narrow range around *x* and then dividing that
|
its value can be greater than `1.0`.
|
||||||
probability by the width of the range (hence the word "density").
|
|
||||||
|
|
||||||
.. method:: NormalDist.cdf(x)
|
.. method:: NormalDist.cdf(x)
|
||||||
|
|
||||||
|
@@ -568,7 +566,8 @@ of applications in statistics.
|
||||||
>>> temperature_february * (9/5) + 32 # Fahrenheit
|
>>> temperature_february * (9/5) + 32 # Fahrenheit
|
||||||
NormalDist(mu=41.0, sigma=4.5)
|
NormalDist(mu=41.0, sigma=4.5)
|
||||||
|
|
||||||
Dividing a constant by an instance of :class:`NormalDist` is not supported.
|
Dividing a constant by an instance of :class:`NormalDist` is not supported
|
||||||
|
because the result wouldn't be normally distributed.
|
||||||
|
|
||||||
Since normal distributions arise from additive effects of independent
|
Since normal distributions arise from additive effects of independent
|
||||||
variables, it is possible to `add and subtract two independent normally
|
variables, it is possible to `add and subtract two independent normally
|
||||||
|
@@ -581,8 +580,10 @@ of applications in statistics.
|
||||||
>>> birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5])
|
>>> birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5])
|
||||||
>>> drug_effects = NormalDist(0.4, 0.15)
|
>>> drug_effects = NormalDist(0.4, 0.15)
|
||||||
>>> combined = birth_weights + drug_effects
|
>>> combined = birth_weights + drug_effects
|
||||||
>>> f'mean: {combined.mean :.1f} standard deviation: {combined.stdev :.1f}'
|
>>> round(combined.mean, 1)
|
||||||
'mean: 3.1 standard deviation: 0.5'
|
3.1
|
||||||
|
>>> round(combined.stdev, 1)
|
||||||
|
0.5
|
||||||
|
|
||||||
.. versionadded:: 3.8
|
.. versionadded:: 3.8
|
||||||
|
|
||||||
|
@@ -595,14 +596,15 @@ of applications in statistics.
|
||||||
For example, given `historical data for SAT exams
|
For example, given `historical data for SAT exams
|
||||||
<https://blog.prepscholar.com/sat-standard-deviation>`_ showing that scores
|
<https://blog.prepscholar.com/sat-standard-deviation>`_ showing that scores
|
||||||
are normally distributed with a mean of 1060 and a standard deviation of 195,
|
are normally distributed with a mean of 1060 and a standard deviation of 195,
|
||||||
determine the percentage of students with scores between 1100 and 1200:
|
determine the percentage of students with scores between 1100 and 1200, after
|
||||||
|
rounding to the nearest whole number:
|
||||||
|
|
||||||
.. doctest::
|
.. doctest::
|
||||||
|
|
||||||
>>> sat = NormalDist(1060, 195)
|
>>> sat = NormalDist(1060, 195)
|
||||||
>>> fraction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)
|
>>> fraction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)
|
||||||
>>> f'{fraction * 100 :.1f}% score between 1100 and 1200'
|
>>> round(fraction * 100.0, 1)
|
||||||
'18.4% score between 1100 and 1200'
|
18.4
|
||||||
|
|
||||||
What percentage of men and women will have the same height in `two normally
|
What percentage of men and women will have the same height in `two normally
|
||||||
distributed populations with known means and standard deviations
|
distributed populations with known means and standard deviations
|
||||||
|
@@ -616,18 +618,19 @@ distributed populations with known means and standard deviations
|
||||||
|
|
||||||
To estimate the distribution for a model that isn't easy to solve
|
To estimate the distribution for a model that isn't easy to solve
|
||||||
analytically, :class:`NormalDist` can generate input samples for a `Monte
|
analytically, :class:`NormalDist` can generate input samples for a `Monte
|
||||||
Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_ of the
|
Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
|
||||||
model:
|
|
||||||
|
|
||||||
.. doctest::
|
.. doctest::
|
||||||
|
|
||||||
|
>>> def model(x, y, z):
|
||||||
|
... return (3*x + 7*x*y - 5*y) / (11 * z)
|
||||||
|
...
|
||||||
>>> n = 100_000
|
>>> n = 100_000
|
||||||
>>> X = NormalDist(350, 15).samples(n)
|
>>> X = NormalDist(10, 2.5).samples(n)
|
||||||
>>> Y = NormalDist(47, 17).samples(n)
|
>>> Y = NormalDist(15, 1.75).samples(n)
|
||||||
>>> Z = NormalDist(62, 6).samples(n)
|
>>> Z = NormalDist(5, 1.25).samples(n)
|
||||||
>>> model_simulation = [x * y / z for x, y, z in zip(X, Y, Z)]
|
>>> NormalDist.from_samples(map(model, X, Y, Z)) # doctest: +SKIP
|
||||||
>>> NormalDist.from_samples(model_simulation) # doctest: +SKIP
|
NormalDist(mu=19.640137307085507, sigma=47.03273142191088)
|
||||||
NormalDist(mu=267.6516398754636, sigma=101.357284306067)
|
|
||||||
|
|
||||||
Normal distributions commonly arise in machine learning problems.
|
Normal distributions commonly arise in machine learning problems.
|
||||||
|
|
||||||
|
|
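The following is a minimal sketch, not part of the commit, that checks three points made in the changed text. It assumes Python 3.8 or later, where `statistics.NormalDist` is available. The names `narrow`, `dx`, `ratio`, and `division_supported` and the chosen parameter values are illustrative only. It shows that the density is the probability of a narrow range divided by the range's width, that the density can exceed 1.0, and that dividing a constant by a `NormalDist` is rejected.

```python
from statistics import NormalDist

# A tightly concentrated distribution: a small sigma gives a tall density peak.
narrow = NormalDist(mu=0.0, sigma=0.1)

x = 0.0
dx = 1e-6  # width of the narrow range (arbitrary illustrative value)

# Probability of a sample falling in [x, x + dx), divided by the width.
ratio = (narrow.cdf(x + dx) - narrow.cdf(x)) / dx

# The density at x; for this sigma it is well above 1.0.
density = narrow.pdf(x)

# The SAT example from the diff, using the values given in the doctest.
sat = NormalDist(1060, 195)
fraction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)
percent = round(fraction * 100.0, 1)

# Dividing a constant by a NormalDist is not supported.
try:
    1 / narrow
    division_supported = True
except TypeError:
    division_supported = False

print(ratio, density, percent, division_supported)
```

The ratio and the density agree closely because `dx` is small relative to `sigma`; a wider range would make the approximation coarser.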