mirror of https://github.com/python/cpython.git (synced 2025-09-25 17:59:57 +00:00)
[3.12] Add another example to the statistics docs (GH-107904) (#107941)
Add another example to the statistics docs (GH-107904)
(cherry picked from commit 2b6dc2accc)
Co-authored-by: Raymond Hettinger <rhettinger@users.noreply.github.com>
This commit is contained in:
parent ba8ab4e8ae
commit e8963a86ea
2 changed files with 57 additions and 0 deletions
Doc/library/kde_example.png (new binary file, 324 KiB, not shown)

Doc/library/statistics.rst:
@@ -922,6 +922,10 @@ of applications in statistics.

:class:`NormalDist` Examples and Recipes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Classic probability problems
****************************

:class:`NormalDist` readily solves classic probability problems.

For example, given `historical data for SAT exams
@@ -947,6 +951,10 @@ Find the `quartiles <https://en.wikipedia.org/wiki/Quartile>`_ and `deciles

>>> list(map(round, sat.quantiles(n=10)))
[810, 896, 958, 1011, 1060, 1109, 1162, 1224, 1310]

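The ``sat`` object in the doctest above is defined in context elided from
this hunk. A minimal sketch of the kind of construction involved (the mean
and standard deviation below are assumptions chosen so that they reproduce
the deciles shown above, not a quotation of the documented setup):

    from statistics import NormalDist

    sat = NormalDist(1060, 195)
    print([round(q) for q in sat.quantiles(n=10)])
    # -> [810, 896, 958, 1011, 1060, 1109, 1162, 1224, 1310]
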
Monte Carlo inputs for simulations
**********************************

To estimate the distribution for a model that isn't easy to solve
analytically, :class:`NormalDist` can generate input samples for a `Monte
Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:

@@ -963,6 +971,9 @@ Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:

>>> quantiles(map(model, X, Y, Z))        # doctest: +SKIP
[1.4591308524824727, 1.8035946855390597, 2.175091447274739]

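The names ``model``, ``X``, ``Y``, and ``Z`` above come from a testcode
block that is elided from this hunk. A minimal sketch of the kind of setup
such a simulation needs (the model and the distribution parameters below
are illustrative assumptions, not the documented values):

    from statistics import NormalDist, quantiles

    def model(x, y, z):
        # A hypothetical quantity that depends on three uncertain inputs.
        return (x + y) / z

    n = 100_000
    X = NormalDist(10, 2.5).samples(n)
    Y = NormalDist(15, 1.75).samples(n)
    Z = NormalDist(50, 1.25).samples(n)
    print(quantiles(map(model, X, Y, Z)))    # quartiles of the simulated outputs
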
Approximating binomial distributions
************************************

Normal distributions can be used to approximate `Binomial
distributions <https://mathworld.wolfram.com/BinomialDistribution.html>`_
when the sample size is large and when the probability of a successful

@@ -1000,6 +1011,10 @@ probability that the Python room will stay within its capacity limits?

>>> mean(trial() <= k for i in range(10_000))
0.8398

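The simulation above estimates the probability empirically. The same
quantity can also be approximated analytically with the normal
approximation to the binomial distribution; a minimal sketch, where ``n``,
``p``, and ``k`` stand in for the trial count, per-trial probability, and
capacity from the elided setup (the numbers below are placeholders, not the
documented values):

    from math import sqrt
    from statistics import NormalDist

    n, p, k = 100, 0.75, 84            # placeholder values
    # Normal approximation with a continuity correction:
    # P(X <= k) is roughly Phi((k + 0.5 - n*p) / sqrt(n*p*(1 - p)))
    binom_approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    print(binom_approx.cdf(k + 0.5))
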
Naive bayesian classifier
*************************

Normal distributions commonly arise in machine learning problems.

Wikipedia has a `nice example of a Naive Bayesian Classifier

@@ -1054,6 +1069,48 @@ The final prediction goes to the largest posterior. This is known as the

'female'

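The ``'female'`` result above is the tail end of the worked example elided
from this hunk. As a rough sketch of the technique being described (the
training samples, priors, and query value below are invented for
illustration, not taken from the documented example):

    from statistics import NormalDist

    # Per-class feature distributions fitted from hypothetical training data.
    height_male = NormalDist.from_samples([70.1, 72.4, 68.9, 71.3])
    height_female = NormalDist.from_samples([63.2, 65.0, 61.8, 64.1])

    prior_male = prior_female = 0.5     # assumed equal class priors
    ht = 66.0                           # hypothetical observation to classify

    posterior_male = prior_male * height_male.pdf(ht)
    posterior_female = prior_female * height_female.pdf(ht)

    # The final prediction goes to the largest posterior.
    print('male' if posterior_male > posterior_female else 'female')
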
Kernel density estimation
*************************

It is possible to estimate a continuous probability density function
from a fixed number of discrete samples.

The basic idea is to smooth the data using `a kernel function such as a
normal distribution, triangular distribution, or uniform distribution
<https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use>`_.
The degree of smoothing is controlled by a single
parameter, ``h``, representing the variance of the kernel function.

.. testcode::

    import math

    def kde_normal(sample, h):
        "Create a continuous probability density function from a sample."
        # Smooth the sample with a normal distribution of variance h.
        kernel_h = NormalDist(0.0, math.sqrt(h)).pdf
        n = len(sample)
        def pdf(x):
            return sum(kernel_h(x - x_i) for x_i in sample) / n
        return pdf

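A quick way to check that the recipe really returns a probability density
is to integrate it numerically; a minimal sanity-check sketch (not part of
the documented example; the sample and ``h`` values are arbitrary, and
``NormalDist`` must be imported when running outside the doc's test
environment):

    f = kde_normal([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2], h=2.25)
    width = 0.01
    area = sum(f(i * width) * width for i in range(-2000, 2001))
    print(round(area, 3))    # approximately 1.0
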
`Wikipedia has an example
<https://en.wikipedia.org/wiki/Kernel_density_estimation#Example>`_
where we can use the ``kde_normal()`` recipe to generate and plot
a probability density function estimated from a small sample:

.. doctest::

    >>> sample = [-2.1, -1.3, -0.4, 1.9, 5.1, 6.2]
    >>> f_hat = kde_normal(sample, h=2.25)
    >>> xarr = [i/100 for i in range(-750, 1100)]
    >>> yarr = [f_hat(x) for x in xarr]

The points in ``xarr`` and ``yarr`` can be used to make a PDF plot:

.. image:: kde_example.png
    :alt: Scatter plot of the estimated probability density function.

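To reproduce a figure like the one referenced above locally, a minimal
plotting sketch (matplotlib is an assumption here; it is not part of the
documented recipe):

    import matplotlib.pyplot as plt

    plt.plot(xarr, yarr)                                  # estimated density curve
    plt.scatter(sample, [0.0] * len(sample), marker='x')  # original sample points
    plt.xlabel('x')
    plt.ylabel('estimated density')
    plt.show()
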
..
   # This modeline must appear within the last ten lines of the file.
   kate: indent-width 3; remove-trailing-space on; replace-tabs on; encoding utf-8;