mirror of https://github.com/python/cpython.git
synced 2025-11-03 03:22:27 +00:00

Issue #23181: More "codepoint" -> "code point".

This commit is contained in:
parent b2653b344e
commit d3faf43f9b

24 changed files with 46 additions and 46 deletions
@@ -1134,7 +1134,7 @@ These are the UTF-32 codec APIs:
 mark (U+FEFF). In the other two modes, no BOM mark is prepended.

 If *Py_UNICODE_WIDE* is not defined, surrogate pairs will be output
-as a single codepoint.
+as a single code point.

 Return *NULL* if an exception was raised by the codec.
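The BOM behaviour this hunk documents is easy to check from Python itself; a small sketch (not part of the commit) using the stdlib codecs:

```python
import codecs

# The endianness-specific codecs never prepend a byte order mark:
assert "a".encode("utf-32-le") == b"a\x00\x00\x00"
assert "a".encode("utf-32-be") == b"\x00\x00\x00a"

# The generic 'utf-32' codec writes a BOM (U+FEFF) first.
bom = "a".encode("utf-32")[:4]
assert bom in (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)
```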
@@ -827,7 +827,7 @@ methods and attributes from the underlying stream.
 Encodings and Unicode
 ---------------------

-Strings are stored internally as sequences of codepoints in
+Strings are stored internally as sequences of code points in
 range ``0x0``-``0x10FFFF``. (See :pep:`393` for
 more details about the implementation.)
 Once a string object is used outside of CPU and memory, endianness
@@ -838,23 +838,23 @@ There are a variety of different text serialisation codecs, which are
 collectively referred to as :term:`text encodings <text encoding>`.

 The simplest text encoding (called ``'latin-1'`` or ``'iso-8859-1'``) maps
-the codepoints 0-255 to the bytes ``0x0``-``0xff``, which means that a string
-object that contains codepoints above ``U+00FF`` can't be encoded with this
+the code points 0-255 to the bytes ``0x0``-``0xff``, which means that a string
+object that contains code points above ``U+00FF`` can't be encoded with this
 codec. Doing so will raise a :exc:`UnicodeEncodeError` that looks
 like the following (although the details of the error message may differ):
 ``UnicodeEncodeError: 'latin-1' codec can't encode character '\u1234' in
 position 3: ordinal not in range(256)``.

 There's another group of encodings (the so called charmap encodings) that choose
-a different subset of all Unicode code points and how these codepoints are
+a different subset of all Unicode code points and how these code points are
 mapped to the bytes ``0x0``-``0xff``. To see how this is done simply open
 e.g. :file:`encodings/cp1252.py` (which is an encoding that is used primarily on
 Windows). There's a string constant with 256 characters that shows you which
 character is mapped to which byte value.

-All of these encodings can only encode 256 of the 1114112 codepoints
+All of these encodings can only encode 256 of the 1114112 code points
 defined in Unicode. A simple and straightforward way that can store each Unicode
-code point, is to store each codepoint as four consecutive bytes. There are two
+code point, is to store each code point as four consecutive bytes. There are two
 possibilities: store the bytes in big endian or in little endian order. These
 two encodings are called ``UTF-32-BE`` and ``UTF-32-LE`` respectively. Their
 disadvantage is that if e.g. you use ``UTF-32-BE`` on a little endian machine you
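The :exc:`UnicodeEncodeError` quoted in the hunk above can be reproduced directly; a minimal sketch:

```python
# latin-1 only covers code points 0-255, so U+1234 cannot be encoded.
try:
    "abc\u1234".encode("latin-1")
except UnicodeEncodeError as exc:
    message = str(exc)

# The wording may vary between versions, but the ordinal range is fixed.
assert "ordinal not in range(256)" in message
```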
@@ -194,7 +194,7 @@ Here are the classes:
 minor type and defaults to :mimetype:`plain`. *_charset* is the character
 set of the text and is passed as an argument to the
 :class:`~email.mime.nonmultipart.MIMENonMultipart` constructor; it defaults
-to ``us-ascii`` if the string contains only ``ascii`` codepoints, and
+to ``us-ascii`` if the string contains only ``ascii`` code points, and
 ``utf-8`` otherwise.

 Unless the *_charset* argument is explicitly set to ``None``, the
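The charset selection this hunk documents can be observed with :class:`~email.mime.text.MIMEText`; a small sketch of the behaviour, assuming a modern CPython:

```python
from email.mime.text import MIMEText

# Pure-ASCII text defaults to us-ascii; one non-ASCII code point
# is enough to switch the default charset to utf-8.
assert MIMEText("hello").get_content_charset() == "us-ascii"
assert MIMEText("h\u00e9llo").get_content_charset() == "utf-8"
```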
@@ -156,7 +156,7 @@ are always available. They are listed here in alphabetical order.

 .. function:: chr(i)

-   Return the string representing a character whose Unicode codepoint is the integer
+   Return the string representing a character whose Unicode code point is the integer
    *i*. For example, ``chr(97)`` returns the string ``'a'``. This is the
    inverse of :func:`ord`. The valid range for the argument is from 0 through
    1,114,111 (0x10FFFF in base 16). :exc:`ValueError` will be raised if *i* is
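The documented :func:`chr`/:func:`ord` behaviour, including the valid range, in a quick sketch:

```python
assert chr(97) == "a"
assert ord("a") == 97              # ord is the inverse of chr
assert chr(0x10FFFF) == "\U0010FFFF"

try:
    chr(0x110000)                  # one past the valid range
except ValueError:
    pass
else:
    raise AssertionError("expected ValueError")
```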
@@ -33,12 +33,12 @@ This module defines four dictionaries, :data:`html5`,

 .. data:: name2codepoint

-   A dictionary that maps HTML entity names to the Unicode codepoints.
+   A dictionary that maps HTML entity names to the Unicode code points.


 .. data:: codepoint2name

-   A dictionary that maps Unicode codepoints to HTML entity names.
+   A dictionary that maps Unicode code points to HTML entity names.


 .. rubric:: Footnotes
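The two mappings described above are inverses of each other; a small usage sketch:

```python
from html.entities import codepoint2name, name2codepoint

assert name2codepoint["amp"] == 38        # '&' is U+0026
assert codepoint2name[38] == "amp"
assert chr(name2codepoint["eacute"]) == "\u00e9"
```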
@@ -512,7 +512,7 @@ The RFC does not explicitly forbid JSON strings which contain byte sequences
 that don't correspond to valid Unicode characters (e.g. unpaired UTF-16
 surrogates), but it does note that they may cause interoperability problems.
 By default, this module accepts and outputs (when present in the original
-:class:`str`) codepoints for such sequences.
+:class:`str`) code points for such sequences.


 Infinite and NaN Number Values
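The surrogate handling this hunk describes can be demonstrated with a round trip; a sketch assuming the default ``ensure_ascii=True``:

```python
import json

# An unpaired UTF-16 surrogate survives a dumps/loads round trip,
# even though it is not a valid Unicode character.
lone = "\ud800"
encoded = json.dumps(lone)
assert encoded == '"\\ud800"'
assert json.loads(encoded) == lone
```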
@@ -684,7 +684,7 @@ the same type, the lexicographical comparison is carried out recursively. If
 all items of two sequences compare equal, the sequences are considered equal.
 If one sequence is an initial sub-sequence of the other, the shorter sequence is
 the smaller (lesser) one. Lexicographical ordering for strings uses the Unicode
-codepoint number to order individual characters. Some examples of comparisons
+code point number to order individual characters. Some examples of comparisons
 between sequences of the same type::

    (1, 2, 3) < (1, 2, 4)
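The comparison rules quoted above, checked directly:

```python
# Tuples compare item by item; strings compare by Unicode code point.
assert (1, 2, 3) < (1, 2, 4)
assert (1, 2) < (1, 2, 3)          # shorter initial sub-sequence is lesser
assert "ABC" < "C" < "Pascal" < "Python"
assert "\u00e9" > "z"              # U+00E9 orders after U+007A
```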
@@ -228,7 +228,7 @@ Functionality

 Changes introduced by :pep:`393` are the following:

-* Python now always supports the full range of Unicode codepoints, including
+* Python now always supports the full range of Unicode code points, including
  non-BMP ones (i.e. from ``U+0000`` to ``U+10FFFF``). The distinction between
  narrow and wide builds no longer exists and Python now behaves like a wide
  build, even under Windows.
@@ -246,7 +246,7 @@ Changes introduced by :pep:`393` are the following:
  so ``'\U0010FFFF'[0]`` now returns ``'\U0010FFFF'`` and not ``'\uDBFF'``;

 * all other functions in the standard library now correctly handle
-  non-BMP codepoints.
+  non-BMP code points.

 * The value of :data:`sys.maxunicode` is now always ``1114111`` (``0x10FFFF``
   in hexadecimal). The :c:func:`PyUnicode_GetMax` function still returns
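The non-BMP indexing behaviour and the fixed :data:`sys.maxunicode` value mentioned in this hunk, in a quick check:

```python
import sys

# Since PEP 393 a non-BMP character is a single code point on every build:
assert len("\U0010FFFF") == 1
assert "\U0010FFFF"[0] == "\U0010FFFF"
assert sys.maxunicode == 0x10FFFF == 1114111
```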
@@ -258,13 +258,13 @@ Changes introduced by :pep:`393` are the following:
 Performance and resource usage
 ------------------------------

-The storage of Unicode strings now depends on the highest codepoint in the string:
+The storage of Unicode strings now depends on the highest code point in the string:

-* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per codepoint;
+* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;

-* BMP strings (``U+0000-U+FFFF``) use 2 bytes per codepoint;
+* BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;

-* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per codepoint.
+* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per code point.

 The net effect is that for most applications, memory usage of string
 storage should decrease significantly - especially compared to former
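The 1/2/4 bytes-per-code-point storage described in this hunk can be measured with :func:`sys.getsizeof`; a sketch that is CPython-specific (it assumes the PEP 393 compact representation of CPython 3.3+):

```python
import sys

def bytes_per_code_point(ch, n=1000):
    # Doubling the length isolates the per-character cost from the
    # fixed object header.
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) / n

assert bytes_per_code_point("a") == 1           # Latin-1 range
assert bytes_per_code_point("\u20ac") == 2      # BMP
assert bytes_per_code_point("\U0001F600") == 4  # non-BMP
```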