mirror of
https://github.com/python/cpython.git
synced 2025-11-13 15:40:05 +00:00
Elaborate on representations and canonical/legacy unicode objects
parent e6b99a1832
commit b965b3938a
1 changed files with 15 additions and 1 deletions
@@ -18,7 +18,21 @@ for strings where all code points are below 128, 256, or 65536; otherwise, code
 points must be below 1114112 (which is the full Unicode range).
 
 :c:type:`Py_UNICODE*` and UTF-8 representations are created on demand and cached
-in the Unicode object.
+in the Unicode object. The :c:type:`Py_UNICODE*` representation is deprecated
+and inefficient; it should be avoided in performance- or memory-sensitive
+situations.
+
+Due to the transition between the old APIs and the new APIs, unicode objects
+can internally be in two states depending on how they were created:
+
+* "canonical" unicode objects are all objects created by a non-deprecated
+  unicode API. They use the most efficient representation allowed by the
+  implementation.
+
+* "legacy" unicode objects have been created through one of the deprecated
+  APIs (typically :c:func:`PyUnicode_FromUnicode`) and only bear the
+  :c:type:`Py_UNICODE*` representation; you will have to call
+  :c:func:`PyUnicode_READY` on them before calling any other API.
 
 
 Unicode Type
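The context lines of this hunk say that the internal representation is chosen by the largest code point in the string (below 128, 256, 65536, or 1114112). The sketch below is one way to observe that from Python; it is not part of the commit. It assumes a CPython build with the flexible string representation (3.3 or later), and the helper name `bytes_per_code_point` is hypothetical. It only looks at canonical strings, since legacy objects can only be produced through the deprecated C APIs.

```python
import sys


def bytes_per_code_point(ch: str, n: int = 1000) -> int:
    """Estimate how many bytes CPython stores per code point for `ch`.

    Hypothetical helper for illustration: it compares the sizes of two
    strings of different lengths so that the fixed object header
    cancels out in the difference.
    """
    short = ch * n
    long_ = ch * (2 * n)
    return (sys.getsizeof(long_) - sys.getsizeof(short)) // n


samples = [
    ("below 128", "a"),               # ASCII
    ("below 256", "\u00e9"),          # Latin-1 range
    ("below 65536", "\u20ac"),        # Basic Multilingual Plane
    ("below 1114112", "\U0001F600"),  # outside the BMP
]

for label, ch in samples:
    print(f"{label}: {bytes_per_code_point(ch)} byte(s) per code point")
```

On such a build the four samples are expected to report 1, 1, 2, and 4 bytes per code point. The figures are an implementation detail of CPython rather than a language guarantee.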