mirror of https://github.com/python/cpython.git, synced 2025-08-19 00:00:48 +00:00
#11840: Improve c-api/unicode documentation. Patch by Sandro Tosi.
parent 7f9d2ead34
commit 020f6508c1
1 changed files with 26 additions and 27 deletions
|
@ -371,11 +371,11 @@ the system's :ctype:`wchar_t`.
|
|||
wchar_t Support
|
||||
"""""""""""""""
|
||||
|
||||
wchar_t support for platforms which support it:
|
||||
:ctype:`wchar_t` support for platforms which support it:
|
||||
|
||||
.. cfunction:: PyObject* PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size)
|
||||
|
||||
Create a Unicode object from the :ctype:`wchar_t` buffer *w* of the given size.
|
||||
Create a Unicode object from the :ctype:`wchar_t` buffer *w* of the given *size*.
|
||||
Return *NULL* on failure.
|
||||
|
||||
.. versionchanged:: 2.5
|
||||
|
@ -407,13 +407,13 @@ Built-in Codecs
|
|||
Python provides a set of built-in codecs which are written in C for speed. All of
|
||||
these codecs are directly usable via the following functions.
|
||||
|
||||
Many of the following APIs take two arguments encoding and errors. These
|
||||
parameters encoding and errors have the same semantics as the ones of the
|
||||
built-in :func:`unicode` Unicode object constructor.
|
||||
Many of the following APIs take two arguments encoding and errors, and they
|
||||
have the same semantics as the ones of the built-in :func:`unicode` Unicode
|
||||
object constructor.
|
||||
|
||||
Setting encoding to *NULL* causes the default encoding to be used which is
|
||||
ASCII. The file system calls should use :cdata:`Py_FileSystemDefaultEncoding`
|
||||
as the encoding for file names. This variable should be treated as read-only: On
|
||||
as the encoding for file names. This variable should be treated as read-only: on
|
||||
some systems, it will be a pointer to a static string, on others, it will change
|
||||
at run-time (such as when the application invokes setlocale).
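As a rough illustration of the *encoding* and *errors* semantics described in this hunk, the sketch below uses the Python 3 `str(bytes, encoding, errors)` constructor. Treating it as a stand-in for the Python 2 `unicode()` constructor named in the text is an assumption, and the sample bytes and variable names are illustrative only.

```python
# Python 3 sketch: str(bytes, encoding, errors) plays the role of the
# Python 2 unicode() constructor mentioned in the text (an assumption).
data = b"caf\xc3\xa9"  # "cafe" with an accented e, encoded as UTF-8

# The right encoding with strict error handling decodes cleanly.
decoded = str(data, "utf-8", "strict")

# With ASCII, each of the two non-ASCII bytes becomes U+FFFD.
replaced = str(data, "ascii", "replace")

# "ignore" drops the bytes that cannot be decoded.
ignored = str(data, "ascii", "ignore")

# "strict" raises instead of producing a result.
try:
    str(data, "ascii", "strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

The same three *errors* values are the ones most commonly passed as the `errors` argument of the C functions; other handler names may be registered as well.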
|
||||
|
||||
|
@ -446,7 +446,7 @@ These are the generic codec APIs:
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size and return a Python
|
||||
Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* and return a Python
|
||||
string object. *encoding* and *errors* have the same meaning as the parameters
|
||||
of the same name in the Unicode :meth:`encode` method. The codec to be used is
|
||||
looked up using the Python codec registry. Return *NULL* if an exception was
|
||||
|
@ -498,7 +498,7 @@ These are the UTF-8 codec APIs:
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size using UTF-8 and return a
|
||||
Encode the :ctype:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and return a
|
||||
Python string object. Return *NULL* if an exception was raised by the codec.
|
||||
|
||||
.. versionchanged:: 2.5
|
||||
|
@ -521,7 +521,7 @@ These are the UTF-32 codec APIs:
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
|
||||
|
||||
Decode *length* bytes from a UTF-32 encoded buffer string and return the
|
||||
Decode *size* bytes from a UTF-32 encoded buffer string and return the
|
||||
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
|
||||
handling. It defaults to "strict".
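To make the role of *size* concrete, here is a small Python-level sketch. It assumes that `bytes.decode("utf-32-le", ...)` behaves like the C function for a fixed byte order; the sample text is illustrative. UTF-32 uses four bytes per code point, so a byte count that is not a multiple of four is reported as an error under the default "strict" handling.

```python
# Python-level analogue of decoding *size* bytes of UTF-32 (an assumption;
# the C function additionally takes a byteorder pointer, not modelled here).
text = "hi"
encoded = text.encode("utf-32-le")  # explicit byte order, so no BOM is written
size = len(encoded)                 # four bytes per code point

decoded = encoded[:size].decode("utf-32-le", "strict")

# Passing a size that cuts a code point in half fails under "strict".
try:
    encoded[:size - 2].decode("utf-32-le", "strict")
    truncated_ok = True
except UnicodeDecodeError:
    truncated_ok = False
```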
|
||||
|
||||
|
@ -597,7 +597,7 @@ These are the UTF-16 codec APIs:
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder)
|
||||
|
||||
Decode *length* bytes from a UTF-16 encoded buffer string and return the
|
||||
Decode *size* bytes from a UTF-16 encoded buffer string and return the
|
||||
corresponding Unicode object. *errors* (if non-*NULL*) defines the error
|
||||
handling. It defaults to "strict".
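The renamed *size* parameter counts bytes, not characters, which matters for UTF-16 because characters outside the Basic Multilingual Plane occupy four bytes (a surrogate pair). The following Python-level sketch assumes `bytes.decode("utf-16-le", ...)` as an analogue of the C function; the sample string is illustrative.

```python
# Python-level analogue (an assumption): *size* is a byte count, and one
# character may need two or four bytes in UTF-16.
text = "a\U0001F600"                # one BMP character plus one non-BMP character
encoded = text.encode("utf-16-le")  # explicit byte order, so no BOM is written
size = len(encoded)                 # 2 bytes + 4 bytes

decoded = encoded.decode("utf-16-le")  # error handling defaults to "strict"

# Stopping after the high surrogate leaves an incomplete pair.
try:
    encoded[:4].decode("utf-16-le", "strict")
    half_pair_ok = True
except UnicodeDecodeError:
    half_pair_ok = False
```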
|
||||
|
||||
|
@ -722,7 +722,7 @@ These are the "Unicode Escape" codec APIs:
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size using Unicode-Escape and
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Unicode-Escape and
|
||||
return a Python string object. Return *NULL* if an exception was raised by the
|
||||
codec.
|
||||
|
||||
|
@ -756,7 +756,7 @@ These are the "Raw Unicode Escape" codec APIs:
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size using Raw-Unicode-Escape
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape
|
||||
and return a Python string object. Return *NULL* if an exception was raised by
|
||||
the codec.
|
||||
|
||||
|
@ -791,7 +791,7 @@ ordinals and only these are accepted by the codecs during encoding.
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size using Latin-1 and return
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given *size* using Latin-1 and return
|
||||
a Python string object. Return *NULL* if an exception was raised by the codec.
|
||||
|
||||
.. versionchanged:: 2.5
|
||||
|
@ -825,7 +825,7 @@ codes generate errors.
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size using ASCII and return a
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given *size* using ASCII and return a
|
||||
Python string object. Return *NULL* if an exception was raised by the codec.
|
||||
|
||||
.. versionchanged:: 2.5
|
||||
|
@ -843,8 +843,6 @@ codes generate errors.
|
|||
Character Map Codecs
|
||||
""""""""""""""""""""
|
||||
|
||||
These are the mapping codec APIs:
|
||||
|
||||
This codec is special in that it can be used to implement many different codecs
|
||||
(and this is in fact what was done to obtain most of the standard codecs
|
||||
included in the :mod:`encodings` package). The codec uses mapping to encode and
|
||||
|
@ -866,6 +864,7 @@ meaning that its ordinal value will be interpreted as Unicode or Latin-1 ordinal
|
|||
resp. Because of this, mappings only need to contain those mappings which map
|
||||
characters to different code points.
|
||||
|
||||
These are the mapping codec APIs:
|
||||
|
||||
.. cfunction:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors)
|
||||
|
||||
|
@ -886,7 +885,7 @@ characters to different code points.
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size using the given
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given *size* using the given
|
||||
*mapping* object and return a Python string object. Return *NULL* if an
|
||||
exception was raised by the codec.
|
||||
|
||||
|
@ -906,7 +905,7 @@ The following codec API is special in that maps Unicode to Unicode.
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors)
|
||||
|
||||
Translate a :ctype:`Py_UNICODE` buffer of the given length by applying a
|
||||
Translate a :ctype:`Py_UNICODE` buffer of the given *size* by applying a
|
||||
character mapping *table* to it and return the resulting Unicode object. Return
|
||||
*NULL* when an exception was raised by the codec.
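A character mapping *table* of this kind can be sketched at the Python level with `str.translate`, which is assumed here to be the counterpart of the C function. The table contents are illustrative: keys are code points, and values are a replacement string, a replacement code point, or `None` to delete the character.

```python
# Python-level sketch of a Unicode-to-Unicode character mapping table.
table = {
    ord("l"): "L",       # map to a replacement string
    ord("h"): ord("j"),  # map to another code point
    ord("o"): None,      # None deletes the character
}

# Characters without an entry in the table are copied unchanged.
result = "hello world".translate(table)
```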
|
||||
|
||||
|
@ -921,16 +920,16 @@ The following codec API is special in that maps Unicode to Unicode.
|
|||
This function used an :ctype:`int` type for *size*. This might require
|
||||
changes in your code for properly supporting 64-bit systems.
|
||||
|
||||
|
||||
MBCS codecs for Windows
|
||||
"""""""""""""""""""""""
|
||||
|
||||
These are the MBCS codec APIs. They are currently only available on Windows and
|
||||
use the Win32 MBCS converters to implement the conversions. Note that MBCS (or
|
||||
DBCS) is a class of encodings, not just one. The target encoding is defined by
|
||||
the user settings on the machine running the codec.
|
||||
|
||||
|
||||
MBCS codecs for Windows
|
||||
"""""""""""""""""""""""
|
||||
|
||||
|
||||
.. cfunction:: PyObject* PyUnicode_DecodeMBCS(const char *s, Py_ssize_t size, const char *errors)
|
||||
|
||||
Create a Unicode object by decoding *size* bytes of the MBCS encoded string *s*.
|
||||
|
@ -953,7 +952,7 @@ MBCS codecs for Windows
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors)
|
||||
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given size using MBCS and return a
|
||||
Encode the :ctype:`Py_UNICODE` buffer of the given *size* using MBCS and return a
|
||||
Python string object. Return *NULL* if an exception was raised by the codec.
|
||||
|
||||
.. versionchanged:: 2.5
|
||||
|
@ -990,7 +989,7 @@ They all return *NULL* or ``-1`` if an exception occurs.
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_Split(PyObject *s, PyObject *sep, Py_ssize_t maxsplit)
|
||||
|
||||
Split a string giving a list of Unicode strings. If sep is *NULL*, splitting
|
||||
Split a string giving a list of Unicode strings. If *sep* is *NULL*, splitting
|
||||
will be done at all whitespace substrings. Otherwise, splits occur at the given
|
||||
separator. At most *maxsplit* splits will be done. If negative, no limit is
|
||||
set. Separators are not included in the resulting list.
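The splitting rules above can be tried out with `str.split`, used here as the assumed Python-level counterpart of the C function. Passing `None` as the separator corresponds to a *NULL* *sep*; the sample strings are illustrative.

```python
# Python-level sketch of the documented splitting rules.
text = "alpha  beta\tgamma delta"

# No separator: split at runs of whitespace.
all_words = text.split()

# At most two splits; the remainder stays in the last item.
limited = text.split(None, 2)

# Explicit separator: empty strings appear between adjacent separators,
# and the separators themselves are not part of the result.
fields = "a,b,,c".split(",")

# A negative maxsplit means no limit.
unlimited = "a,b,c".split(",", -1)
```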
|
||||
|
@ -1025,13 +1024,13 @@ They all return *NULL* or ``-1`` if an exception occurs.
|
|||
|
||||
.. cfunction:: PyObject* PyUnicode_Join(PyObject *separator, PyObject *seq)
|
||||
|
||||
Join a sequence of strings using the given separator and return the resulting
|
||||
Join a sequence of strings using the given *separator* and return the resulting
|
||||
Unicode string.
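For comparison, the Python-level `str.join` method (assumed here to mirror the C function) places the *separator* between items only, never before the first or after the last. The sample values are illustrative.

```python
# Python-level sketch: the separator appears only between items.
separator = ", "

joined = separator.join(["alpha", "beta", "gamma"])
single = separator.join(["solo"])  # one item: no separator is added
empty = separator.join([])         # empty sequence: empty string
```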
|
||||
|
||||
|
||||
.. cfunction:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
|
||||
|
||||
Return 1 if *substr* matches *str*[*start*:*end*] at the given tail end
|
||||
Return 1 if *substr* matches ``str[start:end]`` at the given tail end
|
||||
(*direction* == -1 means to do a prefix match, *direction* == 1 a suffix match),
|
||||
0 otherwise. Return ``-1`` if an error occurred.
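The *direction* convention can be sketched with a hypothetical Python helper, `tailmatch`, built on `str.startswith` and `str.endswith`. The helper name and its mapping onto those methods are assumptions made for illustration, and error reporting (the ``-1`` return) is not modelled.

```python
def tailmatch(s, substr, start, end, direction):
    """Hypothetical helper: 1 if substr matches s[start:end] at the chosen end, else 0."""
    if direction < 0:
        # direction == -1: prefix match
        return 1 if s.startswith(substr, start, end) else 0
    # direction == 1: suffix match
    return 1 if s.endswith(substr, start, end) else 0


prefix_hit = tailmatch("hello world", "hello", 0, 11, -1)
suffix_hit = tailmatch("hello world", "world", 0, 11, 1)
# Narrowing the slice to "hello wo" removes the suffix match.
suffix_miss = tailmatch("hello world", "world", 0, 8, 1)
```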
|
||||
|
||||
|
@ -1043,7 +1042,7 @@ They all return *NULL* or ``-1`` if an exception occurs.
|
|||
|
||||
.. cfunction:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction)
|
||||
|
||||
Return the first position of *substr* in *str*[*start*:*end*] using the given
|
||||
Return the first position of *substr* in ``str[start:end]`` using the given
|
||||
*direction* (*direction* == 1 means to do a forward search, *direction* == -1 a
|
||||
backward search). The return value is the index of the first match; a value of
|
||||
``-1`` indicates that no match was found, and ``-2`` indicates that an error
|
||||
|
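The return-value convention for the search function in the last hunk can be sketched with a hypothetical Python helper, `find`, built on `str.find` and `str.rfind`. The helper name, the use of those methods, and the mapping of a `TypeError` to ``-2`` are assumptions made for illustration.

```python
def find(s, substr, start, end, direction):
    """Hypothetical helper: index of a match in s[start:end], -1 if none, -2 on error."""
    try:
        if direction > 0:
            # direction == 1: forward search
            return s.find(substr, start, end)
        # direction == -1: backward search
        return s.rfind(substr, start, end)
    except TypeError:
        # Stand-in for the documented error return value.
        return -2


forward = find("abcabc", "bc", 0, 6, 1)
backward = find("abcabc", "bc", 0, 6, -1)
missing = find("abcabc", "zz", 0, 6, 1)
```

Note that the index returned by the helper is relative to the whole string, not to the start of the slice.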
|