mirror of
https://github.com/python/cpython.git
synced 2025-11-25 04:34:37 +00:00
bpo-29240: Fix locale encodings in UTF-8 Mode (#5170)
Modify locale.localeconv(), time.tzname, os.strerror() and other functions to ignore the UTF-8 Mode: always use the current locale encoding. Changes: * Add _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx(). On decoding or encoding error, they return the position of the error and an error message which are used to raise Unicode errors in PyUnicode_DecodeLocale() and PyUnicode_EncodeLocale(). * Replace _Py_DecodeCurrentLocale() with _Py_DecodeLocaleEx(). * PyUnicode_DecodeLocale() now uses _Py_DecodeLocaleEx() for all cases, especially for the strict error handler. * Add _Py_DecodeUTF8Ex(): return more information on decoding error and supports the strict error handler. * Rename _Py_EncodeUTF8_surrogateescape() to _Py_EncodeUTF8Ex(). * Replace _Py_EncodeCurrentLocale() with _Py_EncodeLocaleEx(). * Ignore the UTF-8 mode to encode/decode localeconv(), strerror() and time zone name. * Remove PyUnicode_DecodeLocale(), PyUnicode_DecodeLocaleAndSize() and PyUnicode_EncodeLocale() now ignore the UTF-8 mode: always use the "current" locale. * Remove _PyUnicode_DecodeCurrentLocale(), _PyUnicode_DecodeCurrentLocaleAndSize() and _PyUnicode_EncodeCurrentLocale().
This commit is contained in:
parent
ee3b83547c
commit
7ed7aead95
12 changed files with 484 additions and 517 deletions
|
|
@ -106,6 +106,16 @@ Operating System Utilities
|
|||
surrogate character, escape the bytes using the surrogateescape error
|
||||
handler instead of decoding them.
|
||||
|
||||
Encoding, highest priority to lowest priority:
|
||||
|
||||
* ``UTF-8`` on macOS and Android;
|
||||
* ``UTF-8`` if the Python UTF-8 mode is enabled;
|
||||
* ``ASCII`` if the ``LC_CTYPE`` locale is ``"C"``,
|
||||
``nl_langinfo(CODESET)`` returns the ``ASCII`` encoding (or an alias),
|
||||
and :c:func:`mbstowcs` and :c:func:`wcstombs` functions uses the
|
||||
``ISO-8859-1`` encoding.
|
||||
* the current locale encoding.
|
||||
|
||||
Return a pointer to a newly allocated wide character string, use
|
||||
:c:func:`PyMem_RawFree` to free the memory. If size is not ``NULL``, write
|
||||
the number of wide characters excluding the null character into ``*size``
|
||||
|
|
@ -137,6 +147,18 @@ Operating System Utilities
|
|||
:ref:`surrogateescape error handler <surrogateescape>`: surrogate characters
|
||||
in the range U+DC80..U+DCFF are converted to bytes 0x80..0xFF.
|
||||
|
||||
Encoding, highest priority to lowest priority:
|
||||
|
||||
* ``UTF-8`` on macOS and Android;
|
||||
* ``UTF-8`` if the Python UTF-8 mode is enabled;
|
||||
* ``ASCII`` if the ``LC_CTYPE`` locale is ``"C"``,
|
||||
``nl_langinfo(CODESET)`` returns the ``ASCII`` encoding (or an alias),
|
||||
and :c:func:`mbstowcs` and :c:func:`wcstombs` functions uses the
|
||||
``ISO-8859-1`` encoding.
|
||||
* the current locale encoding.
|
||||
|
||||
The function uses the UTF-8 encoding in the Python UTF-8 mode.
|
||||
|
||||
Return a pointer to a newly allocated byte string, use :c:func:`PyMem_Free`
|
||||
to free the memory. Return ``NULL`` on encoding error or memory allocation
|
||||
error
|
||||
|
|
|
|||
|
|
@ -770,12 +770,20 @@ system.
|
|||
:c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at
|
||||
Python startup).
|
||||
|
||||
This function ignores the Python UTF-8 mode.
|
||||
|
||||
.. seealso::
|
||||
|
||||
The :c:func:`Py_DecodeLocale` function.
|
||||
|
||||
.. versionadded:: 3.3
|
||||
|
||||
.. versionchanged:: 3.7
|
||||
The function now also uses the current locale encoding for the
|
||||
``surrogateescape`` error handler. Previously, :c:func:`Py_DecodeLocale`
|
||||
was used for the ``surrogateescape``, and the current locale encoding was
|
||||
used for ``strict``.
|
||||
|
||||
|
||||
.. c:function:: PyObject* PyUnicode_DecodeLocale(const char *str, const char *errors)
|
||||
|
||||
|
|
@ -797,12 +805,20 @@ system.
|
|||
:c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at
|
||||
Python startup).
|
||||
|
||||
This function ignores the Python UTF-8 mode.
|
||||
|
||||
.. seealso::
|
||||
|
||||
The :c:func:`Py_EncodeLocale` function.
|
||||
|
||||
.. versionadded:: 3.3
|
||||
|
||||
.. versionchanged:: 3.7
|
||||
The function now also uses the current locale encoding for the
|
||||
``surrogateescape`` error handler. Previously, :c:func:`Py_EncodeLocale`
|
||||
was used for the ``surrogateescape``, and the current locale encoding was
|
||||
used for ``strict``.
|
||||
|
||||
|
||||
File System Encoding
|
||||
""""""""""""""""""""
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue