Mirror of https://github.com/python/cpython.git (synced 2025-08-03 16:39:00 +00:00)
bpo-42236: Enhance init and encoding documentation (GH-23109)
Enhance the documentation of the Python startup, filesystem encoding and error handling, locale encoding. Add a new "Python UTF-8 Mode" section.

* Add "locale encoding" and "filesystem encoding and error handler" to the glossary
* Remove documentation from Include/cpython/initconfig.h: move it to Doc/c-api/init_config.rst.
* Doc/c-api/init_config.rst:

  * Document command line options and environment variables
  * Document default values.

* Add a new "Python UTF-8 Mode" section in Doc/library/os.rst.
* Add warnings to Py_DecodeLocale() and Py_EncodeLocale() docs.
* Document how Python selects the filesystem encoding and error handler at a single place: PyConfig.filesystem_encoding and PyConfig.filesystem_errors.
* PyConfig: move orig_argv member at the right place.
parent 301822859b
commit 4b9aad4999
19 changed files with 735 additions and 520 deletions
@ -447,10 +447,9 @@ Miscellaneous options
  * ``-X dev``: enable :ref:`Python Development Mode <devmode>`, introducing
  additional runtime checks that are too expensive to be enabled by
  default.
- * ``-X utf8`` enables UTF-8 mode for operating system interfaces, overriding
- the default locale-aware mode. ``-X utf8=0`` explicitly disables UTF-8
- mode (even when it would otherwise activate automatically).
- See :envvar:`PYTHONUTF8` for more details.
+ * ``-X utf8`` enables the :ref:`Python UTF-8 Mode <utf8-mode>`.
+ ``-X utf8=0`` explicitly disables :ref:`Python UTF-8 Mode <utf8-mode>`
+ (even when it would otherwise activate automatically).
  * ``-X pycache_prefix=PATH`` enables writing ``.pyc`` files to a parallel
  tree rooted at the given directory instead of to the code tree. See also
  :envvar:`PYTHONPYCACHEPREFIX`.
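Reviewer sketch (not part of the commit): the ``-X utf8`` and ``-X utf8=0`` behaviour described in the hunk above can be observed from a child interpreter through ``sys.flags.utf8_mode``. The helper name ``utf8_mode_flag`` is hypothetical and exists only for this illustration.

```python
import subprocess
import sys


def utf8_mode_flag(*x_options: str) -> int:
    """Start a child interpreter with the given -X options and
    return the value of its sys.flags.utf8_mode."""
    cmd = [sys.executable]
    for option in x_options:
        cmd += ["-X", option]
    cmd += ["-c", "import sys; print(sys.flags.utf8_mode)"]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return int(result.stdout.strip())


# "-X utf8" switches the mode on, "-X utf8=0" switches it off explicitly.
enabled = utf8_mode_flag("utf8")
disabled = utf8_mode_flag("utf8=0")
print(enabled, disabled)
```

A child process is used because the mode is fixed at interpreter startup and cannot be toggled in the running process.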
@ -810,9 +809,10 @@ conflict.

  .. envvar:: PYTHONLEGACYWINDOWSFSENCODING

- If set to a non-empty string, the default filesystem encoding and errors mode
- will revert to their pre-3.6 values of 'mbcs' and 'replace', respectively.
- Otherwise, the new defaults 'utf-8' and 'surrogatepass' are used.
+ If set to a non-empty string, the default :term:`filesystem encoding and
+ error handler` mode will revert to their pre-3.6 values of 'mbcs' and
+ 'replace', respectively. Otherwise, the new defaults 'utf-8' and
+ 'surrogatepass' are used.

  This may also be enabled at runtime with
  :func:`sys._enablelegacywindowsfsencoding()`.
@ -898,54 +898,14 @@ conflict.

  .. envvar:: PYTHONUTF8

- If set to ``1``, enables the interpreter's UTF-8 mode, where ``UTF-8`` is
- used as the text encoding for system interfaces, regardless of the
- current locale setting.
+ If set to ``1``, enable the :ref:`Python UTF-8 Mode <utf8-mode>`.

- This means that:
-
- * :func:`sys.getfilesystemencoding()` returns ``'UTF-8'`` (the locale
- encoding is ignored).
- * :func:`locale.getpreferredencoding()` returns ``'UTF-8'`` (the locale
- encoding is ignored, and the function's ``do_setlocale`` parameter has no
- effect).
- * :data:`sys.stdin`, :data:`sys.stdout`, and :data:`sys.stderr` all use
- UTF-8 as their text encoding, with the ``surrogateescape``
- :ref:`error handler <error-handlers>` being enabled for :data:`sys.stdin`
- and :data:`sys.stdout` (:data:`sys.stderr` continues to use
- ``backslashreplace`` as it does in the default locale-aware mode)
-
- As a consequence of the changes in those lower level APIs, other higher
- level APIs also exhibit different default behaviours:
-
- * Command line arguments, environment variables and filenames are decoded
- to text using the UTF-8 encoding.
- * :func:`os.fsdecode()` and :func:`os.fsencode()` use the UTF-8 encoding.
- * :func:`open()`, :func:`io.open()`, and :func:`codecs.open()` use the UTF-8
- encoding by default. However, they still use the strict error handler by
- default so that attempting to open a binary file in text mode is likely
- to raise an exception rather than producing nonsense data.
-
- Note that the standard stream settings in UTF-8 mode can be overridden by
- :envvar:`PYTHONIOENCODING` (just as they can be in the default locale-aware
- mode).
-
- If set to ``0``, the interpreter runs in its default locale-aware mode.
+ If set to ``0``, disable the :ref:`Python UTF-8 Mode <utf8-mode>`.

  Setting any other non-empty string causes an error during interpreter
  initialisation.

- If this environment variable is not set at all, then the interpreter defaults
- to using the current locale settings, *unless* the current locale is
- identified as a legacy ASCII-based locale
- (as described for :envvar:`PYTHONCOERCECLOCALE`), and locale coercion is
- either disabled or fails. In such legacy locales, the interpreter will
- default to enabling UTF-8 mode unless explicitly instructed not to do so.
-
- Also available as the :option:`-X` ``utf8`` option.
-
  .. versionadded:: 3.7
- See :pep:`540` for more details.


  Debug-mode variables
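Reviewer sketch (not part of the commit): the ``PYTHONUTF8`` values described in the hunk above can be exercised by launching child interpreters with a modified environment. The names ``PROBE`` and ``run_with_pythonutf8`` are hypothetical. Removing ``PYTHONIOENCODING`` and ``PYTHONLEGACYWINDOWSFSENCODING`` from the child environment is an assumption made so that inherited settings do not mask the effect.

```python
import os
import subprocess
import sys

# One-line program run in the child: report the mode flag and two encodings.
PROBE = (
    "import sys; "
    "print(sys.flags.utf8_mode, sys.getfilesystemencoding(), sys.stdout.encoding)"
)


def run_with_pythonutf8(value: str) -> subprocess.CompletedProcess:
    """Run PROBE in a child interpreter with PYTHONUTF8 set to `value`."""
    env = os.environ.copy()
    # Assumption: drop variables that could override the encodings under test.
    env.pop("PYTHONIOENCODING", None)
    env.pop("PYTHONLEGACYWINDOWSFSENCODING", None)
    env["PYTHONUTF8"] = value
    return subprocess.run(
        [sys.executable, "-c", PROBE],
        env=env,
        capture_output=True,
        text=True,
    )


# PYTHONUTF8=1 enables the mode; the child reports UTF-8 for both encodings.
enabled = run_with_pythonutf8("1")
flag, fs_encoding, stdout_encoding = enabled.stdout.split()
print(flag, fs_encoding, stdout_encoding)

# Any non-empty value other than 0 or 1 aborts interpreter initialisation.
invalid = run_with_pythonutf8("2")
print(invalid.returncode)
```

The encoding names are compared case-insensitively in practice, since the removed text spells the value ``'UTF-8'`` while the interpreter may report a lowercase name.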
@ -614,21 +614,14 @@ Page). Python uses it for the default encoding of text files (e.g.
  This may cause issues because UTF-8 is widely used on the internet
  and most Unix systems, including WSL (Windows Subsystem for Linux).

- You can use UTF-8 mode to change the default text encoding to UTF-8.
- You can enable UTF-8 mode via the ``-X utf8`` command line option, or
- the ``PYTHONUTF8=1`` environment variable. See :envvar:`PYTHONUTF8` for
- enabling UTF-8 mode, and :ref:`setting-envvars` for how to modify
- environment variables.
+ You can use the :ref:`Python UTF-8 Mode <utf8-mode>` to change the default text
+ encoding to UTF-8. You can enable the :ref:`Python UTF-8 Mode <utf8-mode>` via
+ the ``-X utf8`` command line option, or the ``PYTHONUTF8=1`` environment
+ variable. See :envvar:`PYTHONUTF8` for enabling UTF-8 mode, and
+ :ref:`setting-envvars` for how to modify environment variables.

- When UTF-8 mode is enabled:
-
- * :func:`locale.getpreferredencoding` returns ``'UTF-8'`` instead of
- the system encoding. This function is used for the default text
- encoding in many places, including :func:`open`, :class:`Popen`,
- :meth:`Path.read_text`, etc.
- * :data:`sys.stdin`, :data:`sys.stdout`, and :data:`sys.stderr`
- all use UTF-8 as their text encoding.
- * You can still use the system encoding via the "mbcs" codec.
+ When the :ref:`Python UTF-8 Mode <utf8-mode>` is enabled, you can still use the
+ system encoding (the ANSI Code Page) via the "mbcs" codec.

  Note that adding ``PYTHONUTF8=1`` to the default environment variables
  will affect all Python 3.7+ applications on your system.
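Reviewer sketch (not part of the commit): the "mbcs" codec mentioned in the hunk above exists only on Windows, so portable code may want to probe for it before use. Both function names below are hypothetical, and the UTF-8 fallback on other platforms is an assumption chosen for this illustration, not behaviour described by the documentation.

```python
import codecs


def ansi_code_page_codec_available() -> bool:
    """Return True when the Windows-only "mbcs" codec can be looked up."""
    try:
        codecs.lookup("mbcs")
    except LookupError:
        return False
    return True


def encode_for_ansi_code_page(text: str) -> bytes:
    """Encode with the ANSI code page on Windows.

    Assumption for this sketch: fall back to UTF-8 where "mbcs" is missing.
    """
    if ansi_code_page_codec_available():
        # "replace" substitutes characters the code page cannot represent.
        return text.encode("mbcs", errors="replace")
    return text.encode("utf-8")


print(ansi_code_page_codec_available())
print(encode_for_ansi_code_page("abc"))
```

Naming the codec explicitly keeps working when UTF-8 Mode is enabled, because the mode changes defaults and does not remove the codec.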
@ -641,7 +634,8 @@ temporarily or use the ``-X utf8`` command line option.
  on Windows for:

  * Console I/O including standard I/O (see :pep:`528` for details).
- * The filesystem encoding (see :pep:`529` for details).
+ * The :term:`filesystem encoding <filesystem encoding and error handler>`
+ (see :pep:`529` for details).


  .. _launcher: