mirror of
https://github.com/python/cpython.git
synced 2025-08-31 14:07:50 +00:00
bpo-43510: Implement PEP 597 opt-in EncodingWarning. (GH-19481)
See [PEP 597](https://www.python.org/dev/peps/pep-0597/).

* Add `-X warn_default_encoding` and `PYTHONWARNDEFAULTENCODING`.
* Add `EncodingWarning`.
* Add `io.text_encoding()`.
* `open()` and `TextIOWrapper()` emit `EncodingWarning` when the encoding is omitted and `warn_default_encoding` is enabled.
* `_pyio.TextIOWrapper()` falls back to UTF-8 as the default encoding when the `locale` module cannot be imported (this happens while building Python).
* The bz2, configparser, gzip, lzma, pathlib, and tempfile modules use `io.text_encoding()`.
* Add a What's New entry.
This commit is contained in:
parent
261a452a13
commit
4827483f47
32 changed files with 366 additions and 18 deletions
|
@ -741,6 +741,15 @@ The following exceptions are used as warning categories; see the
   Base class for warnings related to Unicode.


.. exception:: EncodingWarning

   Base class for warnings related to encodings.

   See :ref:`io-encoding-warning` for details.

   .. versionadded:: 3.10


.. exception:: BytesWarning

   Base class for warnings related to :class:`bytes` and :class:`bytearray`.

@ -106,6 +106,56 @@ stream by opening a file in binary mode with buffering disabled::
The raw stream API is described in detail in the docs of :class:`RawIOBase`.


.. _io-text-encoding:

Text Encoding
-------------

The default encoding of :class:`TextIOWrapper` and :func:`open` is
locale-specific (:func:`locale.getpreferredencoding(False) <locale.getpreferredencoding>`).

However, many developers forget to specify the encoding when opening text files
encoded in UTF-8 (e.g. JSON, TOML, Markdown, etc.) since most Unix
platforms use a UTF-8 locale by default. This causes bugs because the locale
encoding is not UTF-8 for most Windows users. For example::

   # May not work on Windows when the file contains non-ASCII characters.
   with open("README.md") as f:
       long_description = f.read()

Additionally, while there is no concrete plan as of yet, Python may change
the default text file encoding to UTF-8 in the future.

Accordingly, it is highly recommended that you specify the encoding
explicitly when opening text files. If you want to use UTF-8, pass
``encoding="utf-8"``. To use the current locale encoding,
``encoding="locale"`` is supported since Python 3.10.

When you need to run existing code on Windows that attempts to open
UTF-8 files using the default locale encoding, you can enable the UTF-8
mode. See :ref:`UTF-8 mode on Windows <win-utf8-mode>`.

.. _io-encoding-warning:

Opt-in EncodingWarning
^^^^^^^^^^^^^^^^^^^^^^

.. versionadded:: 3.10
   See :pep:`597` for more details.

To find where the default locale encoding is used, you can enable
the ``-X warn_default_encoding`` command line option or set the
:envvar:`PYTHONWARNDEFAULTENCODING` environment variable, which will
emit an :exc:`EncodingWarning` when the default encoding is used.

If you are providing an API that uses :func:`open` or
:class:`TextIOWrapper` and passes ``encoding=None`` as a parameter, you
can use :func:`text_encoding` so that callers of the API will emit an
:exc:`EncodingWarning` if they don't pass an ``encoding``. However,
please consider using UTF-8 by default (i.e. ``encoding="utf-8"``) for
new APIs.


High-level Module Interface
---------------------------

@ -143,6 +193,32 @@ High-level Module Interface
   .. versionadded:: 3.8


.. function:: text_encoding(encoding, stacklevel=2)

   This is a helper function for callables that use :func:`open` or
   :class:`TextIOWrapper` and have an ``encoding=None`` parameter.

   This function returns *encoding* if it is not ``None`` and ``"locale"`` if
   *encoding* is ``None``.

   This function emits an :class:`EncodingWarning` if
   :data:`sys.flags.warn_default_encoding <sys.flags>` is true and *encoding*
   is ``None``. *stacklevel* specifies where the warning is emitted.
   For example::

      def read_text(path, encoding=None):
          encoding = io.text_encoding(encoding)  # stacklevel=2
          with open(path, encoding=encoding) as f:
              return f.read()

   In this example, an :class:`EncodingWarning` is emitted for the caller of
   ``read_text()``.

   See :ref:`io-text-encoding` for more information.

   .. versionadded:: 3.10


.. exception:: BlockingIOError

   This is a compatibility alias for the builtin :exc:`BlockingIOError`
@ -869,6 +945,8 @@ Text I/O
   *encoding* gives the name of the encoding that the stream will be decoded or
   encoded with. It defaults to
   :func:`locale.getpreferredencoding(False) <locale.getpreferredencoding>`.
   ``encoding="locale"`` can be used to specify the current locale's encoding
   explicitly. See :ref:`io-text-encoding` for more information.

   *errors* is an optional string that specifies how encoding and decoding
   errors are to be handled. Pass ``'strict'`` to raise a :exc:`ValueError`
@ -920,6 +998,9 @@ Text I/O
      locale encoding using :func:`locale.setlocale`, use the current locale
      encoding instead of the user preferred encoding.

   .. versionchanged:: 3.10
      The *encoding* argument now supports the ``"locale"`` dummy encoding name.

   :class:`TextIOWrapper` provides these data attributes and methods in
   addition to those from :class:`TextIOBase` and :class:`IOBase`:
