cpython/Python
Victor Stinner e662c398d8
bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086)
If the nl_langinfo(CODESET) function returns an empty string, Python
now uses UTF-8 as the filesystem encoding.

In May 2010 (commit b744ba1d14), I
modified Python to log a warning and use UTF-8 as the filesystem
encoding (instead of None) if nl_langinfo(CODESET) returns an empty
string.

In August 2020 (commit 94908bbc15), I
modified Python startup to fail with a fatal error and a specific
error message if nl_langinfo(CODESET) returns an empty string. The
intent was to avoid guessing the encoding and to investigate the user
configurations in which this case occurs.

In 10 years (2010 to 2020), I saw zero user reports about the error
message related to nl_langinfo(CODESET) returning an empty string.

Today, UTF-8 has become the de facto standard and it is safe to
assume that the user expects UTF-8. For example,
nl_langinfo(CODESET) can return an empty string on macOS if the
LC_CTYPE locale is not supported, and UTF-8 is the default encoding
on macOS.

While this change is unlikely to affect anyone in practice, it
should make UTF-8 lovers happy ;-)

Also rewrite the documentation explaining how Python selects the
filesystem encoding and error handler.
2020-11-01 23:07:23 +01:00
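
The fallback described in the commit message above can be illustrated with a minimal Python sketch. It is not the C code changed by this commit: select_filesystem_encoding() is a hypothetical helper introduced only for illustration, and it ignores everything else that influences the real choice (for example, platform-specific rules). It assumes only that an empty string from nl_langinfo(CODESET) should map to UTF-8.

```python
import locale
import sys


def select_filesystem_encoding(codeset: str) -> str:
    """Hypothetical helper: pick an encoding from an nl_langinfo(CODESET) value.

    `codeset` stands in for the string returned by nl_langinfo(CODESET).
    """
    # An empty string means the C library could not name the locale
    # encoding; fall back to UTF-8 instead of failing at startup.
    if not codeset:
        return "utf-8"
    return codeset


if __name__ == "__main__":
    # locale.nl_langinfo() and locale.CODESET are not available everywhere
    # (for example, not on Windows), so guard the lookup.
    if hasattr(locale, "nl_langinfo") and hasattr(locale, "CODESET"):
        reported = locale.nl_langinfo(locale.CODESET)
    else:
        reported = ""
    print("nl_langinfo(CODESET):", repr(reported))
    print("sketch selects:", select_filesystem_encoding(reported))
    print("interpreter uses:", sys.getfilesystemencoding())
```

Running the script prints the value reported by the C library next to the encoding the interpreter actually selected, which can help when checking an unusual locale configuration. The two may legitimately differ, since the interpreter applies more rules than this sketch does.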
clinic bpo-40471: Fix grammar typo in 'issubclass' docstring (GH-19847) 2020-06-03 06:19:45 -07:00
_warnings.c bpo-42161: Use _PyLong_GetZero() and _PyLong_GetOne() (GH-22995) 2020-10-27 02:24:34 +01:00
asdl.c bpo-41746: Add type information to asdl_seq objects (GH-22223) 2020-09-16 19:42:00 +01:00
ast.c bpo-42000: Cleanup the AST related C-code (GH-22641) 2020-10-10 10:14:59 -07:00
ast_opt.c bpo-38605: Make 'from __future__ import annotations' the default (GH-20434) 2020-10-06 13:03:02 -07:00
ast_unparse.c bpo-41746: Add type information to asdl_seq objects (GH-22223) 2020-09-16 19:42:00 +01:00
bltinmodule.c bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986) 2020-10-26 12:47:57 +02:00
bootstrap_hash.c bpo-40910: Export Py_GetArgcArgv() function (GH-20721) 2020-06-08 18:12:59 +02:00
ceval.c bpo-42099: Fix reference to ob_type in unionobject.c and ceval (GH-22829) 2020-10-27 18:55:52 +00:00
ceval_gil.h
codecs.c bpo-42157: unicodedata avoids references to UCD_Type (GH-22990) 2020-10-26 19:19:36 +01:00
compile.c bpo-42161: Use _PyLong_GetZero() and _PyLong_GetOne() (GH-22995) 2020-10-27 02:24:34 +01:00
condvar.h
context.c bpo-40521: Fix _PyContext_Fini() (GH-21103) 2020-06-24 03:21:15 +02:00
dtoa.c bpo-40780: Fix failure of _Py_dg_dtoa to remove trailing zeros (GH-20435) 2020-05-29 14:23:57 +01:00
dup2.c
dynamic_annotations.c
dynload_hpux.c bpo-41894: Fix UnicodeDecodeError while loading native module (GH-22466) 2020-10-15 10:53:27 +09:00
dynload_shlib.c bpo-41894: Fix UnicodeDecodeError while loading native module (GH-22466) 2020-10-15 10:53:27 +09:00
dynload_stub.c
dynload_win.c bpo-36346: Make using the legacy Unicode C API optional (GH-21437) 2020-07-10 23:26:06 +03:00
errors.c bpo-42152: Use PyDict_Contains and PyDict_SetDefault if appropriate. (GH-22986) 2020-10-26 12:47:57 +02:00
fileutils.c bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086) 2020-11-01 23:07:23 +01:00
formatter_unicode.c bpo-41681: Fix for f-string/str.format error description when using 2 , in format specifier (GH-22036) 2020-09-01 10:34:29 -04:00
frozen.c
frozenmain.c
future.c bpo-38605: Make 'from __future__ import annotations' the default (GH-20434) 2020-10-06 13:03:02 -07:00
getargs.c bpo-41078: Rename pycore_tupleobject.h to pycore_tuple.h (GH-21056) 2020-06-22 17:27:35 +02:00
getcompiler.c
getcopyright.c
getopt.c bpo-40527: Fix command line argument parsing (GH-19955) 2020-05-06 22:22:17 +09:00
getplatform.c
getversion.c
hamt.c bpo-29882: Add _Py_popcount32() function (GH-20518) 2020-06-08 16:30:33 +02:00
hashtable.c bpo-41061: Fix incorrect expressions in hashtable (GH-21028) 2020-06-22 00:41:48 -07:00
import.c bpo-42208: Move _PyImport_Cleanup() to pylifecycle.c (GH-23040) 2020-10-30 18:03:28 +01:00
importdl.c
importdl.h
importlib.h bpo-41323: Perform 'peephole' optimizations directly on the CFG. (GH-21517) 2020-07-30 10:03:00 +01:00
importlib_external.h bpo-38605: bump the magic number for 'annotations' future (#22630) 2020-10-10 15:19:46 -07:00
importlib_zipimport.h bpo-41323: Perform 'peephole' optimizations directly on the CFG. (GH-21517) 2020-07-30 10:03:00 +01:00
initconfig.c bpo-42236: Use UTF-8 encoding if nl_langinfo(CODESET) fails (GH-23086) 2020-11-01 23:07:23 +01:00
makeopcodetargets.py
marshal.c bpo-1635741: Port mashal module to multi-phase init (#22149) 2020-09-08 15:33:52 +02:00
modsupport.c closes bpo-41533: Fix a potential memory leak when allocating a stack (GH-21847) 2020-08-29 23:53:08 -05:00
mysnprintf.c bpo-36020: Require vsnprintf() to build Python (GH-20899) 2020-06-16 00:54:44 +02:00
mystrtoul.c
opcode_targets.h
pathconfig.c bpo-29778: test_embed tests the path configuration (GH-21306) 2020-07-08 00:20:37 +02:00
preconfig.c _PyPreConfig_Read() decodes argv at each iteration (GH-20786) 2020-06-10 19:33:11 +02:00
pyarena.c
pyctype.c
pyfpe.c
pyhash.c bpo-40943: Replace PY_FORMAT_SIZE_T with "z" (GH-20781) 2020-06-10 18:38:05 +02:00
pylifecycle.c bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044) 2020-10-30 22:51:02 +01:00
pymath.c bpo-29782: Consolidate _Py_Bit_Length() (GH-20739) 2020-06-15 14:33:48 +02:00
pystate.c bpo-42208: Call GC collect earlier in PyInterpreterState_Clear() (GH-23044) 2020-10-30 22:51:02 +01:00
pystrcmp.c bpo-41524: fix pointer bug in PyOS_mystr{n}icmp (GH-21845) 2020-08-27 14:45:25 +09:00
pystrhex.c
pystrtod.c
Python-ast.c bpo-41746: Add type information to asdl_seq objects (GH-22223) 2020-09-16 19:42:00 +01:00
pythonrun.c bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648) 2020-10-26 08:43:39 +02:00
pytime.c bpo-40650: Include winsock2.h in pytime.c, instead of a full windows.h (GH-20137) 2020-05-18 17:22:53 +01:00
README
structmember.c
symtable.c bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648) 2020-10-26 08:43:39 +02:00
sysmodule.c bpo-42006: Stop using PyDict_GetItem, PyDict_GetItemString and _PyDict_GetItemId. (GH-22648) 2020-10-26 08:43:39 +02:00
thread.c
thread_nt.h
thread_pthread.h Fix -Wstrict-prototypes warning in thread_pthread.h. (GH-21477) 2020-07-15 08:12:05 -05:00
traceback.c
wordcode_helpers.h

Miscellaneous source files for the main Python shared library