bpo-36443: Disable C locale coercion and UTF-8 Mode by default (GH-12589)

bpo-36443, bpo-36202: Since Python 3.7.0, calling Py_DecodeLocale()
before Py_Initialize() produces mojibake if the LC_CTYPE locale is
coerced and/or if the UTF-8 Mode is enabled by the user
configuration. This change fix the issue by disabling LC_CTYPE
coercion and UTF-8 Mode by default. They must now be enabled
explicitly (opt-in) using the new _Py_PreInitialize() API with
_PyPreConfig.

When embedding Python, set coerce_c_locale and utf8_mode attributes
of _PyPreConfig to -1 to enable automatically these parameters
depending on the LC_CTYPE locale, environment variables and command
line arguments

Alternative: Setting Py_UTF8Mode to 1 always explicitly enables the
UTF-8 Mode.

Changes:

* _PyPreConfig_INIT now sets coerce_c_locale and utf8_mode to 0 by
  default.
* _Py_InitializeFromArgs() and _Py_InitializeFromWideArgs() can now
  be called with config=NULL.
This commit is contained in:
Victor Stinner 2019-03-27 18:28:46 +01:00 committed by GitHub
parent 4a9a505d6f
commit d929f1838a
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
7 changed files with 58 additions and 46 deletions

View file

@ -386,7 +386,9 @@ _PyPreConfig_GetGlobalConfig(_PyPreConfig *config)
#ifdef MS_WINDOWS
COPY_FLAG(legacy_windows_fs_encoding, Py_LegacyWindowsFSEncodingFlag);
#endif
COPY_FLAG(utf8_mode, Py_UTF8Mode);
if (Py_UTF8Mode > 0) {
config->utf8_mode = 1;
}
#undef COPY_FLAG
#undef COPY_NOT_FLAG