bpo-34523: Support surrogatepass in locale codecs (GH-8995)

Add support for the "surrogatepass" error handler in
PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault()
for the UTF-8 encoding.
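
For illustration, here is a minimal sketch of what "surrogatepass" means when encoding to UTF-8: a lone surrogate code point (U+D800..U+DFFF) is rejected under "strict" but written as an ordinary three-byte sequence under "surrogatepass". The names sketch_error_handler and sketch_utf8_encode_char are hypothetical and exist only for this example; this is not the code of _Py_EncodeUTF8Ex().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enum for this sketch; not CPython's _Py_error_handler. */
typedef enum {
    SKETCH_ERROR_STRICT,
    SKETCH_ERROR_SURROGATEPASS
} sketch_error_handler;

/* Encode one code point as UTF-8 into out (room for at least 4 bytes).
   Returns the number of bytes written, or 0 if the code point cannot be
   encoded under the given handler. */
static size_t
sketch_utf8_encode_char(uint32_t ch, sketch_error_handler handler,
                        unsigned char *out)
{
    assert(out != NULL);
    if (ch < 0x80) {
        out[0] = (unsigned char)ch;
        return 1;
    }
    if (ch < 0x800) {
        out[0] = (unsigned char)(0xC0 | (ch >> 6));
        out[1] = (unsigned char)(0x80 | (ch & 0x3F));
        return 2;
    }
    if (ch < 0x10000) {
        /* Lone surrogates are an error unless surrogatepass is requested. */
        if (ch >= 0xD800 && ch <= 0xDFFF
            && handler != SKETCH_ERROR_SURROGATEPASS) {
            return 0;
        }
        out[0] = (unsigned char)(0xE0 | (ch >> 12));
        out[1] = (unsigned char)(0x80 | ((ch >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (ch & 0x3F));
        return 3;
    }
    if (ch <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (ch >> 18));
        out[1] = (unsigned char)(0x80 | ((ch >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((ch >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (ch & 0x3F));
        return 4;
    }
    return 0;  /* beyond the Unicode range */
}
```

Under these assumptions, U+DC80 yields no output with the strict handler and the bytes ED B2 80 with surrogatepass.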

Changes:

* _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the
  surrogatepass error handler (_Py_ERROR_SURROGATEPASS).
* _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use
  the _Py_error_handler enum instead of "int surrogateescape" to pass
  the error handler. These functions now return -3 if the error
  handler is unknown.
* Add unit tests for _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx()
  in test_codecs.
* Rename get_error_handler() to _Py_GetErrorHandler() and expose it
  as a private function.
* _freeze_importlib doesn't need config.filesystem_errors="strict"
  workaround anymore.
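
The enum-based interface described in the list above could look roughly like the following sketch: an error handler name is mapped to an enum value once, and a codec entry point returns -3 when it is handed a handler it does not support. All demo_* names are hypothetical, and the set of handlers accepted here is an assumption for illustration; the real functions may accept a different set depending on the codec path.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for this sketch; not CPython's _Py_error_handler. */
typedef enum {
    DEMO_ERROR_UNKNOWN = 0,
    DEMO_ERROR_STRICT,
    DEMO_ERROR_SURROGATEESCAPE,
    DEMO_ERROR_SURROGATEPASS,
    DEMO_ERROR_OTHER
} demo_error_handler;

/* Map an error handler name to an enum value.
   Assumption: NULL is treated as "strict". */
static demo_error_handler
demo_get_error_handler(const char *errors)
{
    if (errors == NULL || strcmp(errors, "strict") == 0) {
        return DEMO_ERROR_STRICT;
    }
    if (strcmp(errors, "surrogateescape") == 0) {
        return DEMO_ERROR_SURROGATEESCAPE;
    }
    if (strcmp(errors, "surrogatepass") == 0) {
        return DEMO_ERROR_SURROGATEPASS;
    }
    return DEMO_ERROR_OTHER;
}

/* Return 0 if this sketch's codec supports the handler,
   or -3 if the handler is unknown to it. */
static int
demo_locale_codec_check(demo_error_handler handler)
{
    switch (handler) {
    case DEMO_ERROR_STRICT:
    case DEMO_ERROR_SURROGATEESCAPE:
    case DEMO_ERROR_SURROGATEPASS:
        return 0;
    default:
        return -3;
    }
}
```

Passing an enum rather than an "int surrogateescape" flag lets a caller select more than two behaviours and lets the callee report an unsupported handler with a distinct return code.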
Victor Stinner 2018-08-29 22:21:32 +02:00 committed by GitHub
parent c5989cd876
commit 3d4226a832
7 changed files with 423 additions and 117 deletions


@@ -313,7 +313,7 @@ STRINGLIB(utf8_encoder)(PyObject *unicode,
 Py_ssize_t startpos, endpos, newpos;
 Py_ssize_t k;
 if (error_handler == _Py_ERROR_UNKNOWN) {
-    error_handler = get_error_handler(errors);
+    error_handler = _Py_GetErrorHandler(errors);
 }
 startpos = i-1;