gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091)

* PyUnicode_AsUTF8() now raises an exception if the string contains
  embedded null characters.
* Update related C API tests (test_capi.test_unicode).
* type_new_set_doc() uses PyUnicode_AsUTF8AndSize() to silently
  truncate doc containing null bytes.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
This commit is contained in:
Victor Stinner 2023-10-20 17:59:29 +02:00 committed by GitHub
parent 59ea0f523e
commit d731579bfb
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
8 changed files with 49 additions and 25 deletions

View file

@ -443,17 +443,15 @@ PyAPI_FUNC(PyObject*) PyUnicode_AsUTF8String(
PyObject *unicode /* Unicode object */
);
/* Returns a pointer to the default encoding (UTF-8) of the
Unicode object unicode and the size of the encoded representation
in bytes stored in *size.
In case of an error, no *size is set.
This function caches the UTF-8 encoded string in the unicodeobject
and subsequent calls will return the same string. The memory is released
when the unicodeobject is deallocated.
*/
// Returns a pointer to the default encoding (UTF-8) of the
// Unicode object unicode and the size of the encoded representation
// in bytes stored in `*size` (if size is not NULL).
//
// On error, `*size` is set to 0 (if size is not NULL).
//
// This function caches the UTF-8 encoded string in the Unicode object
// and subsequent calls will return the same string. The memory is released
// when the Unicode object is deallocated.
#if !defined(Py_LIMITED_API) || Py_LIMITED_API+0 >= 0x030A0000
PyAPI_FUNC(const char *) PyUnicode_AsUTF8AndSize(
PyObject *unicode,