bpo-38256: Fix binascii.crc32() when inputs are 4+GiB (GH-32000)

When compiled with `USE_ZLIB_CRC32` defined (`configure` sets this on POSIX systems), `binascii.crc32(...)` failed to compute the correct value when the input data was >= 4GiB, because the zlib crc32 API is limited to a 32-bit length.
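To illustrate the failure mode, here is a minimal Python sketch. It assumes the oversized length is simply truncated to 32 bits before it reaches zlib, and it shows that feeding the data in bounded chunks while carrying the running CRC gives the same result as a single call. The helper names `truncated_length` and `crc32_chunked` and the chunk sizes are hypothetical, chosen for illustration only; they are not part of this commit.

```python
import zlib

UINT_MAX = 0xFFFFFFFF  # largest length a 32-bit unsigned int can carry


def truncated_length(n: int) -> int:
    """Length a 32-bit API would see if n were cast down (illustrative assumption)."""
    return n & UINT_MAX


def crc32_chunked(data, value: int = 0, chunk_size: int = 1 << 20) -> int:
    """Compute CRC-32 by feeding bounded chunks and carrying the running value."""
    view = memoryview(data)
    for start in range(0, len(view), chunk_size):
        value = zlib.crc32(view[start:start + chunk_size], value)
    return value


# A 4 GiB + 5 byte input would look like a 5-byte input after truncation.
print(truncated_length(4 * 1024**3 + 5))

# 1 MiB sample: small enough to run anywhere, large enough to need many chunks.
payload = bytes(range(256)) * 4096
print(crc32_chunked(payload, chunk_size=65_521) == zlib.crc32(payload))
```

The sketch deliberately avoids allocating a real 4 GiB buffer; it only demonstrates the arithmetic and the chunking idea on a small input.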

This change brings it in line with the `zlib.crc32(...)` implementation, which does not have that flaw.
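A quick way to see the intended equivalence is to compare the two functions directly. This is only a sanity-check sketch on small inputs; the helper name `checksums_agree` is hypothetical, and reproducing the original bug would need a buffer of at least 4 GiB, which this does not attempt.

```python
import binascii
import zlib


def checksums_agree(data: bytes, value: int = 0) -> bool:
    """Return True when binascii.crc32 and zlib.crc32 give the same result."""
    return binascii.crc32(data, value) == zlib.crc32(data, value)


sample = b"123456789"  # conventional CRC-32 check input
print(hex(binascii.crc32(sample)))  # 0xcbf43926

# Both functions also accept a running value to continue an earlier checksum.
print(checksums_agree(sample))
print(checksums_agree(b"more data", binascii.crc32(sample)))
```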

**Performance:** This also adopts the same logic that `zlib.crc32` uses to release the GIL for larger inputs, and makes the Windows build always use zlib's crc32 instead of our slow C code, since zlib is a required build dependency on Windows.
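One practical consequence of releasing the GIL is that checksums of several large buffers can be computed from worker threads. The sketch below is an illustration under assumptions: the buffer sizes, worker count, and the helper name `crc32_in_threads` are hypothetical, and no speedup is claimed. On an interpreter without this change the code still runs correctly; the threads would simply take turns.

```python
import binascii
from concurrent.futures import ThreadPoolExecutor


def crc32_in_threads(buffers, max_workers: int = 4):
    """Checksum each buffer in a worker thread, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(binascii.crc32, buffers))


# Four 8 MiB buffers, assumed to be well above the small-buffer threshold.
buffers = [bytes([i]) * (8 * 1024 * 1024) for i in range(4)]

threaded = crc32_in_threads(buffers)
sequential = [binascii.crc32(b) for b in buffers]
print(threaded == sequential)
```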
Gregory P. Smith 2022-03-20 12:28:15 -07:00 committed by GitHub
parent 3ae975f1ac
commit 9d1c4d69db
6 changed files with 87 additions and 31 deletions

@@ -1420,7 +1420,7 @@ zlib_adler32_impl(PyObject *module, Py_buffer *data, unsigned int value)
 }
 /*[clinic input]
-zlib.crc32
+zlib.crc32 -> unsigned_int
     data: Py_buffer
     value: unsigned_int(bitwise=True) = 0
@@ -1432,9 +1432,9 @@ Compute a CRC-32 checksum of data.
 The returned checksum is an integer.
 [clinic start generated code]*/
-static PyObject *
+static unsigned int
 zlib_crc32_impl(PyObject *module, Py_buffer *data, unsigned int value)
-/*[clinic end generated code: output=63499fa20af7ea25 input=26c3ed430fa00b4c]*/
+/*[clinic end generated code: output=b217562e4fe6d6a6 input=1229cb2fb5ea948a]*/
 {
     /* Releasing the GIL for very small buffers is inefficient
        and may lower performance */
@@ -1455,7 +1455,7 @@ zlib_crc32_impl(PyObject *module, Py_buffer *data, unsigned int value)
     } else {
         value = crc32(value, data->buf, (unsigned int)data->len);
     }
-    return PyLong_FromUnsignedLong(value & 0xffffffffU);
+    return value;
 }