[3.12] gh-118164: Break a loop between _pydecimal and _pylong and optimize int to str conversion (GH-118483) (GH-118590)

For converting large ints to strings, CPython invokes a function in _pylong.py,
which uses the decimal module to implement an asymptotically waaaaay
sub-quadratic algorithm. But if the C decimal module isn't available, CPython
uses _pydecimal.py instead. Which in turn frequently does str(int). If the int
is very large, _pylong ends up doing the work, which in turn asks decimal to do
"big" arithmetic, which in turn calls str(big_int), which in turn ... it can
become infinite mutual recursion.

This change introduces a different int->str function that doesn't use decimal.
It's asymptotically worse, "Karatsuba time" instead of quadratic time, so
still a huge improvement. _pylong switches to that when the C decimal isn't
available. It is also used for not too large integers (less than 450_000 bits),
where it is faster (up to 2 times for 30_000 bits) than the asymptotically
better implementation that uses the C decimal.

(cherry picked from commit 711c80bfca)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
Co-authored-by: Tim Peters <tim.peters@gmail.com>
This commit is contained in:
Miss Islington (bot) 2024-05-06 11:10:05 +02:00 committed by GitHub
parent a81fe2a8f8
commit a4812fd8f7
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
3 changed files with 70 additions and 11 deletions

View file

@ -834,17 +834,28 @@ class PyLongModuleTests(unittest.TestCase):
sys.set_int_max_str_digits(self._previous_limit)
super().tearDown()
def test_pylong_int_to_decimal(self):
n = (1 << 100_000) - 1
suffix = '9883109375'
def _test_pylong_int_to_decimal(self, n, suffix):
s = str(n)
assert s[-10:] == suffix
s = str(-n)
assert s[-10:] == suffix
s = '%d' % n
assert s[-10:] == suffix
s = b'%d' % n
assert s[-10:] == suffix.encode('ascii')
self.assertEqual(s[-10:], suffix)
s2 = str(-n)
self.assertEqual(s2, '-' + s)
s3 = '%d' % n
self.assertEqual(s3, s)
s4 = b'%d' % n
self.assertEqual(s4, s.encode('ascii'))
def test_pylong_int_to_decimal(self):
self._test_pylong_int_to_decimal((1 << 100_000), '9883109376')
self._test_pylong_int_to_decimal((1 << 100_000) - 1, '9883109375')
self._test_pylong_int_to_decimal(10**30_000, '0000000000')
self._test_pylong_int_to_decimal(10**30_000 - 1, '9999999999')
self._test_pylong_int_to_decimal(3**60_000, '9313200001')
@support.requires_resource('cpu')
def test_pylong_int_to_decimal_2(self):
self._test_pylong_int_to_decimal(2**1_000_000, '2747109376')
self._test_pylong_int_to_decimal(10**300_000, '0000000000')
self._test_pylong_int_to_decimal(3**600_000, '3132000001')
def test_pylong_int_divmod(self):
n = (1 << 100_000)