mirror of https://github.com/python/cpython.git (synced 2025-08-15 22:30:42 +00:00)
[3.12] gh-106931: Intern Statically Allocated Strings Globally (gh-107272) (gh-110713)
We tried this before with a dict and for all interned strings. That ran into problems due to interpreter isolation. However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning. Here we circle back to using a global cache, but only for statically allocated strings. We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.
Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter. Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
(cherry-picked from commit b72947a8d2)
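The lookup order described in the commit message can be sketched in Python as follows. This is only an illustration under stated assumptions, not the C implementation: `Str`, `Interpreter`, `GLOBAL_INTERNED`, and `intern_string` are hypothetical names, and a plain dict stands in for the `_Py_hashtable_t` global cache.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Str:
    """Hypothetical stand-in for a string object."""
    value: str
    statically_allocated: bool = False
    interned: bool = False


# Process-wide cache, used only for statically allocated strings.
# (Assumption: a dict stands in for the _Py_hashtable_t mentioned above.)
GLOBAL_INTERNED: dict[str, Str] = {}


@dataclass
class Interpreter:
    """Hypothetical stand-in for one interpreter's state."""
    interned: dict[str, Str] = field(default_factory=dict)


def intern_string(interp: Interpreter, s: Str) -> Str:
    # A static string already in the global cache is shared by every interpreter.
    cached = GLOBAL_INTERNED.get(s.value)
    if cached is not None:
        return cached
    if s.statically_allocated:
        # Static strings never get freed, so the global cache can hold them safely.
        s.interned = True
        GLOBAL_INTERNED[s.value] = s
        return s
    # A non-static string must not outlive its interpreter,
    # so it goes only into that interpreter's own cache.
    cached = interp.interned.setdefault(s.value, s)
    cached.interned = True
    return cached
```

In this sketch, two interpreters that intern an equal static string get the same object back, while equal non-static strings stay separate per interpreter.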
parent 60a08e6ff2
commit 4f71f1680d
11 changed files with 4324 additions and 4186 deletions
@@ -208,6 +208,7 @@ class Printer:
 self.write(".kind = 1,")
 self.write(".compact = 1,")
 self.write(".ascii = 1,")
+self.write(".statically_allocated = 1,")
 self.write(f"._data = {make_string_literal(s.encode('ascii'))},")
 return f"& {name}._ascii.ob_base"
 else:
@@ -220,6 +221,7 @@ class Printer:
 self.write(f".kind = {kind},")
 self.write(".compact = 1,")
 self.write(".ascii = 0,")
+self.write(".statically_allocated = 1,")
 utf8 = s.encode('utf-8')
 self.write(f'.utf8 = {make_string_literal(utf8)},')
 self.write(f'.utf8_length = {len(utf8)},')
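As a rough illustration of what the two hunks change, the sketch below emits the same state fields for an ASCII and a non-ASCII string. `MiniPrinter` and `write_state` are hypothetical names, not part of the real `Printer` class, and the way `kind` is derived here (bytes per code point) is an assumption, since the diff does not show how it is computed.

```python
class MiniPrinter:
    """Hypothetical, much-reduced stand-in for the Printer class in the diff."""

    def __init__(self) -> None:
        self.lines: list[str] = []

    def write(self, line: str) -> None:
        # Collect output lines instead of writing to a file.
        self.lines.append(line)

    def write_state(self, s: str) -> None:
        if s.isascii():
            self.write(".kind = 1,")
            self.write(".compact = 1,")
            self.write(".ascii = 1,")
        else:
            # Assumption: kind is the number of bytes per code point (1, 2 or 4).
            top = max(s)
            kind = 1 if top <= "\xff" else 2 if top <= "\uffff" else 4
            self.write(f".kind = {kind},")
            self.write(".compact = 1,")
            self.write(".ascii = 0,")
        # The field added by this commit: both branches now emit it.
        self.write(".statically_allocated = 1,")


p = MiniPrinter()
p.write_state("abc")
print("\n".join(p.lines))
```

Both branches end with the same new field, which mirrors the diff adding one identical line to each hunk.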