mirror of
https://github.com/python/cpython.git
synced 2025-12-04 00:30:19 +00:00
Get rid of the superstitious "~" in dict hashing's "i = (~hash) & mask".
The comment following used to say:
/* We use ~hash instead of hash, as degenerate hash functions, such
as for ints <sigh>, can have lots of leading zeros. It's not
really a performance risk, but better safe than sorry.
12-Dec-00 tim: so ~hash produces lots of leading ones instead --
what's the gain? */
That is, there was never a good reason for doing it. And to the contrary,
as explained on Python-Dev last December, it tended to make the *sum*
(i + incr) & mask (which is the first table index examined in case of
collision) the same "too often" across distinct hashes.
Changing to the simpler "i = hash & mask" reduced the number of string-dict
collisions (== number of times we go around the lookup for-loop) from about
6 million to 5 million during a full run of the test suite (these are
approximate because the test suite does some random stuff from run to run).
The number of collisions in non-string dicts also decreased, but not as
dramatically.
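A minimal sketch of why the "~" bought nothing for the first probe, written in Python 3 rather than the C of dictobject. It assumes the table size is a power of two, so that mask is 2**k - 1; the helper name first_slot and the 8-slot table are hypothetical. Under that assumption (~hash) & mask equals mask - (hash & mask), so the old expression only mirrors the slot order and spreads nothing out. The (i + incr) & mask effect described above is not modeled here.

```python
def first_slot(h, mask, invert=False):
    """Return the first table index probed for hash value h.

    mask is assumed to be 2**k - 1 (table size minus one).
    invert=True models the old "(~hash) & mask";
    invert=False models the new "hash & mask".
    """
    return (~h if invert else h) & mask


MASK = 7  # hypothetical 8-slot table

plain = [first_slot(h, MASK) for h in range(8)]
inverted = [first_slot(h, MASK, invert=True) for h in range(8)]

print(plain)     # [0, 1, 2, 3, 4, 5, 6, 7]
print(inverted)  # [7, 6, 5, 4, 3, 2, 1, 0]
```

Both schemes map eight consecutive small ints onto eight distinct slots; the inverted one just fills the table from the top down.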
Note that this may, for a given dict, change the order (wrt previous
releases) of entries exposed by .keys(), .values() and .items(). A number
of std tests suffered bogus failures as a result. For dicts keyed by
small ints, or (less so) by characters, the order is much more likely to be
in increasing order of key now; e.g.,
>>> d = {}
>>> for i in range(10):
... d[i] = i
...
>>> d
{0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9}
>>>
Unfortunately, people may latch on to that in small examples and draw a
bogus conclusion.
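The ordering effect can be imitated with a toy open-addressing table in Python 3. This is only a sketch under stated assumptions: the 16-slot size, the slot_order name, and the linear-probe fallback are all hypothetical stand-ins (the real collision handling uses a hash-derived increment, which is not reproduced here), and it relies on CPython's hash(i) == i for small non-negative ints. Keys are assumed distinct and fewer than the table size.

```python
def slot_order(keys, size=16, invert=False):
    """Insert distinct keys into a toy table and list them in slot order.

    invert=True models the old "(~hash) & mask" first probe,
    invert=False the new "hash & mask".
    """
    mask = size - 1
    table = [None] * size
    for key in keys:
        h = hash(key)
        i = (~h if invert else h) & mask
        # Linear probing is a simplification, not the real probe sequence.
        while table[i] is not None:
            i = (i + 1) & mask
        table[i] = key
    # Iterating a dict of that era amounted to walking the slots in order.
    return [k for k in table if k is not None]


print(slot_order(range(10)))               # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(slot_order(range(10), invert=True))  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(slot_order([5, 3, 9, 0]))            # [0, 3, 5, 9]
```

The last call shows the trap: the keys come back sorted no matter how they were inserted, but only because small ints happen to land in their own slots.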
test_support.py
Moved test_extcall's sortdict() into test_support, made it stronger,
and imported sortdict into other std tests that needed it.
test_unicode.py
Excluded cp875 from the "roundtrip over range(128)" test, because
cp875 doesn't have a well-defined inverse for unicode("?", "cp875").
See Python-Dev for excruciating details.
Cookie.py
Changed various output functions to sort dicts before building
strings from them.
test_extcall
Fiddled the expected-result file. This remains sensitive to native
dict ordering, because, e.g., if there are multiple errors in a
keyword-arg dict (and test_extcall sets up many cases like that), the
specific error Python complains about first depends on native dict
ordering.
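For readers on current Python, here is a rough Python 3 port of the sortdict() helper that the diff removes from test_extcall. It mirrors only the old helper; the "stronger" version that landed in test_support is not shown on this page, so nothing here should be read as that implementation. It assumes the keys are mutually comparable.

```python
def sortdict(d):
    """Like repr(d), but with the entries sorted by key."""
    pairs = ["%r: %r" % (k, d[k]) for k in sorted(d)]
    return "{%s}" % ", ".join(pairs)


print(sortdict({"b": 2, "a": 1}))  # {'a': 1, 'b': 2}
```

Printing through a helper like this keeps expected-output files stable regardless of how the dict happens to order its entries.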
parent 0194ad5c7d
commit 2f228e75e4
11 changed files with 64 additions and 46 deletions
@@ -1,14 +1,6 @@
-from test_support import verify, verbose, TestFailed
+from test_support import verify, verbose, TestFailed, sortdict
 from UserList import UserList
 
-def sortdict(d):
-    keys = d.keys()
-    keys.sort()
-    lst = []
-    for k in keys:
-        lst.append("%r: %r" % (k, d[k]))
-    return "{%s}" % ", ".join(lst)
-
 def f(*a, **k):
     print a, sortdict(k)
 
@@ -228,8 +220,9 @@ for args in ['', 'a', 'ab']:
                     lambda x: '%s="%s"' % (x, x), defargs)
                 if vararg: arglist.append('*' + vararg)
                 if kwarg: arglist.append('**' + kwarg)
-                decl = 'def %s(%s): print "ok %s", a, b, d, e, v, k' % (
-                    name, ', '.join(arglist), name)
+                decl = (('def %s(%s): print "ok %s", a, b, d, e, v, ' +
+                         'type(k) is type ("") and k or sortdict(k)')
+                        % (name, ', '.join(arglist), name))
                 exec(decl)
                 func = eval(name)
                 funcs.append(func)