loops. Renamed DATA and BINDATA to DATA0 and DATA1. Included
disassemblies, but noted why we can't test them. Added XXX comment to
cPickle about a mysterious comment, where pickle and cPickle diverge
in how they number PUT indices.
Assorted code cleanups; e.g., sizeof(char) is 1 by definition, so there's
no need to do things like multiply by sizeof(char) in hairy malloc
arguments. Fixed an undetected-overflow bug in readline_file().
longobject.c: Fixed a really stupid bug in the new _PyLong_NumBits.
pickle.py: Fixed stupid bug in save_long(): When proto is 2, it
wrote LONG1 or LONG4, but forgot to return then -- it went on to
append the proto 1 LONG opcode too.
Fixed equally stupid cancelling bugs in load_long1() and
load_long4(): they *returned* the unpickled long instead of pushing
it on the stack. The return values were ignored. Tests passed
before only because save_long() pickled the long twice.
Fixed bugs in encode_long().
Noted that decode_long() is quadratic-time despite our hopes,
because long(string, 16) is still quadratic-time in len(string).
It's hex() that's linear-time. I don't know a way to make decode_long()
linear-time in Python, short of maybe transforming the 256's-complement
bytes into marshal's funky internal format, and letting marshal decode
that. It would be more valuable to make long(string, 16) linear time.
pickletester.py: Added a global "protocols" vector so tests can try
all the protocols in a sane way. Changed test_ints() and test_unicode()
to do so. Added a new test_long(), but the tail end of it is disabled
because it "takes forever" under pickle.py (but runs very quickly under
cPickle: cPickle proto 2 for longs is linear-time).
The attached patch enables shared extension
modules to build cleanly under Cygwin without
moving the static initialization of certain function
pointers (i.e., ones exported from the Python
DLL core) to a module initialization function.
Additionally, this patch fixes the modules that
have been changed in the past to accommodate
Cygwin.
The staticforward define was needed to support certain broken C
compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the
static keyword when it was used with a forward declaration of a static
initialized structure. Standard C allows the forward declaration with
static, and we've decided to stop catering to broken C compilers. (In
fact, we expect that the compilers are all fixed eight years later.)
I'm leaving staticforward and statichere defined in object.h as
static. This is only for backwards compatibility with C extensions
that might still use it.
XXX I haven't updated the documentation.
PyImport_ImportModule() is not guaranteed to return a module object.
When another type of object was returned, the PyModule_GetDict() call
return NULL and the subsequent GetItem() seg faulted.
Bug fix candidate.
don't understand how this function works, also beefed up the docs. The
most common usage error is of this form (often spread out across gotos):
if (_PyString_Resize(&s, n) < 0) {
Py_DECREF(s);
s = NULL;
goto outtahere;
}
The error is that if _PyString_Resize runs out of memory, it automatically
decrefs the input string object s (which also deallocates it, since its
refcount must be 1 upon entry), and sets s to NULL. So if the "if"
branch ever triggers, it's an error to call Py_DECREF(s): s is already
NULL! A correct way to write the above is the simpler (and intended)
if (_PyString_Resize(&s, n) < 0)
goto outtahere;
Bugfix candidate.
Change pickling format for bools to use a backwards compatible
encoding. This means you can pickle True or False on Python 2.3
and Python 2.2 or before will read it back as 1 or 0. The code
used for pickling bools before would create pickles that could
not be read in previous Python versions.
PEP 285. Everything described in the PEP is here, and there is even
some documentation. I had to fix 12 unit tests; all but one of these
were printing Boolean outcomes that changed from 0/1 to False/True.
(The exception is test_unicode.py, which did a type(x) == type(y)
style comparison. I could've fixed that with a single line using
issubtype(x, type(y)), but instead chose to be explicit about those
places where a bool is expected.
Still to do: perhaps more documentation; change standard library
modules to return False/True from predicates.
metaclass, reported by Dan Parisien.
Objects that are instances of custom metaclasses, i.e. whose ob_type
is a subclass of PyType_Type, should be pickled the same as new-style
classes (objects whose ob_type is PyType_Type). This can't be done
through the existing dispatch switches, and the __reduce__ trick
doesn't work for these, since it finds the unbound __reduce__ for
instances of the class (inherited from PyBaseObject_Type). So check
explicitly using PyType_IsSubtype().
type.__module__ behavior.
This adds the module name and a dot in front of the type name in every
type object initializer, except for built-in types (and those that
already had this). Note that it touches lots of Mac modules -- I have
no way to test these but the changes look right. Apologies if they're
not. This also touches the weakref docs, which contains a sample type
object initializer. It also touches the mmap test output, because the
mmap type's repr is included in that output. It touches object.h to
put the correct description in a comment.
find_class(): We no longer mask all exceptions[1] by transforming them
into SystemError. The latter is definitely not the right thing to do,
so we let any exceptions that occur in the PyObject_GetAttr() call to
simply propagate up if they occur.
[1] Note that pickle only masked ImportError, KeyError, and
AttributeError, but cPickle masked all exceptions.
Raise ValueError when an object contains an arbitrarily nested
reference to itself. (The previous fix just produced invalid
pickles.)
Solution is very much like Py_ReprEnter() and Py_ReprLeave():
fast_save_enter() and fast_save_leave() that tracks the fast_container
limit and keeps a fast_memo of objects currently being pickled.
The cost of the solution is moderately expensive for deeply nested
structures, but it still seems to be faster than normal pickling,
based on tests with deeply nested lists.
Once FAST_LIMIT is exceeded, the new code is about twice as slow as
fast-mode code that doesn't check for recursion. It's still twice as
fast as the normal pickling code. In the absence of deeply nested
structures, I couldn't measure a difference.
Add a fast_container member to Picklerobject. If fast is true, then
fast_container counts the depth of nested container calls. If the
depth exceeds FAST_LIMIT (2000), the fast flag is ignored and the
normal checks occur. This approach is much like the approach for
prevent stack overflow for comparison and reprs of recursive objects
(e.g. [[...]]).
- Fast container used for save_list(), save_dict(), and
save_inst().
XXX Not clear which other save_xxx() functions should use it.
Make Picklerobject into new-style types, using PyObject_GenericGetAttr()
and PyObject_GenericSetAttr().
- Use PyMemberDef for binary and fast members
- Use PyGetSetDef for persistent_id, inst_persistent_id, memo, and
PicklingError.
XXX Not all of these seem like they need to use getset, but it's
not clear why the old getattr() and setattr() had such odd
semantics. One change is that the getvalue() attribute will
exist on all Picklers, not just list-based picklers; I think
this is a more rationale interface.
There is a long laundry list of other changes:
- Remove unused #defines for PyList_SET_ITEM() etc.
- Make some of the indentation consistent
- Replace uses of cPickle_PyMapping_HasKey() where the first
argument is self->memo with calls to PyDict_GetItem(), because
self->memo must be a dictionary.
- Don't bother to check if cPickle_PyMapping_HasKey() returns < 0,
because it can only return 0 or 1.
- Replace uses of PyObject_CallObject() with PyObject_Call(), when
we can guarantee that the argument tuple is really a tuple.
Performance impacts of these changes:
- 5% speedup for normal pickling
- No change to fast-mode pickling.
XXX Really need tests for all the features in cPickle that aren't in
pickle.