784 commits

Author SHA1 Message Date
Guido van Rossum
fd71b9e9d4 Change copyright notice. 2000-06-30 23:50:40 +00:00
Guido van Rossum
9a15c211cf Fix an error on AIX by using a proper cast. 2000-06-30 22:46:04 +00:00
Fred Drake
a44d353e2b Trent Mick <trentm@activestate.com>:
The common technique for printing out a pointer has been to cast to a long
and use the "%lx" printf modifier. This is incorrect on Win64 where casting
to a long truncates the pointer. The "%p" formatter should be used instead.

The problem as stated by Tim:
> Unfortunately, the C committee refused to define what %p conversion "looks
> like" -- they explicitly allowed it to be implementation-defined. Older
> versions of Microsoft C even stuck a colon in the middle of the address (in
> the days of segment+offset addressing)!

The result is that the hex value of a pointer will maybe/maybe not have a 0x
prepended to it.


Notes on the patch:

There are two main classes of changes:
- in the various repr() functions that print out pointers
- debugging printf's in the various thread_*.h files (these are why the
patch is large)


Closes SourceForge patch #100505.
2000-06-30 15:01:00 +00:00
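The truncation that commit a44d353e2b describes can be modelled in a few lines. This is only a sketch: `cast_to_long` and the sample address are hypothetical, and the cast is modelled as an unsigned bit mask rather than a real C cast.

```python
def cast_to_long(pointer_value: int, long_bits: int) -> int:
    """Model casting a pointer to an unsigned long that is long_bits wide."""
    return pointer_value & ((1 << long_bits) - 1)


# Hypothetical 64-bit address, chosen only for illustration.
address = 0x000007FF_FFFE1234

# On Win64 a long is 32 bits wide, so the upper half of the pointer is lost.
win64_long = cast_to_long(address, 32)

# On typical 64-bit Unix systems a long is 64 bits wide, so nothing is lost.
lp64_long = cast_to_long(address, 64)

print(f"{win64_long:x}")  # truncated value
print(f"{lp64_long:x}")   # full value
```

Printing with "%p" avoids the cast entirely, at the cost of an implementation-defined output format.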
Marc-André Lemburg
d49e5b4667 Marc-Andre Lemburg <mal@lemburg.com>:
A previous patch by Jack Jansen was accidentally reverted.
2000-06-30 14:58:20 +00:00
Marc-André Lemburg
f28dd83b86 Marc-Andre Lemburg <mal@lemburg.com>:
New buffer overflow checks for formatting strings.

By Trent Mick.
2000-06-30 10:29:57 +00:00
Jeremy Hylton
c5007aa5c3 final patches from Neil Schemenauer for garbage collection 2000-06-30 05:02:53 +00:00
Fred Drake
13634cf7a4 This patch addresses two main issues: (1) There exist some non-fatal
errors in some of the hash algorithms. For example, in float_hash and
complex_hash a certain part of the value is not included in the hash
calculation. See Tim's, Guido's, and my discussion of this on
python-dev in May under the title "fix float_hash and complex_hash for
64-bit *nix"

(2) The hash algorithms that use pointers (e.g. func_hash, code_hash)
are universally not correct on Win64 (they assume that sizeof(long) ==
sizeof(void*))

As well, this patch significantly cleans up the hash code. It adds the
two functions _Py_HashDouble and _PyHash_VoidPtr that the various
hashing routines are changed to use.

These help maintain the hash function invariant: (a==b) =>
(hash(a)==hash(b)). I have added Lib/test/test_hash.py and
Lib/test/output/test_hash to test this for some cases.
2000-06-29 19:17:04 +00:00
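The invariant named in commit 13634cf7a4, (a==b) => (hash(a)==hash(b)), can be checked directly from Python. The sketch below is not the content of Lib/test/test_hash.py; the helper `hash_invariant_violations` and the sample values are assumptions made for illustration.

```python
from itertools import combinations


def hash_invariant_violations(values):
    """Return every pair (a, b) with a == b but hash(a) != hash(b)."""
    return [
        (a, b)
        for a, b in combinations(values, 2)
        if a == b and hash(a) != hash(b)
    ]


# Equal numbers expressed as int, float, and complex must hash alike.
samples = [
    1, 1.0, complex(1, 0),
    0, 0.0, -0.0,
    2**40, float(2**40),
    0.5, complex(0.5, 0),
]

violations = hash_invariant_violations(samples)
print(violations)
```

An empty list means no pair of equal values was found with differing hashes.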
Guido van Rossum
4f4b799b33 Jack Jansen: Use include "" instead of <>; and staticforward declarations 2000-06-29 00:06:39 +00:00
Guido van Rossum
d7823f2645 Vladimir Marangozov:
Avoid calling the dealloc function, previously triggered with
DECREF(inst).  This caused a segfault in PyDict_GetItem, called with a
NULL dict, whenever inst->in_dict fails under low-memory conditions.
2000-06-28 23:46:07 +00:00
Guido van Rossum
ad89bbcd88 Trent Mick: change a few casts for Win64 compatibility. 2000-06-28 21:57:18 +00:00
Guido van Rossum
eceebb87d9 Jack Jansen: Moved includes to the top, removed think C support 2000-06-28 20:57:07 +00:00
Marc-André Lemburg
0f774e3987 Marc-Andre Lemburg <mal@lemburg.com>:
Patch to the standard unicode-escape codec which dynamically
loads the Unicode name to ordinal mapping from the module
ucnhash.

By Bill Tutt.
2000-06-28 16:43:35 +00:00
Marc-André Lemburg
7c014684c2 Marc-Andre Lemburg <mal@lemburg.com>:
Better error message for "1 in unicodestring". Submitted
by Andrew Kuchling.
2000-06-28 08:11:47 +00:00
Jeremy Hylton
d08b4c4524 part 2 of Neil Schemenauer's GC patches:
This patch modifies the type structures of objects that
participate in GC.  The object's tp_basicsize is increased when
GC is enabled.  GC information is prefixed to the object to
maintain binary compatibility.  GC objects also define the
tp_flag Py_TPFLAGS_GC.
2000-06-23 19:37:02 +00:00
Jeremy Hylton
d22162bac7 traverse functions should return 0 on success 2000-06-23 17:14:56 +00:00
Jeremy Hylton
99a8f90874 raise TypeError when PyObject_Get/SetAttr called with non-string name 2000-06-23 14:36:32 +00:00
Jeremy Hylton
8caad49c30 Round 1 of Neil Schemenauer's GC patches:
This patch adds the type methods traverse and clear necessary for GC
implementation.
2000-06-23 14:18:11 +00:00
Fred Drake
396f6e0d6a Fredrik Lundh <effbot@telia.com>:
Simplify find code; this is a performance improvement on at least some
platforms.
2000-06-20 15:47:54 +00:00
Marc-André Lemburg
49ef6dc1f4 Marc-Andre Lemburg <mal@lemburg.com>:
Fixed a bug in PyUnicode_Count() which would have caused a
core dump in case of substring coercion failure.

Synchronized .count() with the string method of the same name
to return len(s)+1 for s.count('').
2000-06-18 22:25:22 +00:00
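The `.count('')` rule from commit 49ef6dc1f4 is still observable in current Python 3 string objects. A minimal check, using arbitrary sample strings:

```python
# The empty string matches at every position, including both ends,
# so the count is len(s) + 1.
samples = ["", "a", "abc", "hello world"]
counts = {s: s.count("") for s in samples}

for s, n in counts.items():
    print(repr(s), n, len(s) + 1)
```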
Andrew M. Kuchling
74042d6e5d Patch from /F:
this patch introduces PySequence_Fast and PySequence_Fast_GET_ITEM,
and modifies the list.extend method to accept any kind of sequence.
2000-06-18 18:43:14 +00:00
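The effect of the list.extend change in commit 74042d6e5d is visible from Python code. This sketch runs on a current Python 3 interpreter, where extend accepts any iterable, which is broader than the "any kind of sequence" described in the commit.

```python
items = [1, 2]

items.extend((3, 4))   # a tuple
items.extend("ab")     # a string, extended one character at a time
items.extend([5])      # a list, as before

print(items)
```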
Marc-André Lemburg
bea47e768d Vladimir MARANGOZOV <Vladimir.Marangozov@inrialpes.fr>:
This patch fixes an optimisation mystery in _PyUnicodeNew causing segfaults
on AIX when the interpreter is compiled with -O.
2000-06-17 20:31:17 +00:00
Marc-André Lemburg
29dc381ce0 Michael Hudson <mwh21@cam.ac.uk>:
The error message refers to "append", yet the operation in
question is "concat".
2000-06-16 17:05:57 +00:00
Fred Drake
56780257c6 Thomas Wouters <thomas@xs4all.net>:
The following patch adds "sq_contains" support to rangeobject, and enables
the already-written support for sq_contains in listobject and tupleobject.

The rangeobject "contains" code should be a bit more efficient than the
current default "in" implementation ;-) It might not get used much, but it's
not that much to add.

listobject.c and tupleobject.c already had code for sq_contains, and the
proper struct member was set, but the PyType structure was not extended to
include tp_flags, so the object-specific code was not getting called (Go
ahead, test it ;-). I also did this for the immutable_list_type in
listobject.c, even though it is probably never used. Symmetry and all that.
2000-06-15 14:50:20 +00:00
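One way a range "contains" check can beat the default element-by-element scan, as commit 56780257c6 suggests, is to test membership arithmetically. The function below is a hypothetical sketch of that idea, not a transcription of the C code in rangeobject.c.

```python
def range_contains(start: int, stop: int, step: int, value: int) -> bool:
    """Test membership in range(start, stop, step) without iterating."""
    if step == 0:
        raise ValueError("step must not be zero")
    if step > 0:
        in_bounds = start <= value < stop
    else:
        in_bounds = stop < value <= start
    # The value must also land exactly on one of the steps from start.
    return in_bounds and (value - start) % step == 0


print(range_contains(0, 10, 3, 9))     # 9 is one of 0, 3, 6, 9
print(range_contains(0, 10, 3, 10))    # 10 is outside the bounds
print(range_contains(10, 0, -2, 4))    # 4 is one of 10, 8, 6, 4, 2
```

The check takes constant time regardless of how long the range is.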
Marc-André Lemburg
60bc809d9a Marc-Andre Lemburg <mal@lemburg.com>:
Added code so that .isXXX() testing returns 0 for empty strings.
2000-06-14 09:18:32 +00:00
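The empty-string rule from commit 60bc809d9a can be observed on a current Python 3 interpreter, where the methods return False rather than 0. The list of method names below is an assumption about which .isXXX() methods are meant.

```python
# Every character-class predicate is false for the empty string,
# because there is no character that could satisfy it.
names = ["isalnum", "isalpha", "isdigit", "islower",
         "isspace", "istitle", "isupper"]

results = {name: getattr("", name)() for name in names}

for name, value in results.items():
    print(name, value)
```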
Marc-André Lemburg
07ceb67d9c Marc-Andre Lemburg <mal@lemburg.com>:
Fixed a typo and removed a debug printf(). Thanks to Finn Bock
for finding these.
2000-06-10 09:32:51 +00:00
Jeremy Hylton
a251ea0680 the PyDict_SetItem does not borrow a reference, so we need to decref
reported by Mark Hammond
2000-06-09 16:20:39 +00:00
Andrew M. Kuchling
cb95a1470a Patch from Michael Hudson: improve unclear error message 2000-06-09 14:04:53 +00:00
Marc-André Lemburg
d4ab4a5905 Marc-Andre Lemburg <mal@lemburg.com>:
Fixed %c formatting to check for one character arguments. Thanks
to Finn Bock for finding this bug.

Added a fix for bug PR#348 which originated from not resetting
the globals correctly in _PyUnicode_Fini().
2000-06-08 17:54:00 +00:00
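The %c check from commit d4ab4a5905 corresponds to behaviour that can be probed from Python. This sketch runs against current Python 3 strings rather than the Unicode implementation of the time, and `format_char` is a hypothetical helper.

```python
def format_char(arg):
    """Return "%c" % arg, or None if the argument is rejected."""
    try:
        return "%c" % (arg,)
    except TypeError:
        return None


print(format_char("a"))    # a one-character string is accepted
print(format_char(97))     # an integer code point is accepted
print(format_char("ab"))   # a longer string is rejected
```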
Marc-André Lemburg
90e8147118 Marc-Andre Lemburg <mal@lemburg.com>:
Change the default encoding to 'ascii' (it was previously
defined as UTF-8).

Note: The implementation still uses UTF-8 to implement
the buffer protocol, so C APIs will still see UTF-8. This
is on purpose: rather than fixing the Unicode implementation,
the C APIs should be made Unicode aware.
2000-06-07 09:13:21 +00:00
Fred Drake
4c7fdfc35b Trent Mick <trentm@ActiveState.com>:
This patch corrects bounds checking in PyLong_FromLongLong. Currently, it does
not check properly for negative values when checking to see if the incoming
value fits in a long or unsigned long. This results in possible silent
truncation of the value for very large negative values.
2000-06-01 18:37:36 +00:00
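The class of bug described in commit 4c7fdfc35b can be illustrated with a small model. Everything here is an assumption made for illustration: the 32-bit width, the helper names, and the shape of the incomplete check, which is not taken from the original C source.

```python
LONG_BITS = 32  # assumed width of a C long
LONG_MIN = -(1 << (LONG_BITS - 1))
LONG_MAX = (1 << (LONG_BITS - 1)) - 1


def fits_upper_bound_only(value: int) -> bool:
    """Incomplete check: ignores the lower bound."""
    return value <= LONG_MAX


def fits_in_long(value: int) -> bool:
    """Complete check: tests both bounds."""
    return LONG_MIN <= value <= LONG_MAX


def truncate_to_long(value: int) -> int:
    """Model what storing value in a LONG_BITS-wide signed long keeps."""
    low = value & ((1 << LONG_BITS) - 1)
    return low - (1 << LONG_BITS) if low > LONG_MAX else low


big_negative = -(1 << 40) - 5

print(fits_upper_bound_only(big_negative))  # wrongly accepted
print(fits_in_long(big_negative))           # correctly rejected
print(truncate_to_long(big_negative))       # what silent truncation yields
```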
Fred Drake
914a2edb24 Improve TypeError exception message for list catenation. 2000-06-01 14:31:03 +00:00
Fred Drake
b6a9ada757 Michael Hudson <mwh21@cam.ac.uk>:
Removed PyErr_BadArgument() calls and replaced them with more useful
error messages.
2000-06-01 03:12:13 +00:00
Fred Drake
785d14f965 Minimal change so I can add the rest of MAL's checkin message:
M.-A. Lemburg <mal@lemburg.com>:
Fixed a core dump in PyUnicode_Format().
2000-05-09 19:54:43 +00:00
Fred Drake
e4315f58d2 M.-A. Lemburg <mal@lemburg.com>:
Added support for user settable default encodings. The
current implementation uses a per-process global which
defines the value of the encoding parameter in case it
is set to NULL (meaning: use the default encoding).
2000-05-09 19:53:39 +00:00
Guido van Rossum
c18a6f466a Replace PyErr_BadArgument() error in PyInt_AsLong() with "an integer
is required" (we can't say more because we don't know in which context
it is called).
2000-05-09 14:27:48 +00:00
Guido van Rossum
b8872e61c6 Trent Mick:
Fix the string methods that implement slice-like semantics with
optional args (count, find, endswith, etc.) to properly handle
indices outside [INT_MIN, INT_MAX]. Previously the "i" formatter
for PyArg_ParseTuple was used to get the indices. These could overflow.

This patch changes the string methods to use the "O&" formatter with
the slice_index() function from ceval.c which is used to do the same
job for Python code slices (e.g. 'abcabcabc'[0:1000000000L]).
2000-05-09 14:14:27 +00:00
Guido van Rossum
c682140de7 Trent Mick:
Fix the string methods that implement slice-like semantics with
optional args (count, find, endswith, etc.) to properly handle
indices outside [INT_MIN, INT_MAX]. Previously the "i" formatter
for PyArg_ParseTuple was used to get the indices. These could overflow.

This patch changes the string methods to use the "O&" formatter with
the slice_index() function from ceval.c which is used to do the same
job for Python code slices (e.g. 'abcabcabc'[0:1000000000L]). slice_index()
is renamed _PyEval_SliceIndex() and is now exported. As well, the return
values for success/fail were changed to make slice_index directly
usable as required by the "O&" formatter.

[GvR: shouldn't a similar patch be applied to unicodeobject.c?]
2000-05-08 14:08:05 +00:00
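The slice-like handling of oversized indices described in commits b8872e61c6 and c682140de7 can be exercised from Python. This sketch runs on a current Python 3 interpreter; the value of `HUGE` is arbitrary.

```python
# Far outside [INT_MIN, INT_MAX]; written without the old "L" suffix.
HUGE = 10**30

s = "abcabcabc"

# Out-of-range indices are clamped to the string bounds, exactly as in
# a slice such as s[0:HUGE], instead of overflowing.
count = s.count("abc", 0, HUGE)
found = s.find("c", -HUGE)
ends = s.endswith("abc", -HUGE, HUGE)

print(count, found, ends)
print(s[0:HUGE] == s)
```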
Guido van Rossum
b8f820c5a9 The methods islower(), isupper(), isspace(), isdigit() and istitle()
gave bogus results for chars in the range 128-255, because their
implementation was using signed characters.  Fixed this by using
unsigned character pointers (as opposed to using Py_CHARMASK()).
2000-05-05 20:44:24 +00:00
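The root cause named in commit b8f820c5a9 is that a byte in the range 128-255 becomes negative when read through a signed char. The model below is a sketch; `as_signed_char` is a hypothetical helper, and it assumes an 8-bit two's-complement char.

```python
def as_signed_char(byte: int) -> int:
    """Reinterpret an unsigned byte value (0-255) as a signed 8-bit char."""
    if not 0 <= byte <= 255:
        raise ValueError("byte must be in range 0-255")
    return byte - 256 if byte >= 128 else byte


for byte in (0x41, 0x7F, 0x80, 0xC9, 0xFF):
    print(hex(byte), as_signed_char(byte))
```

Negative values like these are not valid arguments to the C character classification functions, which is why reading the data through unsigned character pointers fixes the results.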
Guido van Rossum
03e29f1ae9 Mark Hammond should get his act into gear (his words :-). Zero length
strings _are_ valid!
2000-05-04 15:52:20 +00:00
Guido van Rossum
42c29aaeb5 Fix warning detected by VC++ on assignment of Py_UNICODE to char. 2000-05-03 23:58:29 +00:00
Guido van Rossum
b18618dab7 Vladimir Marangozov's long-awaited malloc restructuring.
For more comments, read the patches@python.org archives.
For documentation read the comments in mymalloc.h and objimpl.h.

(This is not exactly what Vladimir posted to the patches list; I've
made a few changes, and Vladimir sent me a fix in private email for a
problem that only occurs in debug mode.  I'm also holding back on his
change to main.c, which seems unnecessary to me.)
2000-05-03 23:44:39 +00:00
Guido van Rossum
4e751c3d12 Mark Hammond withdraws his fix -- the size includes the trailing 0 so
a size of 0 *is* illegal.
2000-05-03 12:27:22 +00:00
Guido van Rossum
a6edfd9737 Mark Hammond:
Fixes the MBCS codec to work correctly with zero length strings.
2000-05-03 11:03:24 +00:00
Barry Warsaw
ee98e4e75d Ignore a bunch of generated files. 2000-05-02 18:34:30 +00:00
Guido van Rossum
0e4f657a50 Marc-Andre Lemburg:
Fixed \OOO interpretation for Unicode objects. \777 now
correctly produces the Unicode character with ordinal 511.
2000-05-01 21:27:20 +00:00
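The arithmetic behind the \777 case in commit 0e4f657a50 is a plain base-8 conversion. A minimal sketch, with `octal_escape_to_char` as a hypothetical helper that takes only the digits of the escape:

```python
def octal_escape_to_char(digits: str) -> str:
    """Convert the digits of an octal escape to the character they name."""
    return chr(int(digits, 8))


# 7*64 + 7*8 + 7 = 511, which does not fit in a single 8-bit byte.
ordinal = ord(octal_escape_to_char("777"))
print(ordinal)
```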
Jeremy Hylton
37b1a26c89 add list_contains and tuplecontains: efficient implementations of tp_contains 2000-04-27 21:41:03 +00:00
Guido van Rossum
ec5b776998 Marc-Andre Lemburg:
Doc strings can now be given as Unicode strings.
2000-04-27 20:14:13 +00:00
Guido van Rossum
3c1bb8043f Marc-Andre Lemburg:
Fixed a reference leak in the allocator.

Renamed utf8_string to _PyUnicode_AsUTF8String() and made
it external for use by other parts of the interpreter.
2000-04-27 20:13:50 +00:00
Jeremy Hylton
9e392e2412 potentially useless optimization
The previous checkin (2.84) added a PyErr_Format call that made the
cost of raising an AttributeError much more expensive.  In general
this doesn't matter, except that checks for __init__ and
__del__ methods, where exceptions are caught and cleared in C, also
got much more expensive.

The fix is to split instance_getattr1 into two calls:

instance_getattr2 checks the instance and the class for the attribute
and returns it or returns NULL on error.  It does not raise an
exception.

instance_getattr1 does rexec checks, then calls instance_getattr2.  It
raises an exception if instance_getattr2 returns NULL.

PyInstance_New and instance_dealloc now call instance_getattr2
directly.
2000-04-26 20:39:20 +00:00
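The split described in commit 9e392e2412 separates a quiet lookup from the one that raises. The Python sketch below mirrors that structure only. The function names, the dictionaries, and the `_MISSING` sentinel (standing in for the NULL return) are assumptions, and the rexec checks are left out.

```python
_MISSING = object()  # stands in for the NULL return of the C function


def lookup_quiet(instance_dict, class_dict, name):
    """Check the instance, then the class. Never raises on a miss."""
    if name in instance_dict:
        return instance_dict[name]
    return class_dict.get(name, _MISSING)


def lookup_or_raise(instance_dict, class_dict, name):
    """Wrapper that pays for building an exception only on a miss."""
    value = lookup_quiet(instance_dict, class_dict, name)
    if value is _MISSING:
        raise AttributeError(f"instance has no attribute {name!r}")
    return value


instance_attrs = {"x": 1}
class_attrs = {"greet": "hello"}

# A caller that only wants to know whether __del__ exists uses the
# quiet lookup and avoids formatting an error message it would discard.
has_del = lookup_quiet(instance_attrs, class_attrs, "__del__") is not _MISSING
print(has_del)
print(lookup_or_raise(instance_attrs, class_attrs, "greet"))
```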
Guido van Rossum
e92e610a9e Christian Tismer -- total rewrite on trashcan code.
Improvements:
- no longer needs any extra memory
- has no relationship to tstate
- works in debug mode
- can easily be modified for free threading (hi Greg:)

Side effects:
Trashcan does change the order of object destruction.
Preventing that would be quite an immense effort, as
my attempts have shown. This version works always
the same, with debug mode or not. The slightly
changed destruction order should therefore be no problem.

Algorithm:
While the old idea of delaying the destruction of some
objects at a certain recursion level was kept, we now
no longer allocate an object to hold these objects.
The delayed objects are instead chained together
via their ob_type field. The type is encoded via
ob_refcnt. When it comes to the destruction of the
chain of waiting objects, the topmost object is popped
off the chain and revived with type and refcount 1,
then it gets a normal Py_DECREF.

I am confident that this solution is near optimum
for minimizing side effects and code bloat.
2000-04-24 15:40:53 +00:00
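The delayed-destruction idea in commit e92e610a9e can be sketched in Python. This is a loose model, not the C implementation: it keeps the waiting objects in an ordinary list, which is exactly the extra allocation the rewrite avoids by chaining objects through ob_type. `Node`, `MAX_DEPTH`, and `destroy` are hypothetical names.

```python
class Node:
    """A container that owns at most one child."""

    def __init__(self, child=None):
        self.child = child
        self.destroyed = False


MAX_DEPTH = 50  # hypothetical recursion level at which destruction is delayed


def destroy(root):
    """Destroy a nested chain with bounded recursion depth.

    Returns the deepest recursion level that was actually reached.
    """
    delayed = []   # stands in for the chain of waiting objects
    deepest = 0

    def dealloc(node, depth):
        nonlocal deepest
        if depth > MAX_DEPTH:
            delayed.append(node)   # too deep: park it for later
            return
        deepest = max(deepest, depth)
        node.destroyed = True
        if node.child is not None:
            dealloc(node.child, depth + 1)

    dealloc(root, 1)
    # Pop each waiting object and destroy it from a fresh, shallow level.
    while delayed:
        dealloc(delayed.pop(), 1)
    return deepest


# Build a chain 10000 levels deep without recursion.
nodes = []
head = None
for _ in range(10000):
    head = Node(head)
    nodes.append(head)

deepest = destroy(head)
print(deepest, all(n.destroyed for n in nodes))
```

Each object is still destroyed exactly once, but the recursion never goes deeper than the limit.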