Commit graph

1249 commits

Author SHA1 Message Date
Victor Stinner
cfed46e00a PyUnicode_FromKindAndData() fails with a ValueError if size < 0 2011-11-22 01:29:14 +01:00
Victor Stinner
42885206ec UTF-8 decoder: set consumed value in the latin1 fast-path 2011-11-22 01:23:02 +01:00
Victor Stinner
d3df8ab377 Replace _PyUnicode_READY_REPLACE() and _PyUnicode_ReadyReplace() with unicode_ready()
* unicode_ready() has a simpler API
 * try to reuse unicode_empty and latin1_char singleton everywhere
 * Fix a reference leak in _PyUnicode_TranslateCharmap()
 * PyUnicode_InternInPlace() doesn't try to get a singleton anymore, to avoid
   having to handle a failure
2011-11-22 01:22:34 +01:00
Victor Stinner
f01245067a Rewrite PyUnicode_TransformDecimalToASCII() to use the new Unicode API 2011-11-21 23:12:56 +01:00
Victor Stinner
2d718f39a5 Remove an unused variable from PyUnicode_Copy() 2011-11-21 23:11:52 +01:00
Victor Stinner
87af4f2f3a Simplify PyUnicode_Copy()
USe PyUnicode_Copy() in fixup()
2011-11-21 23:03:47 +01:00
Victor Stinner
5bbe5e7c85 Fix a compiler warning in _PyUnicode_CheckConsistency() 2011-11-21 22:54:05 +01:00
Victor Stinner
42bf77537e Rewrite PyUnicode_EncodeDecimal() to use the new Unicode API
Add tests for PyUnicode_EncodeDecimal() and
PyUnicode_TransformDecimalToASCII().
2011-11-21 22:52:58 +01:00
Antoine Pitrou
0a3229de6b Issue #13417: speed up utf-8 decoding by around 2x for the non-fully-ASCII case.
This almost catches up with pre-PEP 393 performance, when decoding needed
only one pass.
2011-11-21 20:39:13 +01:00
Victor Stinner
da29cc36aa Issue #13441: _PyUnicode_CheckConsistency() dumps the string if the maximum
character is bigger than U+10FFFF and locale.localeconv() dumps the string
before decoding it.

Temporary hack to debug the issue #13441.
2011-11-21 14:31:41 +01:00
Victor Stinner
9e30aa52fd Fix misuse of PyUnicode_GET_SIZE() => PyUnicode_GET_LENGTH()
And PyUnicode_GetSize() => PyUnicode_GetLength()
2011-11-21 02:49:52 +01:00
Victor Stinner
337986740f Issue #26464: Fix unicode_fast_translate() again
Initialize i variable if the string is non-ASCII.
2016-03-01 21:59:58 +01:00
Victor Stinner
6c9aa8f2bf Fix str.translate()
Issue #26464: Fix str.translate() when string is ASCII and first replacements
removes character, but next replacement uses a non-ASCII character or a string
longer than 1 character. Regression introduced in Python 3.5.0.
2016-03-01 21:30:30 +01:00
Victor Stinner
5bc03a6d4d Fix resize_compact()
Issue #26217: resize_compact() must set wstr_length to 0 after freeing the wstr
string. Otherwise, an assertion fails in _PyUnicode_CheckConsistency().
2016-01-27 16:56:53 +01:00
Serhiy Storchaka
191321d11b Issue #20440: More use of Py_SETREF.
This patch is manually crafted and contains changes that couldn't be handled
automatically.
2015-12-27 15:41:34 +02:00
Serhiy Storchaka
5a57ade58e Issue #20440: Massive replacing unsafe attribute setting code with special
macro Py_SETREF.
2015-12-24 10:35:59 +02:00
Serhiy Storchaka
6648bf5661 Issue #25709: Fixed problem with in-place string concatenation and utf-8 cache. 2015-12-03 01:04:37 +02:00
Benjamin Peterson
a4d33b3428 make the PyUnicode_FSConverter cleanup set the decrefed argument to NULL (closes #25630) 2015-11-15 21:57:39 -08:00
Serhiy Storchaka
a84f6c3dd3 Issue #25523: Merge a-to-an corrections from 3.4. 2015-11-02 14:39:05 +02:00
Serhiy Storchaka
58c8f2bb6d Issue #24848: Fixed bugs in UTF-7 decoding of misformed data:
1. Non-ASCII bytes were accepted after shift sequence.
2. A low surrogate could be emitted in case of error in high surrogate.
3. In some circumstances the '\xfd' character was produced instead of the
replacement character '\ufffd' (due to a bug in _PyUnicodeWriter).
2015-10-02 13:13:14 +03:00
Zachary Ware
d987a81d29 Issue #21279: Merge with 3.4 2015-08-06 00:04:23 -05:00
Serhiy Storchaka
d4ea03c785 Issue #24284: The startswith and endswith methods of the str class no longer
return True when finding the empty string and the indexes are completely out
of range.
2015-05-31 09:15:51 +03:00
Antoine Pitrou
873e0df946 Fix some compilation warnings when using gcc (-Wmaybe-uninitialized). 2015-05-19 21:06:04 +02:00
Serhiy Storchaka
0d4df752ac Issue #15027: The UTF-32 encoder is now 3x to 7x faster. 2015-05-12 23:12:45 +03:00
Serhiy Storchaka
7e9d1d1a1b Issue #23908: os functions now reject paths with embedded null character
on Windows instead of silently truncate them.

Removed no longer used _PyUnicode_HasNULChars().
2015-04-20 10:12:28 +03:00
Serhiy Storchaka
1009bf18b3 Issue #23501: Argumen Clinic now generates code into separate files by default. 2015-04-03 23:53:51 +03:00
Victor Stinner
1912b39def _PyUnicodeWriter_WriteStr() now checks that the input string is consistent
in debug mode to detect bugs earlier.

_PyUnicodeWriter_Finish() doesn't check if the read only string is consistent,
whereas it does check consistency for strings built by itself.
2015-03-26 09:37:23 +01:00
Serhiy Storchaka
d9d769fcdd Issue #23573: Increased performance of string search operations (str.find,
str.index, str.count, the in operator, str.split, str.partition) with
arguments of different kinds (UCS1, UCS2, UCS4).
2015-03-24 21:55:47 +02:00
Victor Stinner
f50e187724 Fix compiler warnings: comparison between signed and unsigned numbers 2015-03-20 11:32:24 +01:00
Victor Stinner
0c39b1b970 Initialize variables to prevent GCC warnings 2015-03-18 15:02:06 +01:00
Steve Dower
3e96f324dc Issue #23451: Update pyconfig.h for Windows to require Vista headers and remove unnecessary version checks. 2015-03-02 08:01:10 -08:00
Serhiy Storchaka
78a8249127 Issue #23490: Fixed possible crashes related to interoperability between
old-style and new API for string with 2**30-1 characters.
2015-02-20 21:34:39 +02:00
Serhiy Storchaka
4d0d982985 Issue #23446: Use PyMem_New instead of PyMem_Malloc to avoid possible integer
overflows.  Added few missed PyErr_NoMemory().
2015-02-16 13:33:32 +02:00
Victor Stinner
29dacf2e97 Issue #15859: PyUnicode_EncodeFSDefault(), PyUnicode_EncodeMBCS() and
PyUnicode_EncodeCodePage() now raise an exception if the object is not an
Unicode object. For PyUnicode_EncodeFSDefault(), it was already the case on
platforms other than Windows. Patch written by Campbell Barton.
2015-01-26 16:41:32 +01:00
Serhiy Storchaka
bbd3aa8ece Issue #23321: Fixed a crash in str.decode() when error handler returned
replacment string longer than mailformed input data.
2015-01-26 01:24:31 +02:00
Ethan Furman
b95b56150f Issue20284: Implement PEP461 2015-01-23 20:05:18 -08:00
Serhiy Storchaka
82e07b92b3 Issue #23181: More "codepoint" -> "code point". 2015-01-18 11:33:31 +02:00
Serhiy Storchaka
92bf919ed0 Issue #22581: Use more "bytes-like object" throughout the docs and comments. 2014-12-05 22:26:10 +02:00
Serhiy Storchaka
407249c62b Issue #22975: Close block at right place. 2014-12-01 18:56:54 +02:00
Victor Stinner
3aa979e0cd Issue #20948: Inline makefmt() in unicode_fromformat_arg() 2014-11-18 21:40:51 +01:00
Antoine Pitrou
4e334241b7 Fixed signed/unsigned comparison warning 2014-10-15 23:14:53 +02:00
Benjamin Peterson
736982d36d merge 3.4 (closes #22643) 2014-10-15 12:17:47 -04:00
Benjamin Peterson
315aa40403 Merge 3.4 2014-10-15 11:51:17 -04:00
Benjamin Peterson
6925264334 merge 3.4 (#22643) 2014-10-15 11:49:15 -04:00
Gregory P. Smith
8486f9b134 Fix "warning: comparison between signed and unsigned integer expressions"
-Wsign-compare warnings in unicodeobject.c.  These were all a result
of sizeof() being unsigned and being compared to a Py_ssize_t.
Not actual problems.
2014-09-30 00:33:24 -07:00
Benjamin Peterson
fd97a6fb2d merge 3.4 (#22520) 2014-09-29 23:02:56 -04:00
Benjamin Peterson
10e4b2545e merge 3.4 (closes #22518) 2014-09-29 18:53:58 -04:00
Serhiy Storchaka
20b39b27d9 Removed redundant casts to char *.
Corresponding functions now accept `const char *` (issue #1772673).
2014-09-28 11:27:24 +03:00
Serhiy Storchaka
d8a1447c99 Issue #22215: Now ValueError is raised instead of TypeError when str or bytes
argument contains not permitted null character or byte.
2014-09-06 20:07:17 +03:00
Victor Stinner
12174a5dca Issue #22156: Fix "comparison between signed and unsigned integers" compiler
warnings in the Objects/ subdirectory.

PyType_FromSpecWithBases() and PyType_FromSpec() now reject explicitly negative
slot identifiers.
2014-08-15 23:17:38 +02:00