Commit graph

1520 commits

Author SHA1 Message Date
Victor Stinner
15a1136547 Issue #16147: PyUnicode_FromFormatV() doesn't need anymore to allocate a buffer
on the heap to format numbers.
2012-10-06 23:48:20 +02:00
Victor Stinner
ff5a848db5 Issue #16147: PyUnicode_FromFormatV() now raises an error if the argument of
'%c' is not in the range(0x110000).
2012-10-06 23:05:45 +02:00
Victor Stinner
3921e90c5a Issue #16147: PyUnicode_FromFormatV() now detects integer overflow when parsing
width and precision
2012-10-06 23:05:00 +02:00
Victor Stinner
e215d960be Issue #16147: Rewrite PyUnicode_FromFormatV() to use _PyUnicodeWriter API
* Simplify the code: replace 4 steps with one unique step using the
   _PyUnicodeWriter API. PyUnicode_Format() has the same design. It avoids to
   store intermediate results which require to allocate an array of pointers on
   the heap.
 * Use the _PyUnicodeWriter API for speed (and its convinient API):
   overallocate the buffer to reduce the number of "realloc()"
 * Implement "width" and "precision" in Python, don't rely on sprintf(). It
   avoids to need of a temporary buffer allocated on the heap: only use a small
   buffer allocated in the stack.
 * Add _PyUnicodeWriter_WriteCstr() function
 * Split PyUnicode_FromFormatV() into two functions: add
   unicode_fromformat_arg().
 * Inline parse_format_flags(): the format of an argument is now only parsed
   once, it's no more needed to have a subfunction.
 * Optimize PyUnicode_FromFormatV() for characters between two "%" arguments:
   search the next "%" and copy the substring in one chunk, instead of copying
   character per character.
2012-10-06 23:03:36 +02:00
Mark Dickinson
ff9c54aca2 Issue #16096: Merge fixes from 3.3. 2012-10-06 18:05:14 +01:00
Mark Dickinson
c04ddff290 Issue #16096: Fix several occurrences of potential signed integer overflow. Thanks Serhiy Storchaka. 2012-10-06 18:04:49 +01:00
Victor Stinner
8c6db45d3e In debug mode, unicode_write_cstr() now checks that non-ASCII characters are
not written into an ASCII string
2012-10-06 00:40:45 +02:00
Ezio Melotti
080a2c087e #16127: merge with 3.3. 2012-10-05 03:34:02 +03:00
Ezio Melotti
e7f90375b1 #16127: remove outdated references to narrow builds. Patch by Serhiy Storchaka. 2012-10-05 03:33:31 +03:00
Victor Stinner
1929407406 Fix PyUnicode_Format(): return NULL if PyUnicode_READY(uformat) failed
This error cannot occur in practice: PyUnicode_FromObject() always return
a "ready" string.
2012-10-05 00:09:33 +02:00
Victor Stinner
770e19e0cc Optimize unicode_compare(): use memcmp() when comparing two UCS1 strings 2012-10-04 22:59:45 +02:00
Victor Stinner
90db9c47dc Enable also ptr==ptr optimization in PyUnicode_Compare()
It was already implemented in PyUnicode_RichCompare()
2012-10-04 21:53:50 +02:00
Victor Stinner
aa7712711d unicode_result_wchar(): move the assert() to the "#ifdef Py_DEBUG" block 2012-10-04 02:32:58 +02:00
Victor Stinner
a4708231e6 Split the huge PyUnicode_Format() function (+540 lines) into subfunctions 2012-10-04 02:19:54 +02:00
Victor Stinner
a049443fab PyUnicode_Format(): disable overallocation when we are writing the last part
of the output string
2012-10-03 23:03:46 +02:00
Victor Stinner
afffce489b Unicode: resize_compact() and resize_inplace() fills also the Unicode strings
with invalid bytes in debug mode, as done by PyUnicode_New()
2012-10-03 23:03:17 +02:00
Victor Stinner
c89d28fdfc Issue #15609: Fix refleak introduced by my last optimization 2012-10-02 12:54:07 +02:00
Victor Stinner
621ef3d84f Issue #15609: Optimize str%args for integer argument
- Use _PyLong_FormatWriter() instead of formatlong() when possible, to avoid
   a temporary buffer
 - Enable the fast path when width is smaller or equals to the length,
   and when the precision is bigger or equals to the length
 - Add unit tests!
 - formatlong() uses PyUnicode_Resize() instead of _PyUnicode_FromASCII()
   to resize the output string
2012-10-02 00:33:47 +02:00
Antoine Pitrou
a1f7655fa7 Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
Patch by Serhiy Storchaka.
2012-09-23 20:00:04 +02:00
Antoine Pitrou
6f80f5d444 Issue #15379: Fix passing of non-BMP characters as integers for the charmap decoder (already working as unicode strings).
Patch by Serhiy Storchaka.
2012-09-23 19:55:21 +02:00
Antoine Pitrou
ca8aa4acf6 Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t.
Patch by Serhiy Storchaka.
2012-09-20 20:56:47 +02:00
Christian Heimes
5f520f4fed Issue #15900: Fixed reference leak in PyUnicode_TranslateCharmap() 2012-09-11 14:03:25 +02:00
Christian Heimes
f4f9939a96 Fixed memory leak in error branch of formatfloat(). CID 719687 2012-09-10 11:48:41 +02:00
Antoine Pitrou
057119b0b7 Fix C++-style comment (xlc compilation failure) 2012-09-02 17:56:33 +02:00
Benjamin Peterson
59043f96ea merge 3.2 (#15801) 2012-08-28 18:01:45 -04:00
Benjamin Peterson
28a6cfaefc use the stricter PyMapping_Check (closes #15801) 2012-08-28 17:55:35 -04:00
Stefan Krah
8528c3145e Issue #15728: Fix leak in PyUnicode_AsWideCharString(). Found by Coverity. 2012-08-19 21:52:43 +02:00
Nick Coghlan
0e41628d35 Merge str docstring fix from 3.2 2012-08-16 14:14:30 +10:00
Nick Coghlan
573b1fd779 Fix str docstring 2012-08-16 14:13:07 +10:00
Antoine Pitrou
b4bbee25b1 Issue #14579: Fix CVE-2012-2135: vulnerability in the utf-16 decoder after error handling.
Patch by Serhiy Storchaka.
2012-07-21 00:45:14 +02:00
Mark Dickinson
01ac8b6ab1 Use correct types for ASCII_CHAR_MASK integer constants. 2012-07-07 14:08:48 +02:00
Antoine Pitrou
aaefac76dd Issue #14874: Restore charmap decoding speed to pre-PEP 393 levels.
Patch by Serhiy Storchaka.
2012-06-16 22:48:21 +02:00
Victor Stinner
f185226244 _copy_characters(): move debug code at the top to avoid noisy #ifdef
And don't use assert() anymore if check_maxchar is set: return -1 on error
instead.
2012-06-16 16:38:26 +02:00
Victor Stinner
07621338fb Fix PyUnicode_GetSize(): Don't replace _PyUnicode_Ready() exception 2012-06-16 04:53:46 +02:00
Victor Stinner
8a8b3eaabe Fix a compiler warning in _copy_characters() and remove debug code 2012-06-16 04:53:25 +02:00
Victor Stinner
24e403bbee Oops, fix my previous change on _copy_characters() 2012-06-16 04:53:00 +02:00
Victor Stinner
ca439eecea Fix unicode_adjust_maxchar(): catch PyUnicode_New() failure 2012-06-16 03:17:34 +02:00
Victor Stinner
184252ad3f Fix "%f" format of str%args if the result is not an ASCII or latin1 string 2012-06-16 02:57:41 +02:00
Victor Stinner
9a77770add Remove debug code 2012-06-16 02:44:43 +02:00
Victor Stinner
c9d369f1bf Optimize _PyUnicode_FastCopyCharacters() when maxchar(from) > maxchar(to) 2012-06-16 02:22:37 +02:00
Victor Stinner
f05e17ece9 unicodeobject.c: Remove debug code 2012-06-16 01:53:04 +02:00
Antoine Pitrou
27f6a3b0bf Issue #15026: utf-16 encoding is now significantly faster (up to 10x).
Patch by Serhiy Storchaka.
2012-06-15 22:15:23 +02:00
Kristján Valur Jónsson
55e5dc8371 Rearrange code to beat an optimizer bug affecting Release x64 on windows
with VS2010sp1
2012-06-06 21:58:08 +00:00
Victor Stinner
d7b7c7472b Issue #14993: Use standard "unsigned char" instead of a unsigned char bitfield 2012-06-04 22:52:12 +02:00
Kristjan Valur Jonsson
85634d7a2e Issue #14909: A number of places were using PyMem_Realloc() apis and
PyObject_GC_Resize() with incorrect error handling.  In case of errors,
the original object would be leaked.  This checkin fixes those cases.
2012-05-31 09:37:31 +00:00
Victor Stinner
3a7d096f2f Issue #14744: Fix compilation on Windows (part 2) 2012-05-29 18:53:56 +02:00
Victor Stinner
d3f0882dfb Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args)
* Formatting string, int, float and complex use the _PyUnicodeWriter API. It
   avoids a temporary buffer in most cases.
 * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just
   keep a reference to the string if the output is only composed of one string
 * Disable overallocation when formatting the last argument of str%args and
   str.format(args)
 * Overallocation allocates at least 100 characters: add min_length attribute
   to the _PyUnicodeWriter structure
 * Add new private functions: _PyUnicode_FastCopyCharacters(),
   _PyUnicode_FastFill() and _PyUnicode_FromASCII()

The speed up is around 20% in average.
2012-05-29 12:57:52 +02:00
Antoine Pitrou
63065d761e Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs.
Patch by Serhiy Storchaka.
2012-05-15 23:48:04 +02:00
Martin v. Löwis
b05c0738d8 Silence VS 2010 signed/unsigned warnings. 2012-05-15 13:45:49 +02:00
Antoine Pitrou
758153badb Fix refleaks introduced by 83da67651687. 2012-05-12 15:51:51 +02:00