4178 commits

Author SHA1 Message Date
Antoine Pitrou
f0b934b01a Reuse the stringlib in findchar(), and make its signature more convenient 2011-10-13 18:55:09 +02:00
Antoine Pitrou
c198d0599b Add a comment explaining this heuristic. 2011-10-13 18:07:37 +02:00
Antoine Pitrou
dda339e6d2 Simplify heuristic for when to use memchr 2011-10-13 17:58:11 +02:00
Victor Stinner
55c991197b Optimize unicode_subscript() for step != 1 and ascii strings 2011-10-13 01:17:06 +02:00
Victor Stinner
127226ba69 Don't use PyUnicode_MAX_CHAR_VALUE() macro in Py_MAX() 2011-10-13 01:12:34 +02:00
Victor Stinner
9e7a1bcfd6 Optimize findchar() for PyUnicode_1BYTE_KIND: use memchr and memrchr 2011-10-13 00:18:12 +02:00
Antoine Pitrou
dd4e2f0153 Issue #13155: Optimize finding the optimal character width of a unicode string 2011-10-13 00:02:27 +02:00
Victor Stinner
49a0a21f37 Unicode replace() avoids calling unicode_adjust_maxchar() when it's useless
Also add a special case if the result is an empty string.
2011-10-12 23:46:10 +02:00
Antoine Pitrou
6b4883dec0 PEP 3151 / issue #12555: reworking the OS and IO exception hierarchy. 2011-10-12 02:54:14 +02:00
Victor Stinner
983b1434bd Backed out changeset 952d91a7d376
If maxchar == PyUnicode_MAX_CHAR_VALUE(unicode), we do a useless copy.
2011-10-12 00:54:35 +02:00
Antoine Pitrou
e55ad2dff0 Relax condition 2011-10-12 00:36:51 +02:00
Victor Stinner
d218bf14cc stringlib: Fix STRINGLIB_STR for UCS2/UCS4 2011-10-12 00:14:32 +02:00
Victor Stinner
4e10100dee Fix compiler warning in _PyUnicode_FromUCS2() 2011-10-11 23:27:52 +02:00
Victor Stinner
8cc70dcf70 Fix fastsearch for UCS2 and UCS4
* If needle is 0, try (p[0] >> 16) & 0xff for UCS4
* Disable fastsearch_memchr_1char() if needle is zero for UCS2 and UCS4
2011-10-11 23:22:22 +02:00
Antoine Pitrou
950468e553 Use _PyUnicode_CONVERT_BYTES() where applicable. 2011-10-11 22:45:48 +02:00
Victor Stinner
577db2c9f0 PyUnicode_AsUnicodeCopy() now checks if PyUnicode_AsUnicode() failed 2011-10-11 22:12:48 +02:00
Victor Stinner
c4f281eba3 Fix misuse of PyUnicode_GET_SIZE, use PyUnicode_GET_LENGTH instead 2011-10-11 22:11:42 +02:00
Victor Stinner
ed2682be2f Reuse PyUnicode_Copy() in validate_and_copy_tuple() 2011-10-11 21:53:24 +02:00
Antoine Pitrou
e459a0877e Issue #13136: speed up conversion between different character widths. 2011-10-11 20:58:41 +02:00
Antoine Pitrou
2c3b2302ad Issue #13134: optimize finding single-character strings using memchr 2011-10-11 20:29:21 +02:00
Antoine Pitrou
2871698546 /* Remove unused code. It has been commented out since 2000 (!). */ 2011-10-11 03:17:47 +02:00
Antoine Pitrou
53bb548f22 Avoid exporting private helpers
(thanks "make smelly")
2011-10-10 23:49:24 +02:00
Martin v. Löwis
1ee1b6fe0d Use identifier API for PyObject_GetAttrString. 2011-10-10 18:11:30 +02:00
Victor Stinner
794d567b17 any_find_slice() doesn't use callbacks anymore
* Call the right find/rfind method directly: allow inlining functions
* Remove Py_LOCAL_CALLBACK (added for any_find_slice)
2011-10-10 03:21:36 +02:00
Martin v. Löwis
afe55bba33 Add API for static strings, primarily good for identifiers.
Thanks to Konrad Schöbel and Jasper Schulz for helping with the mass-editing.
2011-10-09 10:38:36 +02:00
Antoine Pitrou
eaf139b3fc Fix typo in the PyUnicode_Find() implementation 2011-10-09 00:33:09 +02:00
Georg Brandl
388349add2 Closes #12192: Document that mutating list methods do not return the instance (original patch by Mike Hoy). 2011-10-08 18:32:40 +02:00
Martin v. Löwis
c47adb04b3 Change PyUnicode_KIND to 1,2,4. Drop _KIND_SIZE and _CHARACTER_SIZE. 2011-10-07 20:55:35 +02:00
Victor Stinner
dd07732af5 PyUnicode_Join() calls directly memcpy() if all strings are of the same kind 2011-10-07 17:02:31 +02:00
Antoine Pitrou
978b9d2a27 Fix formatting memory consumption with very large padding specifications 2011-10-07 12:35:48 +02:00
Victor Stinner
59de0ee9e0 str.replace(a, a) now returns str unchanged 2011-10-07 10:01:28 +02:00
Antoine Pitrou
4574e62c6e Fix massive slowdown in string formatting with str.format.
Example:
./python -m timeit -s "f='{}' + '-' * 1024 + '{}'; s='abcd' * 16384" "f.format(s, s)"

-> before: 547 usec per loop
-> after: 13 usec per loop
-> 3.2: 22.5 usec per loop
-> 2.7: 12.6 usec per loop
2011-10-07 02:26:47 +02:00
Antoine Pitrou
5c0ba36d5f Fix massive slowdown in string formatting with the % operator 2011-10-07 01:54:09 +02:00
Antoine Pitrou
7c46da7993 Ensure that 1-char singletons get used 2011-10-06 22:07:51 +02:00
Antoine Pitrou
c61c8d7a5e Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists.
This introduces a small private API for this common pattern.
The issue has been discovered thanks to Martin's huge-mem buildbot.
2011-10-06 19:04:12 +02:00
Antoine Pitrou
eeb7eea1f9 Issue #12911: Fix memory consumption when calculating the repr() of huge tuples or lists.
This introduces a small private API for this common pattern.
The issue has been discovered thanks to Martin's huge-mem buildbot.
2011-10-06 18:57:27 +02:00
Victor Stinner
c6f0df7b20 Fix PyUnicode_Join() for len==1 and non-exact string 2011-10-06 15:58:54 +02:00
Antoine Pitrou
dbf697ae5c Fix compilation warnings under 64-bit Windows 2011-10-06 15:34:41 +02:00
Antoine Pitrou
15a66cf134 Fix compilation under Windows 2011-10-06 15:25:32 +02:00
Victor Stinner
200f21340d Fix assertion in unicode_adjust_maxchar() 2011-10-06 13:27:56 +02:00
Victor Stinner
acf47b807f Fix my last change on PyUnicode_Join(): don't process separator if len==1 2011-10-06 12:32:37 +02:00
Victor Stinner
25a4b29c95 str.replace() avoids memory when it's possible 2011-10-06 12:31:55 +02:00
Victor Stinner
56c161ab00 _copy_characters() fails more quickly in debug mode on inconsistent state 2011-10-06 02:47:11 +02:00
Victor Stinner
c729b8e92f Fix a compiler warning: don't define unicode_is_singleton() in release mode 2011-10-06 02:36:59 +02:00
Victor Stinner
fb9ea8c57e Don't check for the maximum character when copying from unicodeobject.c
* Create copy_characters() function which doesn't check for the maximum
  character in release mode
* _PyUnicode_CheckConsistency() is no longer static, so that it can be used
  in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
* _PyUnicode_CheckConsistency() checks the string hash
2011-10-06 01:45:57 +02:00
Victor Stinner
05d1189566 Fix post-condition in unicode_repr(): check the result, not the input 2011-10-06 01:13:58 +02:00
Victor Stinner
f48323e3b3 replace() uses unicode_fromascii() if the input and replace strings are ASCII 2011-10-05 23:27:08 +02:00
Victor Stinner
0617b6e18b unicode_fromascii() checks that the input is ASCII in debug mode 2011-10-05 23:26:01 +02:00
Victor Stinner
c3cec7868b Add asciilib: similar to ucs1, ucs2 and ucs4 library, but specialized to ASCII
ucs1, ucs2 and ucs4 libraries have to scan the created substring to find the
maximum character, whereas this is not needed for ASCII strings. Because ASCII
strings are common, it is useful to optimize ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner
14f8f02826 Fix PyUnicode_Partition(): str_in->str_obj 2011-10-05 20:58:25 +02:00
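
The benchmark quoted in commit 4574e62c6e above can also be run from a script instead of the command line. The following is a minimal sketch, assuming the same setup and statement as in that commit message; the helper names usec_per_loop() and formatted_length() are hypothetical and not part of the commit, and the timings depend on the machine and interpreter build.

```python
import timeit

# Setup and statement copied from the timeit command in commit 4574e62c6e.
SETUP = "f = '{}' + '-' * 1024 + '{}'; s = 'abcd' * 16384"
STMT = "f.format(s, s)"


def usec_per_loop(number=1000, repeat=5):
    """Return the best per-loop time in microseconds (hypothetical helper)."""
    timings = timeit.repeat(STMT, setup=SETUP, number=number, repeat=repeat)
    # Each timing covers `number` executions; take the fastest run.
    return min(timings) / number * 1e6


def formatted_length():
    """Return the length of the string the benchmark builds (sanity check)."""
    f = '{}' + '-' * 1024 + '{}'
    s = 'abcd' * 16384
    return len(f.format(s, s))


if __name__ == "__main__":
    print(f"{usec_per_loop():.1f} usec per loop")
```

Taking the minimum over several repeats follows the usual timeit practice of reporting the least-disturbed run, which is roughly what the command-line tool reports as "per loop".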