Commit graph

405 commits

Author SHA1 Message Date
Victor Stinner
c3d2bc19e4 Use _PyBytesWriter in _PyBytes_FromIterator() 2015-10-14 14:15:49 +02:00
Victor Stinner
c5c3ba4bec Add _PyBytesWriter_Resize() function
This function gives a control to the buffer size without using min_size.
2015-10-14 13:56:47 +02:00
Victor Stinner
3c50ce39bf Factorize _PyBytes_FromList() and _PyBytes_FromTuple() code using a C macro 2015-10-14 13:50:40 +02:00
Victor Stinner
f2eafa323b Split PyBytes_FromObject() into subfunctions 2015-10-14 13:44:29 +02:00
Victor Stinner
2ec8063cc9 Modify _PyBytes_DecodeEscapeRecode() to use _PyBytesAPI
* Don't overallocate by 400% when recode is needed: only overallocate on demand
  using _PyBytesWriter.
* Use _PyLong_DigitValue to convert hexadecimal digit to int
* Create _PyBytes_DecodeEscapeRecode() subfunction
2015-10-14 13:32:13 +02:00
Victor Stinner
f6358a7e4c _PyBytesWriter_Alloc(): only use 10 bytes of the small buffer in debug mode to
enhance code to detect buffer under- and overflow.
2015-10-14 12:02:39 +02:00
Victor Stinner
2bf8993db9 Optimize bytes.fromhex() and bytearray.fromhex()
Issue #25401: Optimize bytes.fromhex() and bytearray.fromhex(): they are now
between 2x and 3.5x faster. Changes:

* Use a fast-path working on a char* string for ASCII string
* Use a slow-path for non-ASCII string
* Replace slow hex_digit_to_int() function with a O(1) lookup in
  _PyLong_DigitValue precomputed table
* Use _PyBytesWriter API to handle the buffer
* Add unit tests to check the error position in error messages
2015-10-14 11:25:33 +02:00
Victor Stinner
772b2b09f2 Optimize bytearray % args
Issue #25399: Don't create temporary bytes objects: modify _PyBytes_Format() to
create work directly on bytearray objects.

* Rename _PyBytes_Format() to _PyBytes_FormatEx() just in case if something
  outside CPython uses it
* _PyBytes_FormatEx() now uses (char*, Py_ssize_t) for the input string, so
  bytearray_format() doesn't need tot create a temporary input bytes object
* Add use_bytearray parameter to _PyBytes_FormatEx() which is passed to
  _PyBytesWriter, to create a bytearray buffer instead of a bytes buffer

Most formatting operations are now between 2.5 and 5 times faster.
2015-10-14 09:56:53 +02:00
Victor Stinner
661aaccf9d Add use_bytearray attribute to _PyBytesWriter
Issue #25399: Add a new use_bytearray attribute to _PyBytesWriter to use a
bytearray buffer, instead of using a bytes object.
2015-10-14 09:41:48 +02:00
Victor Stinner
03dab786b2 Rewrite PyBytes_FromFormatV() using _PyBytesWriter API
* Add much more unit tests on PyBytes_FromFormatV()
* Remove the first loop to compute the length of the output string
* Use _PyBytesWriter to handle the bytes buffer, use overallocation
* Cleanup the code to make simpler and easier to review
2015-10-14 00:21:35 +02:00
Victor Stinner
e9aa5950bb Fix compilation error in _PyBytesWriter_WriteBytes() on Windows 2015-10-12 13:57:47 +02:00
Victor Stinner
6c2cdae9e6 Writer APIs: use empty string singletons
Modify _PyBytesWriter_Finish() and _PyUnicodeWriter_Finish() to return the
empty bytes/Unicode string if the string is empty.
2015-10-12 13:29:43 +02:00
Victor Stinner
c29e29bed1 Relax _PyBytesWriter API
Don't require _PyBytesWriter pointer to be a "char *". Same change for
_PyBytesWriter_WriteBytes() parameter.

For example, binascii uses "unsigned char*".
2015-10-12 13:12:54 +02:00
Victor Stinner
0cdad1e2bc Issue #25349: Add fast path for b'%c' % int
Optimize also %% formater.
2015-10-09 22:50:36 +02:00
Victor Stinner
be75b8cf23 Issue #25349: Optimize bytes % int
Optimize bytes.__mod__(args) for integere formats: %d (%i, %u), %o, %x and %X.
_PyBytesWriter is now used to format directly the integer into the writer
buffer, instead of using a temporary bytes object.

Formatting is between 30% and 50% faster on a microbenchmark.
2015-10-09 22:43:24 +02:00
Victor Stinner
ce179bf6ba Add _PyBytesWriter_WriteBytes() to factorize the code 2015-10-09 12:57:22 +02:00
Victor Stinner
ad7715891e _PyBytesWriter: simplify code to avoid "prealloc" parameters
Substract preallocate bytes from min_size before calling
_PyBytesWriter_Prepare().
2015-10-09 12:38:53 +02:00
Victor Stinner
53926a1ce2 _PyBytesWriter: rename size attribute to min_size 2015-10-09 12:37:03 +02:00
Victor Stinner
fa7762ec06 Issue #25349: Optimize bytes % args using the new private _PyBytesWriter API
* Thanks to the _PyBytesWriter API, output smaller than 512 bytes are allocated
  on the stack and so avoid calling _PyBytes_Resize(). Because of that, change
  the default buffer size to fmtcnt instead of fmtcnt+100.
* Rely on _PyBytesWriter algorithm to overallocate the buffer instead of using
  a custom code. For example, _PyBytesWriter uses a different overallocation
  factor (25% or 50%) depending on the platform to get best performances.
* Disable overallocation for the last write.
* Replace C loops to fill characters with memset()
* Add also many comments to _PyBytes_Format()
* Remove unused FORMATBUFLEN constant
* Avoid the creation of a temporary bytes object when formatting a floating
  point number (when no custom formatting option is used)
* Fix also reference leaks on error handling
* Use Py_MEMCPY() to copy bytes between two formatters (%)
2015-10-09 11:48:06 +02:00
Victor Stinner
b3653a3458 Issue #25318: cleanup code _PyBytesWriter
Rename "stack buffer" to "small buffer".

Add also an assertion in _PyBytesWriter_GetPos().
2015-10-09 03:38:24 +02:00
Victor Stinner
b13b97d3b8 Issue #25318: Fix compilation error
Replace "#if Py_DEBUG" with "#ifdef Py_DEBUG".
2015-10-09 02:52:16 +02:00
Victor Stinner
0016507c16 Issue #25318: Move _PyBytesWriter to bytesobject.c
Declare also the private API in bytesobject.h.
2015-10-09 01:53:21 +02:00
Serhiy Storchaka
d92d4efe3d Issue #23573: Restored optimization of bytes.rfind() and bytearray.rfind()
for single-byte argument on Linux.
2015-07-20 22:58:02 +03:00
Serhiy Storchaka
ac5569b1fa Issue #24115: Update uses of PyObject_IsTrue(), PyObject_Not(),
PyObject_IsInstance(), PyObject_RichCompareBool() and _PyDict_Contains()
to check for and handle errors correctly.
2015-05-30 17:48:19 +03:00
Serhiy Storchaka
fa494fd883 Issue #24115: Update uses of PyObject_IsTrue(), PyObject_Not(),
PyObject_IsInstance(), PyObject_RichCompareBool() and _PyDict_Contains()
to check for and handle errors correctly.
2015-05-30 17:45:22 +03:00
Serhiy Storchaka
8b2e8b6cce Specify default values of semantic booleans in Argument Clinic generated signatures as booleans. 2015-05-30 11:30:39 +03:00
Gregory P. Smith
8cb6569fe1 Implements issue #9951: Adds a hex() method to bytes, bytearray, & memoryview.
Also updates a few internal implementations of the same thing to use the
new built-in code.

Contributed by Arnon Yaari.
2015-04-25 23:22:26 +00:00
Christian Heimes
4e25913f9f Remove local dead code. In both blocks dir is always greater 0. 2015-04-18 05:54:02 +02:00
Larry Hastings
89964c48d1 Issue #23944: Argument Clinic now wraps long impl prototypes at column 78. 2015-04-14 18:07:59 -04:00
Serhiy Storchaka
1009bf18b3 Issue #23501: Argumen Clinic now generates code into separate files by default. 2015-04-03 23:53:51 +03:00
Serhiy Storchaka
41525e31a5 Issue #23466: Raised OverflowError if %c argument is out of range. 2015-04-03 20:53:46 +03:00
Serhiy Storchaka
2c7b5a9d0d Issue #23466: %c, %o, %x, and %X in bytes formatting now raise TypeError on
non-integer input.
2015-03-30 09:19:08 +03:00
Victor Stinner
dabbfe7b30 Issue #23573: Fix bytes.rfind() and bytearray.rfind() on Windows
Windows has no memrchr() function.

This change is only a workaround, the optimization must be reenabled on other
platforms.
2015-03-25 03:16:32 +01:00
Serhiy Storchaka
d9d769fcdd Issue #23573: Increased performance of string search operations (str.find,
str.index, str.count, the in operator, str.split, str.partition) with
arguments of different kinds (UCS1, UCS2, UCS4).
2015-03-24 21:55:47 +02:00
Serhiy Storchaka
1dd49824df Issue #23681: The -b option now affects comparisons of bytes with int. 2015-03-20 16:54:57 +02:00
Ethan Furman
62e977f1b6 Close issue23467: add %r compatibility to bytes and bytearray 2015-03-11 08:17:00 -07:00
Antoine Pitrou
63afdaa110 Issue #23629: Fix the default __sizeof__ implementation for variable-sized objects. 2015-03-10 22:35:24 +01:00
Antoine Pitrou
a654510150 Issue #23629: Fix the default __sizeof__ implementation for variable-sized objects. 2015-03-10 22:32:00 +01:00
Serhiy Storchaka
26861b0b29 Issue #23450: Fixed possible integer overflows. 2015-02-16 20:52:17 +02:00
Serhiy Storchaka
ea5ce5a15e Issue #23383: Cleaned up bytes formatting. 2015-02-10 23:23:12 +02:00
Serhiy Storchaka
83848704f5 Issue #22896: Fixed using _getbuffer() in recently added _PyBytes_Format(). 2015-02-03 01:49:18 +02:00
Serhiy Storchaka
3dd3e26680 Issue #22896: Avoid to use PyObject_AsCharBuffer(), PyObject_AsReadBuffer()
and PyObject_AsWriteBuffer().
2015-02-03 01:25:42 +02:00
Serhiy Storchaka
4fdb68491e Issue #22896: Avoid to use PyObject_AsCharBuffer(), PyObject_AsReadBuffer()
and PyObject_AsWriteBuffer().
2015-02-03 01:21:08 +02:00
Victor Stinner
5474d0ba19 Issue #20284: Fix a compilation warning on Windows
Explicitly cast the long to char.
2015-01-26 16:43:36 +01:00
Benjamin Peterson
a8efc9601d ensure ilen is initialized when it is assigned to len 2015-01-26 09:23:41 -05:00
Ethan Furman
b95b56150f Issue20284: Implement PEP461 2015-01-23 20:05:18 -08:00
Serhiy Storchaka
83cf99d733 Issue #20335: bytes constructor now raises TypeError when encoding or errors
is specified with non-string argument.  Based on patch by Renaud Blanch.
2014-12-02 09:24:06 +02:00
Serhiy Storchaka
0b2cacb42a Issue #20335: bytes constructor now raises TypeError when encoding or errors
is specified with non-string argument.  Based on patch by Renaud Blanch.
2014-12-02 09:26:14 +02:00
Larry Hastings
dfbeb160de Issue #22615: Argument Clinic now supports the "type" argument for the
int converter.  This permits using the int converter with enums and
typedefs.
2014-10-13 10:39:41 +01:00
R David Murray
861470c836 #16518: Bring error messages in harmony with docs ("bytes-like object")
Some time ago we changed the docs to consistently use the term 'bytes-like
object' in all the contexts where bytes, bytearray, memoryview, etc are used.
This patch (by Ezio Melotti) completes that work by changing the error
messages that previously reported that certain types did "not support the
buffer interface" to instead say that a bytes-like object is required.  (The
glossary entry for bytes-like object references the discussion of the buffer
protocol in the docs.)
2014-10-05 11:47:01 -04:00