Write a function to enable more optimizations:
* If the substring is the whole string and overallocation is disabled, just
keep a reference to the string, don't copy characters
* Avoid a call to the expensive _PyUnicode_FindMaxChar() function when
possible
"honor direction and do a forward or backwards search": the runtime speed may
be different, but I consider that it doesn't really matter in practice. The
direction was never honored before: Python 2.7 uses memcmp() for the str type
for example.
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.
computation as the overflow behavior of signed integers is undefined.
NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done. I added the comment that my change in 3.2 added so that the
code would match up. Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent. We could work to get rid of the -fwrapv requirement
in 3.4 but that requires more planning.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
computation as the overflow behavior of signed integers is undefined.
NOTE: This change is smaller compared to 3.2 as much of this cleanup had
already been done. I added the comment that my change in 3.2 added so that the
code would match up. Otherwise this just adds or synchronizes appropriate UL
designations on some constants to be pedantic.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
computation as the overflow behavior of signed integers is undefined.
In practice we require compiling everything with -fwrapv which forces overflow
to be defined as twos compliment but this keeps the code cleaner for checkers
or in the case where someone has compiled it without -fwrapv or their
compiler's equivalent.
Found by Clang trunk's Undefined Behavior Sanitizer (UBSan).
Cleanup only - no functionality or hash values change.
* Remove micro-optization:
(errors == "surrogateescape" || strcmp(errors, "surrogateescape") == 0).
Only use strcmp()
* Initialize 'arg' members in unicode_format_arg() to help the compiler to
diagnose real bugs and also make the code simpler to read
ASCII/surrogateescape codec is now used, instead of the locale encoding, to
decode the command line arguments. This change fixes inconsistencies with
os.fsencode() and os.fsdecode() because these operating systems announces an
ASCII locale encoding, whereas the ISO-8859-1 encoding is used in practice.