Victor Stinner
1d4b35f4e5
rephrase PyUnicode_1BYTE_KIND documentation
2011-10-06 01:51:19 +02:00
Victor Stinner
fb9ea8c57e
Don't check for the maximum character when copying from unicodeobject.c
...
* Create copy_characters() function which doesn't check for the maximum
character in release mode
* _PyUnicode_CheckConsistency() is no longer static, so that it can be used
in _PyUnicode_FormatAdvanced() (in formatter_unicode.c)
* _PyUnicode_CheckConsistency() checks the string hash
2011-10-06 01:45:57 +02:00
Éric Araujo
80a348c0a0
Fix typo
2011-10-05 01:11:12 +02:00
Victor Stinner
30134f53fc
Complete documentation of compact ASCII strings
2011-10-04 01:32:45 +02:00
Victor Stinner
a41463c203
Document utf8_length and wstr_length states
...
Ensure these states with assertions in _PyUnicode_CheckConsistency().
2011-10-04 01:05:08 +02:00
Victor Stinner
7f11ad4594
Unicode: document when the wstr pointer is shared with data
...
Also add related assertions to _PyUnicode_CheckConsistency().
2011-10-04 00:00:20 +02:00
Victor Stinner
8cfcbed4e3
Improve string forms and PyUnicode_Resize() documentation
...
Also remove the FIXME for resize_copy(): as discussed with Martin, copying the
string on resize when the string is not resizable is just fine.
2011-10-03 23:19:21 +02:00
Victor Stinner
c3cec7868b
Add asciilib: similar to the ucs1, ucs2 and ucs4 libraries, but specialized for ASCII
...
The ucs1, ucs2 and ucs4 libraries have to scan each created substring to find the
maximum character, whereas this is not needed for ASCII strings. Because ASCII
strings are common, it is useful to optimize for ASCII.
2011-10-05 21:24:08 +02:00
Victor Stinner
4d0d54bcba
Document requirements of Unicode kinds
2011-10-05 01:31:05 +02:00
Georg Brandl
07de325672
More fixes.
2011-10-05 16:47:38 +02:00
Georg Brandl
c6bc4c6897
Fix a few typos in the unicode header.
2011-10-05 16:23:09 +02:00
Georg Brandl
4975a9b44d
Fix grammar.
2011-10-05 16:12:21 +02:00
Victor Stinner
b9275c104e
Speedup str[a:b] and PyUnicode_FromKindAndData
...
* str[a:b] doesn't scan the string for the maximum character if the string
is ASCII only
* PyUnicode_FromKindAndData() stops scanning as soon as it is sure that a
shorter character type cannot be used. For example, _PyUnicode_FromUCS1()
stops once it has seen at least one character in the range U+0080-U+00FF
2011-10-05 14:01:42 +02:00
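
The early-exit scan described in the commit above can be illustrated with a
minimal sketch. It is not CPython's implementation: ucs1_max_char() is a
hypothetical helper, and the return values 0x7F and 0xFF are assumed here only
to mark "ASCII is enough" versus "Latin-1 is required".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not part of CPython.
   Reports an upper bound on the characters in a 1-byte buffer:
   0x7F if every byte is ASCII, 0xFF otherwise. */
static uint32_t
ucs1_max_char(const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (data[i] >= 0x80) {
            /* One byte in U+0080-U+00FF is enough: no narrower
               representation is possible, so stop scanning here. */
            return 0xFF;
        }
    }
    /* Every byte was below 0x80 (or the buffer was empty). */
    return 0x7F;
}
```

The loop only runs to the end for pure ASCII input; any other input returns at
the first non-ASCII byte.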
Victor Stinner
85041a54bd
_PyUnicode_CheckConsistency() checks utf8 field consistency
2011-10-03 14:42:39 +02:00
Victor Stinner
a3b334da6d
PyUnicode_Ready() now sets ascii=1 if maxchar < 128
...
ascii=1 is no longer reserved for PyASCIIObject. Use
PyUnicode_IS_COMPACT_ASCII(obj) to check if obj is a PyASCIIObject (as before).
2011-10-03 13:53:37 +02:00
Victor Stinner
910337b42e
Add _PyUnicode_CheckConsistency() macro to help debugging
...
* Document Unicode string states
* Use _PyUnicode_CheckConsistency() to ensure that objects are always
consistent.
2011-10-03 03:20:16 +02:00
Victor Stinner
37943769ef
PyUnicode_READ_CHAR() ensures that the string is ready
2011-10-02 20:33:18 +02:00
Victor Stinner
7a48ff7e06
Use Py_UCS1 instead of unsigned char in unicodeobject.h
2011-10-02 00:55:25 +02:00
Victor Stinner
cd9950fd09
PyUnicode_WriteChar() raises IndexError on invalid index
...
PyUnicode_WriteChar() also raises a ValueError if the string has more than 1
reference.
2011-10-02 00:34:53 +02:00
Victor Stinner
9f789e7f63
_PyUnicode_AsKind() is *not* part of the stable ABI
2011-10-01 03:57:28 +02:00
Victor Stinner
4584a5ba1a
PyUnicode_CHARACTER_SIZE(): add a reference to PyUnicode_KIND_SIZE()
2011-10-01 02:39:37 +02:00
Victor Stinner
034f6cf10c
Add PyUnicode_Copy() function, include it in the public API
2011-09-30 02:26:44 +02:00
Victor Stinner
d8f6510acc
_PyUnicode_Ready() cannot be used on ready strings anymore
...
* Change its prototype: PyObject* instead of PyUnicodeObject*.
* Remove an old assertion; the result of PyUnicode_READY (_PyUnicode_Ready)
must be checked instead
2011-09-29 19:43:17 +02:00
Victor Stinner
bc8b81bc4e
Move _PyUnicode_UTF8() and _PyUnicode_UTF8_LENGTH() outside unicodeobject.h
...
Move these macros to unicodeobject.c
2011-09-29 19:31:34 +02:00
Victor Stinner
a0702ab1fe
Add a note in PyUnicode_CopyCharacters() doc: it doesn't write a null character
...
Also clean up the code (avoid the goto).
2011-09-29 14:14:38 +02:00
Victor Stinner
f5ca1a21a5
PyUnicode_CopyCharacters() fails if 'to' has more than 1 reference
2011-09-28 23:54:59 +02:00
Victor Stinner
17222160e7
Mark _PyUnicode_FindMaxCharAndNumSurrogatePairs() as private
2011-09-28 22:15:37 +02:00
Victor Stinner
157f83fcfc
Strip trailing spaces in unicodeobject.[ch]
2011-09-28 21:41:31 +02:00
Victor Stinner
be78eaf2de
PyUnicode_CopyCharacters() checks for buffer and character overflow
...
It now returns the number of written characters on success.
2011-09-28 21:37:03 +02:00
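
The two checks named in the commit above (buffer overflow and character
overflow) can be sketched outside of CPython as follows. This is only an
illustration under stated assumptions: copy_ucs2_to_ucs1() is a hypothetical
helper working on plain arrays, and returning -1 stands in for the exception
the real function would raise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only; not part of CPython.
   Copies how_many 2-byte characters from from[from_start...] into the
   1-byte buffer to[to_start...].
   Returns the number of characters written, or -1 on error. */
static ptrdiff_t
copy_ucs2_to_ucs1(uint8_t *to, size_t to_len, size_t to_start,
                  const uint16_t *from, size_t from_len, size_t from_start,
                  size_t how_many)
{
    /* Buffer overflow checks, written to avoid size_t wrap-around. */
    if (from_start > from_len || how_many > from_len - from_start)
        return -1;
    if (to_start > to_len || how_many > to_len - to_start)
        return -1;

    /* Character overflow check: every character must fit in one byte. */
    for (size_t i = 0; i < how_many; i++) {
        if (from[from_start + i] > 0xFF)
            return -1;
    }

    /* All checks passed: copy. No null terminator is written. */
    for (size_t i = 0; i < how_many; i++)
        to[to_start + i] = (uint8_t)from[from_start + i];

    return (ptrdiff_t)how_many;
}
```

Validating every character before writing anything is a design choice of this
sketch: a failed call leaves the destination untouched.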
Victor Stinner
fb5f5f2420
Mark PyUnicode_CONVERT_BYTES as private
2011-09-28 21:39:49 +02:00
Victor Stinner
5ce1b0dbc0
Set Py_UNICODE_REPLACEMENT_CHARACTER type to Py_UCS4, instead of Py_UNICODE
2011-09-28 20:29:27 +02:00
Martin v. Löwis
d63a3b8beb
Implement PEP 393.
2011-09-28 07:41:54 +02:00
Victor Stinner
f955eb210f
Merge 3.2: Fix PyUnicode_AsWideCharString() doc
...
- Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null
character
- Fix spelling of the null character
2011-09-06 02:01:29 +02:00
Victor Stinner
d88d9836c5
Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character
...
Also fix the spelling of the null character.
2011-09-06 02:00:05 +02:00
Ezio Melotti
8c9375bb59
#10542: Add 4 macros to work with surrogates: Py_UNICODE_IS_SURROGATE, Py_UNICODE_IS_HIGH_SURROGATE, Py_UNICODE_IS_LOW_SURROGATE, Py_UNICODE_JOIN_SURROGATES.
2011-08-22 20:03:25 +03:00
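
For readers unfamiliar with surrogates, the kind of test and join these four
macros perform can be sketched with standard UTF-16 arithmetic. The SKETCH_*
names below are hypothetical stand-ins, not the definitions from
unicodeobject.h, and uint32_t is assumed in place of Python's own character
types.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only; not CPython's macros. */

/* Any surrogate code unit: U+D800-U+DFFF. */
#define SKETCH_IS_SURROGATE(ch)      (0xD800 <= (ch) && (ch) <= 0xDFFF)
/* High (leading) surrogate: U+D800-U+DBFF. */
#define SKETCH_IS_HIGH_SURROGATE(ch) (0xD800 <= (ch) && (ch) <= 0xDBFF)
/* Low (trailing) surrogate: U+DC00-U+DFFF. */
#define SKETCH_IS_LOW_SURROGATE(ch)  (0xDC00 <= (ch) && (ch) <= 0xDFFF)
/* Combine a high/low pair into one code point above U+FFFF:
   10 bits from each half, offset by 0x10000. */
#define SKETCH_JOIN_SURROGATES(high, low) \
    (((((uint32_t)(high) & 0x03FF) << 10) | \
      ((uint32_t)(low) & 0x03FF)) + 0x10000)
```

For example, the pair 0xD83D, 0xDE00 joins to U+1F600.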
Victor Stinner
99b9538636
Issue #9642: Uniformize the tests on the availability of the mbcs codec
...
Add a new HAVE_MBCS define.
2011-07-04 14:23:54 +02:00
Victor Stinner
f3fd733f92
Remove useless argument of _PyUnicode_AsDefaultEncodedString()
2011-03-02 01:03:11 +00:00
Victor Stinner
0d711169fa
Issue #9738: Oops, fix typos in my previous commit (r87506)
2010-12-27 02:39:20 +00:00
Victor Stinner
dc2081f72b
Issue #9738: document encodings of unicode functions
2010-12-27 01:49:29 +00:00
Georg Brandl
b550308597
Take PyUnicode_TransformDecimalToASCII out of the limited API.
2010-12-05 11:40:48 +00:00
Alexander Belopolsky
942af5a9a4
Issue #10557: Fixed error messages from float() and other numeric
...
types. Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
Martin v. Löwis
4d0d471a80
Merge branches/pep-0384.
2010-12-03 20:14:31 +00:00
Alexander Belopolsky
83283c270a
Issue #10413: Updated comments to reflect code changes
2010-11-16 14:29:01 +00:00
Victor Stinner
09f24bb408
Issue #8761: Mangle PyUnicode_CompareWithASCIIString function name for
...
narrow/wide unicode build.
2010-10-24 20:38:25 +00:00
Benjamin Peterson
8f67d0893f
make hashes always the size of pointers; introduce Py_hash_t #9778
2010-10-17 20:54:53 +00:00
Victor Stinner
f3170ccef8
Use locale encoding if Py_FileSystemDefaultEncoding is not set
...
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
Py_FileSystemDefaultEncoding is NULL
* redecode_filenames() functions and _Py_code_object_list (issue #9630)
are no longer needed: remove them
2010-10-15 12:04:23 +00:00
Victor Stinner
beb4135b8c
PyUnicode_AsWideCharString() takes a PyObject*, not a PyUnicodeObject*
...
All unicode functions use PyObject* except PyUnicode_AsWideChar(). Fix the
prototype for the new function PyUnicode_AsWideCharString().
2010-10-07 01:02:42 +00:00
Victor Stinner
137c34c027
Issue #9979: Create function PyUnicode_AsWideCharString().
2010-09-29 10:25:54 +00:00
Amaury Forgeot d'Arc
feb7307db4
#9210: remove --with-wctype-functions configure option.
...
The internal unicode database is now always used.
(after 5 years: see
http://mail.python.org/pipermail/python-dev/2004-December/050193.html)
2010-09-12 22:42:57 +00:00
Victor Stinner
1205f2774e
Issue #9738: PyUnicode_FromFormat() and PyErr_Format() raise an error on
...
a non-ASCII byte in the format string.
Also document the encoding.
2010-09-11 00:54:47 +00:00