Commit graph

284 commits

Author SHA1 Message Date
Victor Stinner
5ce1b0dbc0 Set Py_UNICODE_REPLACEMENT_CHARACTER type to Py_UCS4, instead of Py_UNICODE 2011-09-28 20:29:27 +02:00
Martin v. Löwis
d63a3b8beb Implement PEP 393. 2011-09-28 07:41:54 +02:00
Victor Stinner
f955eb210f Merge 3.2: Fix PyUnicode_AsWideCharString() doc
- Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null
   character
 - Fix spelling of the null character
2011-09-06 02:01:29 +02:00
Victor Stinner
d88d9836c5 Fix PyUnicode_AsWideCharString() doc: size doesn't contain the null character
Fix also spelling of the null character.
2011-09-06 02:00:05 +02:00
Ezio Melotti
8c9375bb59 #10542: Add 4 macros to work with surrogates: Py_UNICODE_IS_SURROGATE, Py_UNICODE_IS_HIGH_SURROGATE, Py_UNICODE_IS_LOW_SURROGATE, Py_UNICODE_JOIN_SURROGATES. 2011-08-22 20:03:25 +03:00
Victor Stinner
99b9538636 Issue #9642: Uniformize the tests on the availability of the mbcs codec
Add a new HAVE_MBCS define.
2011-07-04 14:23:54 +02:00
Victor Stinner
f3fd733f92 Remove useless argument of _PyUnicode_AsDefaultEncodedString() 2011-03-02 01:03:11 +00:00
Victor Stinner
0d711169fa Issue #9738: Ooops, fix typos in my previous commit (r87506) 2010-12-27 02:39:20 +00:00
Victor Stinner
dc2081f72b Issue #9738: document encodings of unicode functions 2010-12-27 01:49:29 +00:00
Georg Brandl
b550308597 Take PyUnicode_TransformDecimalToASCII out of the limited API. 2010-12-05 11:40:48 +00:00
Alexander Belopolsky
942af5a9a4 Issue #10557: Fixed error messages from float() and other numeric
types.  Added a new API function, PyUnicode_TransformDecimalToASCII(),
which transforms non-ASCII decimal digits in a Unicode string to their
ASCII equivalents.
2010-12-04 03:38:46 +00:00
Martin v. Löwis
4d0d471a80 Merge branches/pep-0384. 2010-12-03 20:14:31 +00:00
Alexander Belopolsky
83283c270a Issue #10413: Updated comments to reflect code changes 2010-11-16 14:29:01 +00:00
Victor Stinner
09f24bb408 Issue #8761: Mangle PyUnicode_CompareWithASCIIString function name for
narrow/wide unicode build.
2010-10-24 20:38:25 +00:00
Benjamin Peterson
8f67d0893f make hashes always the size of pointers; introduce Py_hash_t #9778 2010-10-17 20:54:53 +00:00
Victor Stinner
f3170ccef8 Use locale encoding if Py_FileSystemDefaultEncoding is not set
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
   PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
   Py_FileSystemDefaultEncoding is NULL
 * redecode_filenames() functions and _Py_code_object_list (issue #9630)
   are no more needed: remove them
2010-10-15 12:04:23 +00:00
Victor Stinner
beb4135b8c PyUnicode_AsWideCharString() takes a PyObject*, not a PyUnicodeObject*
All unicode functions uses PyObject* except PyUnicode_AsWideChar(). Fix the
prototype for the new function PyUnicode_AsWideCharString().
2010-10-07 01:02:42 +00:00
Victor Stinner
137c34c027 Issue #9979: Create function PyUnicode_AsWideCharString(). 2010-09-29 10:25:54 +00:00
Amaury Forgeot d'Arc
feb7307db4 #9210: remove --with-wctype-functions configure option.
The internal unicode database is now always used.

(after 5 years: see
  http://mail.python.org/pipermail/python-dev/2004-December/050193.html
)
2010-09-12 22:42:57 +00:00
Victor Stinner
1205f2774e Issue #9738: PyUnicode_FromFormat() and PyErr_Format() raise an error on
a non-ASCII byte in the format string.

Document also the encoding.
2010-09-11 00:54:47 +00:00
Victor Stinner
46408606d8 Rename PyUnicode_strdup() to PyUnicode_AsUnicodeCopy() 2010-09-03 16:18:00 +00:00
Victor Stinner
71133ff368 Create PyUnicode_strdup() function 2010-09-01 23:43:53 +00:00
Victor Stinner
c4eb765fc1 Create Py_UNICODE_strcat() function 2010-09-01 23:43:50 +00:00
Antoine Pitrou
fce7fd6426 Issue #9549: sys.setdefaultencoding() and PyUnicode_SetDefaultEncoding()
are now removed, since their effect was inexistent in 3.x (the default
encoding is hardcoded to utf-8 and cannot be changed).
2010-09-01 18:54:56 +00:00
Amaury Forgeot d'Arc
324ac65ceb #5127: Even on narrow unicode builds, the C functions that access the Unicode
Database (Py_UNICODE_TOLOWER, Py_UNICODE_ISDECIMAL, and others) now accept
and return characters from the full Unicode range (Py_UCS4).

The differences from Python code are few:
- unicodedata.numeric(), unicodedata.decimal() and unicodedata.digit()
  now return the correct value for large code points
- repr() may consider more characters as printable.
2010-08-18 20:44:58 +00:00
Victor Stinner
ef8d95c498 Issue #9425: Create Py_UNICODE_strncmp() function
The code is based on strncmp() of the libiberty library,
function in the public domain.
2010-08-16 22:03:11 +00:00
Victor Stinner
47fcb5b4c3 Issue #9542: Create PyUnicode_FSDecoder() function
It's a ParseTuple converter: decode bytes objects to unicode using
PyUnicode_DecodeFSDefaultAndSize(); str objects are output as-is.

 * Don't specify surrogateescape error handler in the comments nor the
   documentation, but PyUnicode_DecodeFSDefaultAndSize() and
   PyUnicode_EncodeFSDefault() because these functions use strict error handler
   for the mbcs encoding (on Windows).
 * Remove PyUnicode_FSConverter() comment in unicodeobject.c to avoid
   inconsistency with unicodeobject.h.
2010-08-13 23:59:58 +00:00
Victor Stinner
331ea92ade Issue #9425: create Py_UNICODE_strrchr() function 2010-08-10 16:37:20 +00:00
Georg Brandl
952867aa30 #9078: fix some Unicode C API descriptions, in comments and docs. 2010-06-27 10:17:12 +00:00
Benjamin Peterson
ccbd69437a rephrase 2010-05-15 17:43:18 +00:00
Victor Stinner
ae6265f8d0 Issue #8715: Create PyUnicode_EncodeFSDefault() function: Encode a Unicode
object to Py_FileSystemDefaultEncoding with the "surrogateescape" error
handler, return a bytes object. If Py_FileSystemDefaultEncoding is not set,
fall back to UTF-8.
2010-05-15 16:27:27 +00:00
Victor Stinner
77c3862417 Issue #8711: Document PyUnicode_DecodeFSDefault*() functions
* Add paragraph titles to c-api/unicode.rst.
 * Fix PyUnicode_DecodeFSDefault*() comment: it now uses the "surrogateescape"
   error handler (and not "replace")
 * Remove "The function is intended to be used for paths and file names only
   during bootstrapping process where the codecs are not set up." from
   PyUnicode_FSConverter() comment: it is used after the bootstrapping and for
   other purposes than file names
2010-05-14 15:58:55 +00:00
Antoine Pitrou
f95a1b3c53 Recorded merge of revisions 81029 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81029 | antoine.pitrou | 2010-05-09 16:46:46 +0200 (dim., 09 mai 2010) | 3 lines

  Untabify C files. Will watch buildbots.
........
2010-05-09 15:52:27 +00:00
Benjamin Peterson
ad465f904b alias PyUnicode_CompareWithASCII 2010-05-07 20:21:26 +00:00
Victor Stinner
dcb2403022 Issue #8485: PyUnicode_FSConverter() doesn't accept bytearray object anymore,
you have to convert your bytearray filenames to bytes
2010-04-22 12:08:36 +00:00
Martin v. Löwis
011e842033 Issue #5915: Implement PEP 383, Non-decodable Bytes in
System Character Interfaces.
2009-05-05 04:43:17 +00:00
Antoine Pitrou
244651aa2f Merged revisions 72283-72284 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r72283 | antoine.pitrou | 2009-05-04 20:32:32 +0200 (lun., 04 mai 2009) | 4 lines

  Issue #4426: The UTF-7 decoder was too strict and didn't accept some legal sequences.
  Patch by Nick Barnes and Victor Stinner.
........
  r72284 | antoine.pitrou | 2009-05-04 20:32:50 +0200 (lun., 04 mai 2009) | 3 lines

  Add Nick Barnes to ACKS.
........
2009-05-04 18:56:13 +00:00
Eric Smith
0923d1d8d7 The other half of Issue #1580: use short float repr where possible.
Addresses the float -> string conversion, using David Gay's code which
was added in Mark Dickinson's checkin r71663.

Also addresses these, which are intertwined with the short repr
changes:

- Issue #5772: format(1e100, '<') produces '1e+100', not '1.0e+100'
- Issue #5515: 'n' formatting with commas no longer works poorly
    with leading zeros.
- PEP 378 Format Specifier for Thousands Separator: implemented
    for floats.
2009-04-16 20:16:10 +00:00
Eric Smith
a3b1ac8dca Added ',' thousands grouping to int.__format__. See PEP 378.
This is incomplete, but I want to get some version into the next alpha. I am still working on:
Documentation.
More tests.
Implement for floats.

In addition, there's an existing bug with 'n' formatting that carries forward to thousands grouping (issue 5515).
2009-04-03 14:45:06 +00:00
Benjamin Peterson
960cf0fd9b Merged revisions 68167,68276,68292-68293,68344 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r68167 | vinay.sajip | 2009-01-02 12:53:04 -0600 (Fri, 02 Jan 2009) | 1 line

  Minor documentation changes relating to NullHandler, the module used for handlers and references to ConfigParser.
........
  r68276 | tarek.ziade | 2009-01-03 18:04:49 -0600 (Sat, 03 Jan 2009) | 1 line

  fixed #1702551: distutils sdist was not pruning VCS directories under win32
........
  r68292 | skip.montanaro | 2009-01-04 04:36:58 -0600 (Sun, 04 Jan 2009) | 3 lines

  If user configures --without-gcc give preference to $CC instead of blindly
  assuming the compiler will be "cc".
........
  r68293 | tarek.ziade | 2009-01-04 04:37:52 -0600 (Sun, 04 Jan 2009) | 1 line

  using clearer syntax
........
  r68344 | marc-andre.lemburg | 2009-01-05 13:43:35 -0600 (Mon, 05 Jan 2009) | 7 lines

  Fix #4846 (Py_UNICODE_ISSPACE causes linker error) by moving the declaration
  into the extern "C" section.

  Add a few more comments and apply some minor edits to make the file contents
  fit the original structure again.
........
2009-01-09 04:11:44 +00:00
Alexandre Vassalotti
15fafbe6f2 Merged revisions 67970-67971 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r67970 | alexandre.vassalotti | 2008-12-27 20:52:58 -0500 (Sat, 27 Dec 2008) | 2 lines

  Fix name mangling of PyUnicode_ClearFreeList.
........
  r67971 | alexandre.vassalotti | 2008-12-27 21:10:35 -0500 (Sat, 27 Dec 2008) | 2 lines

  Sort UCS-2/UCS-4 name mangling list.
........
2008-12-28 02:13:22 +00:00
Benjamin Peterson
206e3074d3 Merged revisions 66887,66891,66902-66903,66905-66906,66911-66913,66922,66927-66928,66936,66939-66940,66962,66964,66973 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

................
  r66887 | benjamin.peterson | 2008-10-13 16:51:40 -0500 (Mon, 13 Oct 2008) | 1 line

  document how to disable fixers
................
  r66891 | amaury.forgeotdarc | 2008-10-14 16:47:22 -0500 (Tue, 14 Oct 2008) | 5 lines

  #4122: On Windows, Py_UNICODE_ISSPACE cannot be used in an extension module:
  compilation fails with "undefined reference to _Py_ascii_whitespace"

  Will backport to 2.6.
................
  r66902 | skip.montanaro | 2008-10-15 06:49:10 -0500 (Wed, 15 Oct 2008) | 1 line

  easter egg
................
  r66903 | benjamin.peterson | 2008-10-15 15:34:09 -0500 (Wed, 15 Oct 2008) | 1 line

  don't recurse into directories that start with '.'
................
  r66905 | benjamin.peterson | 2008-10-15 16:05:55 -0500 (Wed, 15 Oct 2008) | 1 line

  support the optional line argument for idle
................
  r66906 | benjamin.peterson | 2008-10-15 16:58:46 -0500 (Wed, 15 Oct 2008) | 1 line

  add a much requested newline
................
  r66911 | benjamin.peterson | 2008-10-15 18:10:28 -0500 (Wed, 15 Oct 2008) | 41 lines

  Merged revisions 66805,66841,66860,66884-66886,66893,66907,66910 via svnmerge from
  svn+ssh://pythondev@svn.python.org/sandbox/trunk/2to3/lib2to3

  ........
    r66805 | benjamin.peterson | 2008-10-04 20:11:02 -0500 (Sat, 04 Oct 2008) | 1 line

    mention what the fixes directory is for
  ........
    r66841 | benjamin.peterson | 2008-10-07 17:48:12 -0500 (Tue, 07 Oct 2008) | 1 line

    use assertFalse and assertTrue
  ........
    r66860 | benjamin.peterson | 2008-10-08 16:05:07 -0500 (Wed, 08 Oct 2008) | 1 line

    instead of abusing the pattern matcher, use start_tree to find a next binding
  ........
    r66884 | benjamin.peterson | 2008-10-13 15:50:30 -0500 (Mon, 13 Oct 2008) | 1 line

    don't print tokens to stdout when -v is given
  ........
    r66885 | benjamin.peterson | 2008-10-13 16:28:57 -0500 (Mon, 13 Oct 2008) | 1 line

    add the -x option to disable fixers
  ........
    r66886 | benjamin.peterson | 2008-10-13 16:33:53 -0500 (Mon, 13 Oct 2008) | 1 line

    cut down on some crud
  ........
    r66893 | benjamin.peterson | 2008-10-14 17:16:54 -0500 (Tue, 14 Oct 2008) | 1 line

    add an optional set literal fixer
  ........
    r66907 | benjamin.peterson | 2008-10-15 16:59:41 -0500 (Wed, 15 Oct 2008) | 1 line

    don't write backup files by default
  ........
    r66910 | benjamin.peterson | 2008-10-15 17:43:10 -0500 (Wed, 15 Oct 2008) | 1 line

    add the -n option; it stops backupfiles from being written
  ........
................
  r66912 | hirokazu.yamamoto | 2008-10-16 01:25:25 -0500 (Thu, 16 Oct 2008) | 2 lines

  removed unused _PyUnicode_FromFileSystemEncodedObject.
  made win32_chdir, win32_wchdir static.
................
  r66913 | benjamin.peterson | 2008-10-16 13:52:14 -0500 (Thu, 16 Oct 2008) | 1 line

  document that deque indexing is O(n) #4123
................
  r66922 | benjamin.peterson | 2008-10-16 14:40:14 -0500 (Thu, 16 Oct 2008) | 1 line

  use new showwarnings signature for idle #3391
................
  r66927 | andrew.kuchling | 2008-10-16 15:15:47 -0500 (Thu, 16 Oct 2008) | 1 line

  Fix wording (2.6.1 backport candidate)
................
  r66928 | georg.brandl | 2008-10-16 15:20:56 -0500 (Thu, 16 Oct 2008) | 2 lines

  Add more TOC to the whatsnew index page.
................
  r66936 | georg.brandl | 2008-10-16 16:20:15 -0500 (Thu, 16 Oct 2008) | 2 lines

  #4131: FF3 doesn't write cookies.txt files.
................
  r66939 | georg.brandl | 2008-10-16 16:36:39 -0500 (Thu, 16 Oct 2008) | 2 lines

  part of #4012: kill off old name "processing".
................
  r66940 | georg.brandl | 2008-10-16 16:38:48 -0500 (Thu, 16 Oct 2008) | 2 lines

  #4083: add "as" to except handler grammar as per PEP 3110.
................
  r66962 | benjamin.peterson | 2008-10-17 15:01:01 -0500 (Fri, 17 Oct 2008) | 1 line

  clarify CALL_FUNCTION #4141
................
  r66964 | georg.brandl | 2008-10-17 16:41:49 -0500 (Fri, 17 Oct 2008) | 2 lines

  Fix duplicate word.
................
  r66973 | armin.ronacher | 2008-10-19 03:27:43 -0500 (Sun, 19 Oct 2008) | 3 lines

  Fixed #4067 by implementing _attributes and _fields for the AST root node.
................
2008-10-19 14:07:49 +00:00
Marc-André Lemburg
4cc0f24857 Rename PyUnicode_AsString -> _PyUnicode_AsString and
PyUnicode_AsStringAndSize -> _PyUnicode_AsStringAndSize to mark
them for interpreter internal use only.

We'll have to rework these APIs or create new ones for the
purpose of accessing the UTF-8 representation of Unicode objects
for 3.1.
2008-08-07 18:54:33 +00:00
Eric Smith
6d7e7a730e Merged revisions 64491 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r64491 | eric.smith | 2008-06-23 20:42:10 -0400 (Mon, 23 Jun 2008) | 1 line

  Modified interface to _Py_[String|Unicode]InsertThousandsGrouping, in anticipation of fixing issue 3140.
........
2008-06-24 01:06:47 +00:00
Georg Brandl
559e5d7f4d #2630: Implement PEP 3138.
The repr() of a string now contains printable Unicode characters unescaped.
The new ascii() builtin can be used to get a repr() with only ASCII characters in it.

PEP and patch were written by Atsuo Ishimoto.
2008-06-11 18:37:52 +00:00
Marc-André Lemburg
b2750b5d33 Move the codec decode type checks to bytes/bytearray.decode().
Use faster PyUnicode_FromEncodedObject() for bytes/bytearray.decode().

Add new PyCodec_KnownEncoding() API.

Add new PyUnicode_AsDecodedUnicode() and PyUnicode_AsEncodedUnicode() APIs.

Add missing PyUnicode_AsDecodedObject() to unicodeobject.h

Fix punicode codec to also work on memoryviews.
2008-06-06 12:18:17 +00:00
Georg Brandl
a26f8ca668 Revert r63934 -- it was mixing two patches. 2008-06-04 13:01:30 +00:00
Georg Brandl
f954c4b9fb Remove meaning of -ttt, but still accept -t option on cmdline for compatibility. 2008-06-04 11:41:32 +00:00
Eric Smith
4a7d76ddb5 Refactor and clean up str.format() code (and helpers) in advance of optimizations. 2008-05-30 18:10:19 +00:00
Eric Smith
5807c415c5 Merged revisions 63078 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

When forward porting this, I added _PyUnicode_InsertThousandsGrouping.

........
  r63078 | eric.smith | 2008-05-11 15:52:48 -0400 (Sun, 11 May 2008) | 14 lines

  Addresses issue 2802: 'n' formatting for integers.

  Adds 'n' as a format specifier for integers, to mirror the same
  specifier which is already available for floats.  'n' is the same as
  'd', but inserts the current locale-specific thousands grouping.

  I added this as a stringlib function, but it's only used by str type,
  not unicode.  This is because of an implementation detail in
  unicode.format(), which does its own str->unicode conversion.  But the
  unicode version will be needed in 3.0, and it may be needed by other
  code eventually in 2.6 (maybe decimal?), so I left it as a stringlib
  implementation.  As long as the unicode version isn't instantiated,
  there's no overhead for this.
........
2008-05-11 21:00:57 +00:00