218 commits

Author SHA1 Message Date
RUANG (James Roy)
9f25c1f012
gh-46236: Add docs for PyUnicode_GetDefaultEncoding() (GH-130335)
* Clarify sys.getdefaultencoding() documentation

* Add missing documentation for PyUnicode_GetDefaultEncoding,
  the C equivalent of sys.getdefaultencoding
2025-02-24 15:37:21 +01:00
Marc Mueller
0f5b82169e
gh-46236: Document PyUnicode_RSplit, PyUnicode_Partition and PyUnicode_RPartition (#130191)
Co-authored-by: Petr Viktorin <encukou@gmail.com>
2025-02-20 16:41:41 +01:00
Stan Ulbrych
3402e133ef
gh-82045: Correct and deduplicate "isprintable" docs; add test. (GH-130118)
We had the definition of what makes a character "printable" documented in three places, giving two different definitions.
The definition in the comment on `_PyUnicode_IsPrintable` was inverted; correct that.

With that correction, the two definitions turn out to be equivalent -- but to confirm that, you have to go look up, or happen to know, that those are the only five "Other" categories and only three "Separator" categories in the Unicode character database.  That makes it hard for the reader to tell whether they really are the same, or if there's some subtle difference in the intended semantics.

Fix that by cutting the C API docs' and the C comment's copies of the subtle details, in favor of referring to the Python-level docs. That ensures it's explicit that these are all meant to agree, and also lets us concentrate improvements to the wording in one place.

Speaking of which, borrow some ideas from the C comment, along with other tweaks, to hopefully add a bit more clarity to that one newly-centralized copy in the docs.

Also add a thorough test that the implementation agrees with this definition.

Author:    Greg Price <gnprice@gmail.com>

Co-authored-by: Greg Price <gnprice@gmail.com>
2025-02-14 18:16:47 +01:00
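The "printable" rule that the commit above centralizes can be checked from Python. This is a minimal sketch, assuming the Python-level rule is "not in an 'Other' (C*) or 'Separator' (Z*) general category, except the ASCII space". The helper names `is_printable_by_category` and `find_mismatches` are made up for illustration and are not part of CPython.

```python
import unicodedata


def is_printable_by_category(char: str) -> bool:
    """Hypothetical helper: restate the documented rule via general categories."""
    category = unicodedata.category(char)
    # "Other" categories start with "C" (Cc, Cf, Cs, Co, Cn);
    # "Separator" categories start with "Z" (Zs, Zl, Zp).
    # The ASCII space (category Zs) is the one exception and counts as printable.
    return category[0] not in ("C", "Z") or char == " "


def find_mismatches(code_points):
    """Return the code points where str.isprintable() disagrees with the rule."""
    return [
        cp for cp in code_points
        if chr(cp).isprintable() != is_printable_by_category(chr(cp))
    ]


if __name__ == "__main__":
    # Scan every code point; an empty list means the two definitions agree.
    print(find_mismatches(range(0x110000)))
```

Writing the rule in terms of the first letter of the category avoids having to enumerate the five "Other" and three "Separator" categories by hand.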
Yuki Kobayashi
8d9d3e4ecb
gh-46236: Document PyUnicode_DecodeCodePageStateful (GH-127934)
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
2025-02-10 17:17:37 +01:00
Peter Bierma
e792f4bc2e
Docs C API: Clarify what happens when null bytes are passed to PyUnicode_AsUTF8 (#127458)
Co-authored-by: Stan U. <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
2025-01-20 16:54:29 +01:00
Serhiy Storchaka
657d7b77e5
gh-90241: Clarify documentation for PyUnicode_FSConverter and PyUnicode_FSDecoder (GH-128451)
Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
2025-01-06 13:28:50 +01:00
Victor Stinner
1ef6e8ca3f
gh-119182: Complete PyUnicodeWriter documentation (#127607) 2024-12-05 10:37:14 +01:00
Hugo van Kemenade
8cdaca8b25
Python 3.14.0a1 2024-10-15 22:34:54 +03:00
Victor Stinner
1b2a5485f9
gh-125196: PyUnicodeWriter_Discard(NULL) does nothing (#125222) 2024-10-09 23:32:02 +00:00
Victor Stinner
a7f0727ca5
gh-124502: Add PyUnicode_Equal() function (#124504) 2024-10-07 21:24:53 +00:00
Victor Stinner
d8cf587dc7
doc: PyUnicode_AsUTF8String() fails if string contains surrogates (#124605) 2024-09-27 20:13:29 +00:00
Max Bachmann
b79a21ea42
GH-95079: document error behaviour for some unicode C APIs (#95080) 2024-09-27 12:35:55 +02:00
Petr Viktorin
7d24ea9db3
gh-121277: Allow .. versionadded:: next in docs (GH-121278)
Make ``versionchanged:: next`` expand to the current (unreleased) version.

When a new CPython release is cut, the release manager will replace
all such occurrences of "next" with the just-released version.
(See the issue for release-tools and devguide PRs.)

Co-authored-by: Adam Turner <9087854+AA-Turner@users.noreply.github.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
2024-09-25 23:30:40 +02:00
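The release-time replacement of "next" described in the commit above could look roughly like the following. This is only a sketch: the function `replace_next` and the directive list are hypothetical and are not taken from CPython's release tooling.

```python
import re

# Assumed set of directives that may carry "next"; the real tooling may differ.
DIRECTIVES = ("versionadded", "versionchanged", "deprecated")

# Match a directive line whose argument is exactly the placeholder "next".
_NEXT_RE = re.compile(
    r"^([ \t]*\.\. (?:" + "|".join(DIRECTIVES) + r")::[ \t]*)next\b",
    flags=re.MULTILINE,
)


def replace_next(rst_text: str, released_version: str) -> str:
    """Replace the "next" placeholder in version directives with a version."""
    # A callable replacement keeps the version string from being
    # interpreted as a regex replacement template.
    return _NEXT_RE.sub(lambda m: m.group(1) + released_version, rst_text)


if __name__ == "__main__":
    sample = ".. versionadded:: next\n\n   Text about next steps.\n"
    print(replace_next(sample, "3.14"))
```

Anchoring the pattern to the directive keeps ordinary uses of the word "next" in the surrounding prose untouched.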
Petr Viktorin
b4aedb23ae
gh-113993: Don't immortalize in PyUnicode_InternInPlace; keep immortalizing in other API (#121364)
* Switch PyUnicode_InternInPlace to _PyUnicode_InternMortal, clarify docs

* Document immortality in some functions that take `const char *`

This covers PyUnicode_InternFromString,
PyDict_SetItemString, PyObject_SetAttrString,
PyObject_DelAttrString,
and the PyModule_Add convenience functions.

Always point out a non-immortalizing alternative.

* Don't immortalize user-provided attr names in _ctypes
2024-07-16 15:36:21 +02:00
Victor Stinner
2e157851e3
gh-119182: Add PyUnicodeWriter_WriteUCS4() function (#120849) 2024-06-24 17:40:39 +02:00
Victor Stinner
4123226bbd
gh-119182: Add PyUnicodeWriter_DecodeUTF8Stateful() (#120639)
Add PyUnicodeWriter_WriteWideChar() and
PyUnicodeWriter_DecodeUTF8Stateful() functions.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2024-06-21 19:33:15 +02:00
Victor Stinner
5c4235cd8c
gh-119182: Add PyUnicodeWriter C API (#119184) 2024-06-17 17:10:52 +02:00
Serhiy Storchaka
24a2bd0481
gh-117642: Fix PEP 737 implementation (GH-117643)
* Fix implementation of %#T and %#N (they were implemented as %T# and
  %N#).
* Restore tests removed in gh-116417.
2024-04-08 16:27:25 +00:00
Victor Stinner
7bbb9b57e6
gh-111696, PEP 737: Add %T and %N to PyUnicode_FromFormat() (#116839) 2024-03-14 22:23:00 +00:00
qqwqqw689
5719aa23ab
gh-113437: Update documentation about PyUnicode_AsWideChar() function (GH-113455) 2024-02-13 15:23:10 +01:00
Rune Tynan
b31232ddf7
gh-62897: Update PyUnicode C API parameter names (GH-12680)
Standardize PyUnicode C API parameter names across the documentation.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-12-05 11:21:09 +02:00
Victor Stinner
11e83488c5
gh-111089: Revert PyUnicode_AsUTF8() changes (#111833)
* Revert "gh-111089: Use PyUnicode_AsUTF8() in Argument Clinic (#111585)"

This reverts commit d9b606b3d0.

* Revert "gh-111089: Use PyUnicode_AsUTF8() in getargs.c (#111620)"

This reverts commit cde1071b2a.

* Revert "gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091)"

This reverts commit d731579bfb.

* Revert "gh-111089: Add PyUnicode_AsUTF8() to the limited C API (#111121)"

This reverts commit d8f32be5b6.

* Revert "gh-111089: Use PyUnicode_AsUTF8() in sqlite3 (#111122)"

This reverts commit 37e4e20eaa.
2023-11-07 22:36:13 +00:00
Victor Stinner
f1e751e933
gh-111089: PyUnicode_AsUTF8AndSize() sets size on error (#111106)
On error, PyUnicode_AsUTF8AndSize() now sets the size argument to -1,
to avoid leaving it with an undefined value.
2023-10-20 20:03:11 +02:00
Victor Stinner
d731579bfb
gh-111089: PyUnicode_AsUTF8() now raises on embedded NUL (#111091)
* PyUnicode_AsUTF8() now raises an exception if the string contains
  embedded null characters.
* Update related C API tests (test_capi.test_unicode).
* type_new_set_doc() uses PyUnicode_AsUTF8AndSize() to silently
  truncate doc containing null bytes.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
2023-10-20 17:59:29 +02:00
Serhiy Storchaka
eb50cd37ea
gh-110289: C API: Add PyUnicode_EqualToUTF8() and PyUnicode_EqualToUTF8AndSize() functions (GH-110297) 2023-10-11 16:41:58 +03:00
Serhiy Storchaka
d7202e4879
gh-107298: Fix numerous ref errors and typos in the C API docs (GH-108258) 2023-08-22 15:50:30 +03:00
Eric Snow
5dc825d504
gh-98154: Clarify Usage of "Reference Count" In the Docs (gh-107552)
PEP 683 (immortal objects) revealed some ways in which the Python documentation has been unnecessarily coupled to the implementation details of reference counts. In the end, users should focus on reference ownership, including taking references and releasing them, rather than on how many references an object has.

This change updates the documentation to reflect that perspective.  It also updates the docs relative to immortal objects in a handful of places.
2023-08-07 15:40:59 -06:00
Victor Stinner
8d61a71f9c
gh-107298: Fix more Sphinx warnings in the C API doc (#107329)
Declare the following functions as macros, since they are actually
macros. It avoids a warning on "TYPE" or "macro" argument.

* PyMem_New()
* PyMem_Resize()
* PyModule_AddIntMacro()
* PyModule_AddStringMacro()
* PyObject_GC_New()
* PyObject_GC_NewVar()
* PyObject_New()
* PyObject_NewVar()

Add standard C types to nitpick_ignore in Doc/conf.py:

* int64_t
* uint64_t
* uintptr_t

No longer ignore the nonexistent "__int" type in nitpick_ignore.

Update Doc/tools/.nitignore
2023-07-27 00:52:40 +00:00
Victor Stinner
391e03fa05
gh-107298: Fix Sphinx warnings in the C API doc (#107302)
* Update Doc/tools/.nitignore
* Fix BufferedIOBase.write() link in buffer.rst
2023-07-27 01:41:15 +02:00
Victor Stinner
87b39028e5
gh-107298: Fix doc references to undocumented modules (#107300)
Also update Doc/tools/.nitignore.
2023-07-26 18:59:06 +02:00
Serhiy Storchaka
f8b7fe2f26
gh-106948: Add standard external names to nitpick_ignore (GH-106949)
It includes standard C types, macros and variables like "size_t",
"LONG_MAX" and "errno", and standard environment variables like "PATH".
2023-07-22 21:35:22 +03:00
Serhiy Storchaka
fcc816dbff
gh-106919: Use role :c:macro: for referencing the C "constants" (GH-106920) 2023-07-21 10:52:07 +03:00
Victor Stinner
04181965cf
gh-105156: Update Unicode C API: remove deprecation (#105379)
_PyUnicode_ToLowercase(), _PyUnicode_ToUppercase() and
_PyUnicode_ToTitlecase() are no longer marked as deprecated in the
documentation. The deprecation is no longer needed since they now use
the Py_UCS4 type rather than the deprecated Py_UNICODE type.
2023-06-06 16:42:49 +02:00
Victor Stinner
bae415ad02
gh-102304: doc: Add links to Stable ABI and Limited C API (#105345)
* Add "limited-c-api" and "stable-api" references.
* Rename "stable-abi-list" reference to "limited-api-list".
* Makefile: Document files regenerated by "make regen-limited-abi"
* Remove first empty line in generated files:

  - Lib/test/test_stable_abi_ctypes.py
  - PC/python3dll.c
2023-06-06 08:40:32 +00:00
Victor Stinner
8ed705c083
gh-105156: Deprecate the old Py_UNICODE type in C API (#105157)
Deprecate the old Py_UNICODE and PY_UNICODE_TYPE types in the C API:
use wchar_t instead.

Replace Py_UNICODE with wchar_t in multiple C files.

Co-authored-by: Inada Naoki <songofacandy@gmail.com>
2023-06-01 08:56:35 +02:00
Serhiy Storchaka
f3466bc040
gh-98836: Extend PyUnicode_FromFormat() (GH-98838)
* Support for conversion specifiers o (octal) and X (uppercase hexadecimal).
* Support for length modifiers j (intmax_t) and t (ptrdiff_t).
* Length modifiers are now applied to all integer conversions.
* Support for wchar_t C strings (%ls and %lV).
* Support for variable width and precision (*).
* Support for flag - (left alignment).
2023-05-22 00:32:39 +03:00
Inada Naoki
ce2383ec66
gh-103883: Doc: Move PyUnicode_FromObject doc (#103913)
This API is one of the Unicode creator APIs.
2023-04-27 14:53:11 +09:00
Adam Turner
0031e62973
gh-93738: Documentation C syntax (:c:type:<C type> -> :c:expr:<C type>) (#97768)
:c:type:`<C type>` -> :c:expr:`<C type>`

Co-authored-by: Łukasz Langa <lukasz@langa.pl>
2022-10-05 11:01:14 -07:00
Adam Turner
9ebc50866b
gh-93738: Documentation C syntax (:c:type:PyBytesObject* -> :c:expr:PyBytesObject*) (#97782)
:c:type:`PyBytesObject*` -> :c:expr:`PyBytesObject*`
2022-10-04 16:11:34 -07:00
Adam Turner
898834e27b
gh-93738: Documentation C syntax (:c:type:PyUnicodeObject* -> :c:expr:PyUnicodeObject*) (#97783)
:c:type:`PyUnicodeObject*` -> :c:expr:`PyUnicodeObject*`
2022-10-04 16:11:20 -07:00
Serhiy Storchaka
62f06508e7
gh-95781: More strict format string checking in PyUnicode_FromFormatV() (GH-95784)
An unrecognized format character in PyUnicode_FromFormat() and
PyUnicode_FromFormatV() now sets a SystemError.
In previous versions it caused all the rest of the format string to be
copied as-is to the result string, and any extra arguments discarded.
2022-08-08 19:21:07 +03:00
Pamela Fox
70068b9336
Fix Unicode doc and replace use of macro with PyMem_New function (GH-94088) 2022-07-28 23:32:16 +01:00
Victor Stinner
71d8775fee
gh-93202: Always use %zd printf formatter (#93201)
Python now always uses the ``%zu`` and ``%zd`` printf formats to
format a size_t or Py_ssize_t number. Building Python 3.12 requires a
C11 compiler, so these printf formats are now always supported.

* PyObject_Print() and _PyObject_Dump() now use the printf %zd format
  to display an object reference count.
* Update PY_FORMAT_SIZE_T comment.
* Remove outdated notes about the %zd format in the PyBytes_FromFormat()
  and PyUnicode_FromFormat() documentation.
* configure no longer checks for the %zd format and no longer defines
  PY_FORMAT_SIZE_T macro in pyconfig.h.
* pymacconfig.h no longer undefines PY_FORMAT_SIZE_T: macOS 10.4 is
  no longer supported. Python 3.12 now requires macOS 10.6 (Snow
  Leopard) or newer.
2022-05-25 14:21:36 +02:00
Victor Stinner
fc00667247
gh-93103: Update PyUnicode_DecodeFSDefault() doc (#93105)
Update documentation of PyUnicode_DecodeFSDefault(),
PyUnicode_DecodeFSDefaultAndSize() and PyUnicode_EncodeFSDefault():
they now use the filesystem encoding and error handler of PyConfig.
The Py_FileSystemDefaultEncoding and Py_FileSystemDefaultEncodeErrors
variables are no longer used.
2022-05-23 14:56:59 +02:00
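From Python, the filesystem encoding and error handler can be inspected with sys.getfilesystemencoding() and sys.getfilesystemencodeerrors(), and os.fsencode() and os.fsdecode() apply that same pair. The sketch below illustrates this at the Python level only; the helper names `filesystem_codec` and `roundtrip_filename` are illustrative and not part of any API.

```python
import os
import sys


def filesystem_codec() -> tuple[str, str]:
    """Return the (encoding, error handler) pair used for filenames."""
    return sys.getfilesystemencoding(), sys.getfilesystemencodeerrors()


def roundtrip_filename(name: str) -> str:
    """Encode a filename to bytes and decode it back with the filesystem codec."""
    encoded = os.fsencode(name)   # str -> bytes
    return os.fsdecode(encoded)   # bytes -> str


if __name__ == "__main__":
    encoding, errors = filesystem_codec()
    print(f"filesystem encoding: {encoding}, error handler: {errors}")
    print(roundtrip_filename("data.txt"))
```

The reported values depend on the platform and on the interpreter configuration.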
Julien Palard
664aa94b57
Document Py_ssize_t. (GH-92512)
It fixes 252 errors from a Sphinx nitpicky run (sphinx-build -n), but
there are 8182 errors left.

Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com>
2022-05-13 14:10:16 +02:00
Inada Naoki
e371d5d5d1
gh-92536: Doc update about Py_UNICODE removal (GH-92756) 2022-05-13 13:15:41 +09:00
Inada Naoki
f9c9354a7a
gh-92536: PEP 623: Remove wstr and legacy APIs from Unicode (GH-92537) 2022-05-12 14:48:38 +09:00
Victor Stinner
1a9645f537
gh-89653: PEP 670: Fix Sphinx syntax in Unicode doc (#92707) 2022-05-12 03:38:49 +02:00
Victor Stinner
d0c9353a79
gh-89653: PEP 670: unicodeobject.h uses _Py_CAST() (#92696)
Use _Py_CAST() and _Py_STATIC_CAST() in macros wrapping static inline
functions of unicodeobject.h.

Also change the kind type from unsigned int to int: the same parameter
type as PyUnicode_FromKindAndData().

The limited API version 3.11 no longer casts arguments to expected
types.
2022-05-12 01:35:41 +02:00
Victor Stinner
92f0ed1d90
gh-89653: PEP 670: Update C API unicode documentation (#92702) 2022-05-12 01:33:52 +02:00