cpython

mirror of https://github.com/python/cpython.git synced 2025-07-07 19:35:27 +00:00

Author	SHA1	Message	Date
Pablo Galindo Salgado	ff2b5f40c2	gh-130077: Properly match full soft keywords in the parser (#135317 )	2025-06-10 14:19:03 +01:00
Bénédikt Tran	754e7c9b51	gh-133157: remove usage of `_Py_NO_SANITIZE_UNDEFINED` in `Parser/pegen.c` (#134048 )	2025-06-10 01:08:30 +01:00
TERESH1	d9b0b07098	gh-133516: Raise `ValueError` when constants `True`, `False` or `None` are used as an identifier after NFKC normalization (#133523 )	2025-05-07 19:11:25 +01:00
Pablo Galindo Salgado	bf3a0a1c0f	gh-132449: Improve syntax error messages for keywords with typos (#132450 ) Signed-off-by: Pablo Galindo <pablogsal@gmail.com> Co-authored-by: Łukasz Langa <lukasz@langa.pl>	2025-04-22 11:01:55 +02:00
Victor Stinner	3796884528	gh-111178: Skip undefined behavior checks in _PyPegen_lookahead() (#131714 ) For example, expression_rule() return type is 'expr_ty', whereas _PyPegen_lookahead() uses 'void*'.	2025-03-27 10:03:58 +01:00
rialbat	2c686a9ac2	gh-131762: Fixed dereferencing the pointer 'parser_token->metadata' with a NULL value (#131764 )	2025-03-26 18:44:56 +00:00
Mark Shannon	a1aeec61c4	GH-131238: Core header refactor (GH-131250) * Moves most structs in pycore_ header files into pycore_structs.h and pycore_runtime_structs.h * Removes many cross-header dependencies	2025-03-17 09:19:04 +00:00
Sergey Miryanov	3a7f17c7e2	gh-130790: Remove references about unicode's readiness from comments (#130801 )	2025-03-03 19:18:09 +00:00
Victor Stinner	3bebe46d34	gh-128911: Add PyImport_ImportModuleAttr() function (#128912 ) Add PyImport_ImportModuleAttr() and PyImport_ImportModuleAttrString() functions. * Add unit tests. * Replace _PyImport_GetModuleAttr() with PyImport_ImportModuleAttr(). * Replace _PyImport_GetModuleAttrString() with PyImport_ImportModuleAttrString(). * Remove "pycore_import.h" includes, no longer needed.	2025-01-30 11:17:29 +00:00
Victor Stinner	3aff1d0260	gh-124064: Fix -Wconversion warnings in Parser/pegen.c (#124181 )	2024-09-17 15:58:43 +00:00
Lysandros Nikolaou	ce0d66c8d2	gh-122581: Avoid data races when collecting parser statistics (#122694 )	2024-08-06 13:29:57 +02:00
Serhiy Storchaka	6c09b8de5c	gh-122270: Fix typos in the Py_DEBUG macro name (GH-122271)	2024-07-25 14:04:22 +03:00
Pablo Galindo Salgado	ac61d58db0	gh-119521: Rename IncompleteInputError to _IncompleteInputError and remove from public API/ABI (GH-119680) Signed-off-by: Pablo Galindo <pablogsal@gmail.com> Co-authored-by: Petr Viktorin <encukou@gmail.com>	2024-06-24 14:08:12 +02:00
Petr Viktorin	6f1d448bc1	gh-113993: Allow interned strings to be mortal, and fix related issues (GH-120520) * Add an InternalDocs file describing how interning should work and how to use it. * Add internal functions to explicitly request what kind of interning is done: - `_PyUnicode_InternMortal` - `_PyUnicode_InternImmortal` - `_PyUnicode_InternStatic` * Switch uses of `PyUnicode_InternInPlace` to those. * Disallow using `_Py_SetImmortal` on strings directly. You should use `_PyUnicode_InternImmortal` instead: - Strings should be interned before immortalization, otherwise you're possibly interning a immortalizing copy. - `_Py_SetImmortal` doesn't handle the `SSTATE_INTERNED_MORTAL` to `SSTATE_INTERNED_IMMORTAL` update, and those flags can't be changed in backports, as they are now part of public API and version-specific ABI. * Add private `_only_immortal` argument for `sys.getunicodeinternedsize`, used in refleak test machinery. * Make sure the statically allocated string singletons are unique. This means these sets are now disjoint: - `_Py_ID` - `_Py_STR` (including the empty string) - one-character latin-1 singletons Now, when you intern a singleton, that exact singleton will be interned. * Add a `_Py_LATIN1_CHR` macro, use it instead of `_Py_ID`/`_Py_STR` for one-character latin-1 singletons everywhere (including Clinic). * Intern `_Py_STR` singletons at startup. * For free-threaded builds, intern `_Py_LATIN1_CHR` singletons at startup. * Beef up the tests. Cover internal details (marked with `@cpython_only`). * Add lots of assertions Co-Authored-By: Eric Snow <ericsnowcurrently@gmail.com>	2024-06-21 17:19:31 +02:00
Lysandros Nikolaou	d87b015106	gh-119118: Fix performance regression in tokenize module (#119615 ) * gh-119118: Fix performance regression in tokenize module - Cache line object to avoid creating a Unicode object for all of the tokens in the same line. - Speed up byte offset to column offset conversion by using the smallest buffer possible to measure the difference. Co-authored-by: Pablo Galindo <pablogsal@gmail.com>	2024-05-28 19:17:49 +00:00
Pablo Galindo Salgado	39d102c2ee	gh-113744: Add a new IncompleteInputError exception to improve incomplete input detection in the codeop module (#113745 ) Signed-off-by: Pablo Galindo <pablogsal@gmail.com>	2024-01-30 16:21:30 +00:00
Pablo Galindo Salgado	a135a6d2c6	gh-112943: Correctly compute end offsets for multiline tokens in the tokenize module (#112949 )	2023-12-11 11:44:22 +00:00
Pablo Galindo Salgado	e1d8c65e1d	gh-110805: Allow the repl to show source code and complete tracebacks (#110775 )	2023-10-13 09:25:37 +00:00
Lysandros Nikolaou	01481f2dc1	gh-104169: Refactor tokenizer into lexer and wrappers (#110684 ) * The lexer, which include the actual lexeme producing logic, goes into the `lexer` directory. * The wrappers, one wrapper per input mode (file, string, utf-8, and readline), go into the `tokenizer` directory and include logic for creating a lexer instance and managing the buffer for different modes. --------- Co-authored-by: Pablo Galindo <pablogsal@gmail.com> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>	2023-10-11 15:14:44 +00:00
Pablo Galindo Salgado	da8f87b7ea	gh-107015: Remove async_hacks from the tokenizer (#107018 )	2023-07-26 16:34:15 +01:00
Victor Stinner	8c5f74fc89	gh-106023: Update code using _PyObject_FastCall() (#106257 ) Replace _PyObject_FastCall() calls with PyObject_Vectorcall().	2023-06-30 01:05:01 +00:00
Marta Gómez Macías	96fff35325	gh-105017: Include CRLF lines in strings and column numbers (#105030 ) Co-authored-by: Pablo Galindo <pablogsal@gmail.com>	2023-05-28 15:15:53 +01:00
Marta Gómez Macías	6715f91edc	gh-102856: Python tokenizer implementation for PEP 701 (#104323 ) This commit replaces the Python implementation of the tokenize module with an implementation that reuses the real C tokenizer via a private extension module. The tokenize module now implements a compatibility layer that transforms tokens from the C tokenizer into Python tokenize tokens for backward compatibility. As the C tokenizer does not emit some tokens that the Python tokenizer provides (such as comments and non-semantic newlines), a new special mode has been added to the C tokenizer mode that currently is only used via the extension module that exposes it to the Python layer. This new mode forces the C tokenizer to emit these new extra tokens and add the appropriate metadata that is needed to match the old Python implementation. Co-authored-by: Pablo Galindo <pablogsal@gmail.com>	2023-05-21 01:03:02 +01:00
Lysandros Nikolaou	9169a56fad	gh-103656: Transfer f-string buffers to parser to avoid use-after-free (GH-103896) Co-authored-by: Pablo Galindo <pablogsal@gmail.com>	2023-04-27 01:33:31 +00:00
Pablo Galindo Salgado	1ef61cf71a	gh-102856: Initial implementation of PEP 701 (#102855 ) Co-authored-by: Lysandros Nikolaou <lisandrosnik@gmail.com> Co-authored-by: Batuhan Taskaya <isidentical@gmail.com> Co-authored-by: Marta Gómez Macías <mgmacias@google.com> Co-authored-by: sunmy2019 <59365878+sunmy2019@users.noreply.github.com>	2023-04-19 11:18:16 -05:00
Chenxi Mao	7703def37e	GH-102711: Fix warnings found by clang (#102712 ) There are some warnings if build python via clang: Parser/pegen.c:812:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes] _PyPegen_clear_memo_statistics() ^ void Parser/pegen.c:820:29: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes] _PyPegen_get_memo_statistics() ^ void Fix it to make clang happy. Signed-off-by: Chenxi Mao <chenxi.mao@suse.com>	2023-03-28 10:52:22 +02:00
Mark Shannon	feec49c407	GH-101578: Normalize the current exception (GH-101607) * Make sure that the current exception is always normalized. * Remove redundant type and traceback fields for the current exception. * Add new API functions: PyErr_GetRaisedException, PyErr_SetRaisedException * Add new API functions: PyException_GetArgs, PyException_SetArgs	2023-02-08 09:31:12 +00:00
Eric Snow	91a8e002c2	gh-81057: Move More Globals to _PyRuntimeState (gh-100092) https://github.com/python/cpython/issues/81057	2022-12-07 15:56:31 -07:00
Victor Stinner	4ce2a202c7	gh-99300: Use Py_NewRef() in Parser/ directory (#99330 ) Replace Py_INCREF() with Py_NewRef() in C files of the Parser/ directory and in the PEG generator.	2022-11-10 15:30:05 +01:00
Lysandros Nikolaou	cbf0afd8a1	gh-97973: Return all necessary information from the tokenizer (GH-97984) Right now, the tokenizer only returns type and two pointers to the start and end of the token. This PR modifies the tokenizer to return the type and set all of the necessary information, so that the parser does not have to this.	2022-10-06 16:07:17 -07:00
Gregory P. Smith	511ca94520	gh-95778: CVE-2020-10735: Prevent DoS by very large int() (#96499 ) Integer to and from text conversions via CPython's bignum `int` type is not safe against denial of service attacks due to malicious input. Very large input strings with hundred thousands of digits can consume several CPU seconds. This PR comes fresh from a pile of work done in our private PSRT security response team repo. Signed-off-by: Christian Heimes [Red Hat] <christian@python.org> Tons-of-polishing-up-by: Gregory P. Smith [Google] <greg@krypto.org> Reviews via the private PSRT repo via many others (see the NEWS entry in the PR). <!-- gh-issue-number: gh-95778 --> * Issue: gh-95778 <!-- /gh-issue-number --> I wrote up [a one pager for the release managers](https://docs.google.com/document/d/1KjuF_aXlzPUxTK4BMgezGJ2Pn7uevfX7g0_mvgHlL7Y/edit#). Much of that text wound up in the Issue. Backports PRs already exist. See the issue for links.	2022-09-02 09:35:08 -07:00
Honglin Zhu	b946f529ef	gh-95355: Check tokens[0] after allocating memory (GH-95356) #95355 Automerge-Triggered-By: GH:pablogsal	2022-07-28 03:00:34 -07:00
Serhiy Storchaka	6fd4c8ec77	gh-93741: Add private C API _PyImport_GetModuleAttrString() (GH-93742) It combines PyImport_ImportModule() and PyObject_GetAttrString() and saves 4-6 lines of code on every use. Add also _PyImport_GetModuleAttr() which takes Python strings as arguments.	2022-06-14 07:15:26 +03:00
Victor Stinner	5115a16831	gh-93103: Parser uses PyConfig.parser_debug instead of Py_DebugFlag (#93106 ) * Replace deprecated Py_DebugFlag with PyConfig.parser_debug in the parser. * Add Parser.debug member. * Add tok_state.debug member. * Py_FrozenMain(): Replace Py_VerboseFlag with PyConfig.verbose.	2022-05-24 22:35:08 +02:00
Oleg Iarygin	a52f82baf2	bpo-46920: Remove disabled debug code added decades ago and likely unnecessary (GH-31812)	2022-03-14 17:03:21 +01:00
Pablo Galindo Salgado	e19059ecd8	Don't print rejected tokens when using the debug flags in the parser (GH-31258)	2022-02-10 14:38:27 +00:00
Pablo Galindo Salgado	390459de6d	Allow the parser to avoid nested processing of invalid rules (GH-31252)	2022-02-10 13:12:14 +00:00
Pablo Galindo Salgado	69e10976b2	bpo-46521: Fix codeop to use a new partial-input mode of the parser (GH-31010)	2022-02-08 11:54:37 +00:00
Pablo Galindo Salgado	6fa8b2ceee	bpo-46237: Fix the line number of tokenizer errors inside f-strings (GH-30463)	2022-01-08 00:23:40 +00:00
Pablo Galindo Salgado	dd6c35761a	bpo-46110: Restore commit `e9898bf153` This restores commit `e9898bf153`.	2022-01-03 19:54:06 +00:00
Pablo Galindo Salgado	9d35dedc5e	Revert "bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177)" (GH-30363) This reverts commit `e9898bf153` temporarily as we want to confirm if this commit is the cause of a slowdown at startup time.	2022-01-03 18:29:18 +00:00
Pablo Galindo Salgado	e9898bf153	bpo-46110: Add a recursion check to avoid stack overflow in the PEG parser (GH-30177) Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>	2021-12-20 15:43:26 +00:00
Kumar Aditya	41026c3155	bpo-45855: Replaced deprecated `PyImport_ImportModuleNoBlock` with PyImport_ImportModule (GH-30046)	2021-12-12 10:45:20 +02:00
Weipeng Hong	28179aac79	bpo-42918: Improve build-in function compile() in mode 'single' (GH-29934) Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>	2021-12-11 00:44:26 +01:00
Pablo Galindo Salgado	24c10d2943	bpo-45727: Only trigger the 'did you forgot a comma' error suggestion if inside parentheses (GH-29757)	2021-11-24 22:21:23 +00:00
Pablo Galindo Salgado	c9c4444d9f	Refactor parser compilation units into specific components (GH-29676)	2021-11-21 01:08:50 +00:00
Pablo Galindo Salgado	79ff0d1687	bpo-45494: Fix error location in EOF tokenizer errors (GH-29108)	2021-11-20 17:40:59 +00:00
Pablo Galindo Salgado	fdcc46d955	bpo-45848: Allow the parser to get error lines from encoded files (GH-29646)	2021-11-20 15:36:07 +01:00
Pablo Galindo Salgado	546cefcda7	bpo-45727: Make the syntax error for missing comma more consistent (GH-29427)	2021-11-19 23:11:57 +00:00
Pablo Galindo Salgado	da20d7401d	bpo-45822: Respect PEP 263's coding cookies in the parser even if flags are not provided (GH-29582)	2021-11-16 12:30:47 -08:00

118 commits