The cases generator inserts code to save and restore the stack pointer around statements that contain escaping calls. To find the beginning of such statements, we would walk backwards from the escaping call until we encountered a token that was treated as a statement terminator. This set of terminators should include preprocessor directives.
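The backwards walk described above can be illustrated with a minimal sketch. It is not the generator's actual code: the regex tokenizer, the `TERMINATORS` set, and the helper names `is_terminator` and `statement_start` are all hypothetical and exist only for this example.

```python
import re

# Hypothetical, simplified tokenizer: a whole preprocessor line is one token.
# The real generator uses its own lexer; this is only for illustration.
TOKEN_RE = re.compile(r"#[^\n]*|[A-Za-z_]\w*|\d+|[^\s\w]")

# Tokens assumed (for this sketch) to end the previous statement.
TERMINATORS = {";", "{", "}"}


def is_terminator(tok: str) -> bool:
    # Preprocessor directives such as "#ifdef Py_DEBUG" count as terminators too.
    return tok in TERMINATORS or tok.startswith("#")


def statement_start(tokens: list[str], call_index: int) -> int:
    """Return the index of the first token of the statement containing call_index."""
    i = call_index
    # Walk backwards until the token just before us terminates a statement.
    while i > 0 and not is_terminator(tokens[i - 1]):
        i -= 1
    return i


source = """
x = 1;
#ifdef Py_DEBUG
res = escaping_call(x);
#endif
"""
tokens = TOKEN_RE.findall(source)
start = statement_start(tokens, tokens.index("escaping_call"))
print(tokens[start])
```

Without the `tok.startswith("#")` check, the walk in this example would continue past the `#ifdef` line and stop only at the `;` that ends `x = 1`, so the save/restore code would be placed around the directive as well.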
Tooling to generate interpreters
Documentation for the instruction definitions in Python/bytecodes.c
("the DSL") is in interpreter_definition.md.
What's currently here:
- analyzer.py: code for converting the AST generated by Parser to a more high-level structure for easier interaction
- lexer.py: lexer for C, originally written by Mark Shannon
- plexer.py: OO interface on top of lexer.py; main class: PLexer
- parsing.py: parser for the instruction definition DSL; main class: Parser
- parser.py: helper for interactions with parsing.py
- tierN_generator.py: a couple of driver scripts to read Python/bytecodes.c and write Python/generated_cases.c.h (and several other files)
- optimizer_generator.py: reads Python/bytecodes.c and Python/optimizer_bytecodes.c and writes Python/optimizer_cases.c.h
- stack.py: code to handle generalized stack effects
- cwriter.py: code which understands tokens and how to format C code; main class: CWriter
- generators_common.py: helpers for generators
- opcode_id_generator.py: generates a list of opcodes and writes them to Include/opcode_ids.h
- opcode_metadata_generator.py: reads the instruction definitions and writes the metadata to Include/internal/pycore_opcode_metadata.h
- py_metadata_generator.py: reads the instruction definitions and writes the metadata to Lib/_opcode_metadata.py
- target_generator.py: generates targets for computed goto dispatch and writes them to Python/opcode_targets.h
- uop_id_generator.py: generates a list of uop IDs and writes them to Include/internal/pycore_uop_ids.h
- uop_metadata_generator.py: reads the instruction definitions and writes the metadata to Include/internal/pycore_uop_metadata.h
Note that there is some dummy C code at the top and bottom of
Python/bytecodes.c
to fool text editors like VS Code into believing this is valid C code.
A bit about the parser
The parser class uses a pretty standard recursive descent scheme,
but with unlimited backtracking.
The PLexer class tokenizes the entire input before parsing starts.
We do not run the C preprocessor.
Each parsing method returns either an AST node (a Node instance)
or None, or raises SyntaxError (showing the error in the C source).
A SyntaxError is unrecoverable: it aborts the parse instead of triggering backtracking.
Most parsing methods are decorated with @contextual, which automatically
resets the tokenizer input position when None is returned.
When a parsing method returns None, it is possible that after backtracking
a different parsing method returns a valid AST.
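The scheme above can be sketched in a few lines. This is a simplified illustration, not the code in parsing.py: the `MiniParser` class, its toy grammar, and this `contextual` decorator are hypothetical stand-ins. They show only the three outcomes of a parsing method (a node, None with the position rewound, or SyntaxError).

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    kind: str
    text: str


def contextual(func: Callable[["MiniParser"], Optional[Node]]):
    """Rewind the token position whenever the wrapped method returns None."""
    def wrapper(self: "MiniParser") -> Optional[Node]:
        begin = self.pos
        result = func(self)
        if result is None:
            self.pos = begin  # backtrack so another alternative can be tried
        return result
    return wrapper


class MiniParser:
    def __init__(self, source: str) -> None:
        # Tokenize the entire input up front, before any parsing happens.
        self.tokens = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source)
        self.pos = 0

    def next(self) -> Optional[str]:
        if self.pos < len(self.tokens):
            tok = self.tokens[self.pos]
            self.pos += 1
            return tok
        return None

    @contextual
    def call(self) -> Optional[Node]:
        # call: NAME '(' ')'
        name = self.next()
        if name and name.isidentifier() and self.next() == "(":
            if self.next() != ")":
                # Committed to a call: a missing ')' is a hard error.
                raise SyntaxError("expected ')'")
            return Node("call", name)
        return None

    @contextual
    def name(self) -> Optional[Node]:
        tok = self.next()
        if tok and tok.isidentifier():
            return Node("name", tok)
        return None

    def expr(self) -> Optional[Node]:
        # Try alternatives in order; a None result has already been rewound.
        return self.call() or self.name()


print(MiniParser("foo()").expr())
print(MiniParser("foo").expr())
```

For the input `foo`, `call` consumes the name, fails to find `(`, and returns None. The decorator rewinds the position, so `name` starts again from the first token and succeeds.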
Neither the lexer nor the parsers are complete or fully correct.
Most known issues are tersely indicated by # TODO: comments.
We plan to fix issues as they become relevant.