mirror of
https://github.com/erg-lang/erg.git
synced 2025-09-27 19:59:07 +00:00
Architecture of ergc
1. Scan an Erg script (.er) and generate a TokenStream
src: erg_parser/lex.rs
- parser/lexer/Lexer generates TokenStream (this is an iterator of Token; TokenStream can be generated by Lexer::collect())
  - Lexer is constructed from Lexer::new or Lexer::from_str, where Lexer::new reads the code from a file or command option.
  - Lexer can generate tokens sequentially as an iterator; if you want to get a TokenStream all at once, use Lexer::lex.
  - Lexer outputs LexErrors as errors, but LexError does not have enough information to display itself. If you want to display the error, use the LexerRunner to convert the error.
  - LexerRunner can also be used if you want to use Lexer as standalone; Lexer is just an iterator and does not implement the Runnable trait.
    - Runnable is implemented by LexerRunner, ParserRunner, Compiler, and DummyVM.
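The iterator-style design described above can be illustrated with a toy lexer. This is only a sketch: the Token, LexError, and Lexer types below are simplified stand-ins invented for this example, and they do not reproduce the real erg_parser API. It shows tokens being pulled one at a time through Iterator::next, and a lex method that collects everything at once and stops at the first error.

```rust
// Toy sketch with hypothetical types; this is not the real erg_parser API.

#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Ident(String),
    NatLit(String),
    Symbol(char),
}

/// Carries only a position and the offending character, so it cannot
/// render a full diagnostic by itself.
#[derive(Debug, Clone, PartialEq)]
pub struct LexError {
    pub pos: usize,
    pub ch: char,
}

pub struct Lexer {
    chars: Vec<char>,
    pos: usize,
}

impl Lexer {
    pub fn from_str(src: &str) -> Self {
        Lexer { chars: src.chars().collect(), pos: 0 }
    }

    /// Lex the whole input at once; the first error aborts the collection.
    pub fn lex(self) -> Result<Vec<Token>, LexError> {
        self.collect()
    }

    /// Consume characters while `pred` holds and return them as a string.
    fn eat_while(&mut self, pred: impl Fn(char) -> bool) -> String {
        let start = self.pos;
        while self.pos < self.chars.len() && pred(self.chars[self.pos]) {
            self.pos += 1;
        }
        self.chars[start..self.pos].iter().collect()
    }
}

impl Iterator for Lexer {
    type Item = Result<Token, LexError>;

    fn next(&mut self) -> Option<Self::Item> {
        // Skip whitespace, then stop if the input is exhausted.
        self.eat_while(char::is_whitespace);
        let c = *self.chars.get(self.pos)?;
        let token = if c.is_ascii_digit() {
            Token::NatLit(self.eat_while(|ch| ch.is_ascii_digit()))
        } else if c.is_alphabetic() || c == '_' {
            Token::Ident(self.eat_while(|ch| ch.is_alphanumeric() || ch == '_'))
        } else {
            self.pos += 1;
            if "=+-*/().,:".contains(c) {
                Token::Symbol(c)
            } else {
                return Some(Err(LexError { pos: self.pos - 1, ch: c }));
            }
        };
        Some(Ok(token))
    }
}
```

Calling Lexer::from_str("x = 42").lex() yields the three tokens in one step, while calling next() repeatedly yields them one by one. Collecting into a Result is a choice made for this sketch only.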
2. Convert TokenStream -> AST
src: erg_parser/parse.rs
- Parser, like Lexer, has two constructors, Parser::new and Parser::from_str, and Parser::parse returns the AST.
- AST is a wrapper type around Vec<Expr>. The name stands for "Abstract Syntax Tree".
2.1 Desugaring AST
- expand nested vars (Desugarer::desugar_nest_vars_pattern)
- desugar multiple pattern definition syntax (Desugarer::desugar_multiple_pattern_def)
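As a rough illustration of what desugaring a multiple pattern definition could look like, the sketch below merges consecutive definitions of the same name into a single definition with one match arm per pattern. The Expr and AST types here are hypothetical and heavily simplified (patterns and bodies are plain strings), and the output shape is an assumption made for this example, not a description of what Desugarer actually produces.

```rust
// Toy sketch; Expr, AST, and the output shape are assumptions for illustration.

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    /// `name pattern = body`, e.g. `f 0 = 1`
    Def { name: String, pattern: String, body: String },
    /// A single definition that dispatches over (pattern, body) arms.
    MatchDef { name: String, arms: Vec<(String, String)> },
}

/// Wrapper around Vec<Expr>, mirroring the description of AST in the text.
#[derive(Debug, Clone, PartialEq)]
pub struct AST(pub Vec<Expr>);

pub fn desugar_multiple_pattern_def(ast: AST) -> AST {
    let mut out: Vec<Expr> = Vec::new();
    for expr in ast.0 {
        match expr {
            Expr::Def { name, pattern, body } => {
                // If the previous item defines the same name, add an arm to it.
                if let Some(Expr::MatchDef { name: prev, arms }) = out.last_mut() {
                    if *prev == name {
                        arms.push((pattern, body));
                        continue;
                    }
                }
                // Otherwise start a new definition with a single arm.
                out.push(Expr::MatchDef { name, arms: vec![(pattern, body)] });
            }
            other => out.push(other),
        }
    }
    AST(out)
}
```

With this sketch, the two definitions `f 0 = 1` and `f n = n * f(n - 1)` become one MatchDef for `f` with two arms, tried in source order.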
2.2 Reordering & Linking AST
- link class methods to class definitions
- method definitions are allowed outside of the class definition file
- the current implementation is incomplete and only links methods defined in the same file
3. Convert AST -> HIR
(main) src: erg_compiler/lower.rs
3.1 Name Resolution
In the current implementation it is done during type checking.
- All ASTs (including imported modules) are scanned for name resolution before type inference.
- In addition to performing cycle checking and reordering, a context is created for type inference (however, most of the information on variables registered in this context is not yet finalized).
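Cycle checking and reordering of definitions can be pictured as a depth-first topological sort over a "refers to" graph. The sketch below is a generic version of that technique, not the compiler's own code: the reorder function, the CycleError type, and the representation of dependencies as a map of names are all assumptions made for this example.

```rust
use std::collections::HashMap;

// Toy sketch; `reorder` and `CycleError` are hypothetical names.

#[derive(Debug, PartialEq)]
pub struct CycleError(pub String);

enum State {
    Visiting,
    Done,
}

fn visit<'a>(
    name: &'a str,
    deps: &HashMap<&'a str, Vec<&'a str>>,
    state: &mut HashMap<&'a str, State>,
    order: &mut Vec<String>,
) -> Result<(), CycleError> {
    match state.get(name) {
        Some(State::Done) => return Ok(()),
        // Reaching a name that is still being visited means a cycle.
        Some(State::Visiting) => return Err(CycleError(name.to_string())),
        None => {}
    }
    state.insert(name, State::Visiting);
    if let Some(children) = deps.get(name) {
        for &child in children {
            visit(child, deps, state, order)?;
        }
    }
    state.insert(name, State::Done);
    // A name is emitted only after everything it refers to.
    order.push(name.to_string());
    Ok(())
}

/// `deps` maps each name to the names its definition refers to.
/// Returns the names ordered so that dependencies come first.
pub fn reorder<'a>(
    names: &[&'a str],
    deps: &HashMap<&'a str, Vec<&'a str>>,
) -> Result<Vec<String>, CycleError> {
    let mut state = HashMap::new();
    let mut order = Vec::new();
    for &name in names {
        visit(name, deps, &mut state, &mut order)?;
    }
    Ok(order)
}
```

For example, if `main` refers to `helper` and `helper` refers to `CONST`, the result lists `CONST` first and `main` last; adding a reference from `CONST` back to `main` makes the function return a CycleError.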
3.2 import resolution
- When import is called, a new thread is created for analysis.
- JoinHandle is stored in SharedCompilerResource and is joined when the module is needed.
- Unused modules may not be joined, but currently all such modules are also analyzed.
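The spawn-then-join-on-demand pattern described above can be sketched with the standard library alone. Everything below except std::thread::spawn and JoinHandle is hypothetical: SharedResource is a simplified stand-in for SharedCompilerResource, and analyze is a placeholder for the real module analysis.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

// Toy sketch; `ModuleInfo`, `SharedResource`, and `analyze` are hypothetical.

#[derive(Debug, Clone, PartialEq)]
pub struct ModuleInfo {
    pub name: String,
    pub exported: Vec<String>,
}

#[derive(Default)]
pub struct SharedResource {
    pending: HashMap<String, JoinHandle<ModuleInfo>>,
    finished: HashMap<String, ModuleInfo>,
}

impl SharedResource {
    /// Start analyzing a module on a new thread and keep its handle.
    pub fn import(&mut self, name: &str) {
        if self.pending.contains_key(name) || self.finished.contains_key(name) {
            return; // already imported
        }
        let owned = name.to_string();
        let handle = thread::spawn(move || analyze(owned));
        self.pending.insert(name.to_string(), handle);
    }

    /// Join the analysis thread only when the module is actually needed.
    pub fn get(&mut self, name: &str) -> Option<&ModuleInfo> {
        if let Some(handle) = self.pending.remove(name) {
            let info = handle.join().expect("analysis thread panicked");
            self.finished.insert(name.to_string(), info);
        }
        self.finished.get(name)
    }
}

/// Placeholder for the real analysis of a module.
fn analyze(name: String) -> ModuleInfo {
    let exported = vec![format!("{name}.main")];
    ModuleInfo { name, exported }
}
```

In this sketch a module that is imported but never requested through get is still analyzed on its thread, which matches the note that unused modules are currently analyzed as well.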
3.3 Type checking & inference
- HIR has every variable's type information. The name stands for "High-level Intermediate Representation".
- ASTLowerer can be constructed in the same way as Parser and Lexer.
- ASTLowerer::lower will output either CompleteArtifact or IncompleteArtifact. Both have HIR and LowerWarnings. IncompleteArtifact also has LowerErrors.
- ASTLowerer is owned by Compiler. Unlike other structures (Lexer, Parser, etc.), ASTLowerer handles code contexts and is not a one-time disposable.
- For type inference algorithms, see inference.md.
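The relationship between the two artifact types, and the fact that the lowerer keeps its context between calls, can be sketched as follows. This is a toy model under several assumptions: the struct fields, the use of Result to choose between the two artifacts, and the Lowerer type with its tiny "type inference" over (name, value) pairs are all invented for this example.

```rust
use std::collections::HashMap;

// Toy sketch; all types and fields here are simplified assumptions.

#[derive(Debug, Default, Clone, PartialEq)]
pub struct HIR {
    pub typed_exprs: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CompleteArtifact {
    pub hir: HIR,
    pub warns: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct IncompleteArtifact {
    pub hir: HIR,
    pub errors: Vec<String>,
    pub warns: Vec<String>,
}

/// Keeps a context of known variable types across calls to `lower`.
pub struct Lowerer {
    types: HashMap<String, String>,
}

impl Lowerer {
    pub fn new() -> Self {
        Lowerer { types: HashMap::new() }
    }

    /// `ast` is a list of `name = value` definitions; a value is either
    /// an integer literal or the name of an already known variable.
    pub fn lower(&mut self, ast: &[(&str, &str)]) -> Result<CompleteArtifact, IncompleteArtifact> {
        let mut hir = HIR::default();
        let mut errors = Vec::new();
        let mut warns = Vec::new();
        for &(name, value) in ast {
            let ty = if value.parse::<i64>().is_ok() {
                "Int".to_string()
            } else if let Some(known) = self.types.get(value) {
                known.clone()
            } else {
                errors.push(format!("cannot infer the type of `{value}`"));
                continue;
            };
            if self.types.insert(name.to_string(), ty.clone()).is_some() {
                warns.push(format!("`{name}` is redefined"));
            }
            hir.typed_exprs.push(format!("{name}: {ty} = {value}"));
        }
        if errors.is_empty() {
            Ok(CompleteArtifact { hir, warns })
        } else {
            // A partial HIR is still returned alongside the errors.
            Err(IncompleteArtifact { hir, errors, warns })
        }
    }
}
```

Because the context survives between calls, a second call to lower can refer to a variable defined in the first one, which is the property that separates this kind of structure from a disposable lexer or parser.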
4. Check side-effects
src: erg_compiler/effectcheck.rs
5. Check ownerships
src: erg_compiler/ownercheck.rs
6. Optimize HIR
- Eliminate dead code (unused variables, imports, etc.)
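A minimal picture of eliminating unused variables is a backward liveness pass over straight-line code. The sketch below is a generic version of that idea, assuming a hypothetical Stmt type and assuming that definitions have no side effects; it says nothing about how the actual optimizer is implemented.

```rust
use std::collections::HashSet;

// Toy sketch; `Stmt` is hypothetical and definitions are assumed to be pure.

#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    /// `name = <expr>`, where `uses` lists the names the expression reads.
    Def { name: String, uses: Vec<String> },
    /// A statement with an observable effect (e.g. a print call).
    Effect { uses: Vec<String> },
}

pub fn eliminate_dead_code(stmts: Vec<Stmt>) -> Vec<Stmt> {
    let mut live: HashSet<String> = HashSet::new();
    let mut kept = Vec::new();
    // Walk backwards so every use is seen before the definition it refers to.
    for stmt in stmts.into_iter().rev() {
        match &stmt {
            Stmt::Effect { uses } => live.extend(uses.iter().cloned()),
            Stmt::Def { name, uses } => {
                if !live.contains(name) {
                    continue; // nothing later reads this name: drop it
                }
                live.extend(uses.iter().cloned());
            }
        }
        kept.push(stmt);
    }
    kept.reverse();
    kept
}
```

Given definitions of `a`, `b` (reading `a`), and `unused` (reading `a`), followed by an effect that reads `b`, the pass keeps `a`, `b`, and the effect, and drops `unused`.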
7. Link
- Load all modules, resolve dependencies, and combine into a single HIR
8. Desugar HIR
src: erg_compiler/desugar_hir.rs
Convert parts that are not consistent with Python syntax
- Convert class member variables to functions