mirror of
https://github.com/erg-lang/erg.git
synced 2025-07-24 05:26:24 +00:00
Architecture of ergc
1. Scan an Erg script (.er) and generate a TokenStream
src: erg_parser/lex.rs
- parser/lexer/Lexer generates TokenStream (this is an iterator of Token, TokenStream can be generated by Lexer::collect())
- Lexer is constructed from Lexer::new or Lexer::from_str, where Lexer::new reads the code from a file or command option.
- Lexer can generate tokens sequentially as an iterator; if you want to get a TokenStream all at once, use Lexer::lex.
- Lexer outputs LexErrors as errors, but LexError does not have enough information to display itself. If you want to display the error, use the LexerRunner to convert the error.
- LexerRunner can also be used if you want to use Lexer as standalone; Lexer is just an iterator and does not implement the Runnable trait.
- Runnable is implemented by LexerRunner, ParserRunner, Compiler, and DummyVM.
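The iterator-style design described above can be sketched as follows. This is a minimal toy, not the code in erg_parser/lex.rs: the Token variants, the LexError fields, and the consume_while helper are hypothetical, and only the names Lexer, from_str, and lex are borrowed from the text. It shows a lexer that yields tokens one at a time and can also collect them all at once.

```rust
// Simplified, hypothetical types for illustration; not the erg_parser definitions.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Ident(String),
    Number(String),
    Symbol(char),
}

#[derive(Debug, Clone, PartialEq)]
pub struct LexError {
    pub pos: usize, // character offset of the offending input
    pub ch: char,
}

pub struct Lexer {
    chars: Vec<char>,
    pos: usize,
}

impl Lexer {
    pub fn from_str(src: &str) -> Self {
        Lexer { chars: src.chars().collect(), pos: 0 }
    }

    /// Collect every token at once; the first error aborts the collection.
    pub fn lex(self) -> Result<Vec<Token>, LexError> {
        self.collect()
    }

    // Advance while `pred` holds and return the text consumed since `start`.
    fn consume_while(&mut self, start: usize, pred: fn(char) -> bool) -> String {
        while self.pos < self.chars.len() && pred(self.chars[self.pos]) {
            self.pos += 1;
        }
        self.chars[start..self.pos].iter().collect()
    }
}

impl Iterator for Lexer {
    type Item = Result<Token, LexError>;

    fn next(&mut self) -> Option<Self::Item> {
        // Skip whitespace between tokens.
        while self.pos < self.chars.len() && self.chars[self.pos].is_whitespace() {
            self.pos += 1;
        }
        let start = self.pos;
        let c = *self.chars.get(start)?; // end of input ends the iteration
        self.pos += 1;
        let item = if c.is_alphabetic() || c == '_' {
            Ok(Token::Ident(self.consume_while(start, |c| c.is_alphanumeric() || c == '_')))
        } else if c.is_ascii_digit() {
            Ok(Token::Number(self.consume_while(start, |c| c.is_ascii_digit())))
        } else if "=+-*/().".contains(c) {
            Ok(Token::Symbol(c))
        } else {
            Err(LexError { pos: start, ch: c })
        };
        Some(item)
    }
}
```

Because the error here carries only a position and a character, a caller that wants a readable message still needs the source text. That is roughly the role the text assigns to LexerRunner.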
2. Convert TokenStream -> AST
src: erg_parser/parse.rs
- Parser, like Lexer, has two constructors, Parser::new and Parser::from_str, and Parser::parse will give the AST.
- AST is the wrapper type of Vec<Expr>. The name stands for "Abstract Syntax Tree".
2.1 Desugaring AST
- expand nested vars (Desugarer::desugar_nest_vars_pattern)
- desugar multiple pattern definition syntax (Desugarer::desugar_multiple_pattern_def)
2.2 Reordering & Linking AST
- link class methods to class definitions
  - method definitions are allowed outside of the class definition file
  - the current implementation is incomplete and only links methods defined in the same file
3. Convert AST -> HIR
(main) src: erg_compiler/lower.rs
3.1 Name Resolution
In the current implementation it is done during type checking.
- All ASTs (including imported modules) are scanned for name resolution before type inference.
- In addition to performing cycle checking and reordering, a context is created for type inference (however, most of the information on variables registered in this context is not yet finalized).
3.2 import resolution
- When import is called, a new thread is created for analysis. JoinHandle is stored in SharedCompilerResource and is joined when the module is needed.
- Unused modules may not be joined, but currently all such modules are also analyzed.
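The spawn-now, join-later pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the SharedCompilerResource implementation: ModuleCache, ModuleInfo, analyze, and pending_count are hypothetical names, and the sketch uses &mut self instead of real cross-thread sharing to stay short.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the result of analyzing one module.
#[derive(Debug, Clone, PartialEq)]
pub struct ModuleInfo {
    pub name: String,
    pub exported: Vec<String>,
}

/// Hypothetical stand-in for the resource that stores pending analyses.
#[derive(Default)]
pub struct ModuleCache {
    pending: HashMap<String, JoinHandle<ModuleInfo>>,
    done: HashMap<String, ModuleInfo>,
}

impl ModuleCache {
    /// Called when an import is encountered: start the analysis on a new
    /// thread and keep only its JoinHandle.
    pub fn import(&mut self, name: &str) {
        if self.pending.contains_key(name) || self.done.contains_key(name) {
            return; // already requested
        }
        let owned = name.to_string();
        let handle = thread::spawn(move || analyze(&owned));
        self.pending.insert(name.to_string(), handle);
    }

    /// Called when the module is actually needed: join the thread if the
    /// analysis has not been collected yet.
    pub fn get(&mut self, name: &str) -> Option<ModuleInfo> {
        if let Some(handle) = self.pending.remove(name) {
            let info = handle.join().expect("analysis thread panicked");
            self.done.insert(name.to_string(), info);
        }
        self.done.get(name).cloned()
    }

    /// Number of analyses that were started but not yet joined.
    pub fn pending_count(&self) -> usize {
        self.pending.len()
    }
}

// Placeholder for the real analysis work.
fn analyze(name: &str) -> ModuleInfo {
    ModuleInfo {
        name: name.to_string(),
        exported: vec![format!("{}.main", name)],
    }
}
```

In this sketch a module that is imported but never requested through get stays in pending and is never joined, which mirrors the remark about unused modules.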
3.3 Type checking & inference
- HIR has every variable's type information. The name stands for "High-level Intermediate Representation".
- ASTLowerer can be constructed in the same way as Parser and Lexer.
- ASTLowerer::lower will output either CompleteArtifact or IncompleteArtifact. Both have HIR and LowerWarnings. IncompleteArtifact also has LowerErrors.
- ASTLowerer is owned by Compiler. Unlike other structures (Lexer, Parser, etc.), ASTLowerer handles code contexts and is not discarded after a single use.
- For type inference algorithms, see inference.md.
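The two-artifact result and the persistent context can be illustrated with a toy lowerer. Everything below is a simplified assumption rather than the erg_compiler API: the "AST" is a list of (name, right-hand side) pairs, types are plain strings, and Lowerer and TypedDef are hypothetical names. Only the names CompleteArtifact and IncompleteArtifact come from the text.

```rust
use std::collections::HashMap;

/// Hypothetical HIR entry: a definition together with its inferred type.
#[derive(Debug, Clone, PartialEq)]
pub struct TypedDef {
    pub name: String,
    pub ty: String,
}

#[derive(Debug, PartialEq)]
pub struct CompleteArtifact {
    pub hir: Vec<TypedDef>,
    pub warnings: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub struct IncompleteArtifact {
    pub hir: Vec<TypedDef>,
    pub errors: Vec<String>,
    pub warnings: Vec<String>,
}

/// Toy lowerer that keeps its context (name -> type) between calls.
#[derive(Default)]
pub struct Lowerer {
    context: HashMap<String, String>,
}

impl Lowerer {
    pub fn lower(&mut self, ast: &[(&str, &str)]) -> Result<CompleteArtifact, IncompleteArtifact> {
        let mut hir = Vec::new();
        let mut warnings = Vec::new();
        let mut errors = Vec::new();
        for &(name, rhs) in ast {
            // Infer a type from an integer literal, a string literal,
            // or a name already present in the context.
            let ty = if rhs.parse::<i64>().is_ok() {
                "Int".to_string()
            } else if rhs.len() >= 2 && rhs.starts_with('"') && rhs.ends_with('"') {
                "Str".to_string()
            } else if let Some(known) = self.context.get(rhs) {
                known.clone()
            } else {
                errors.push(format!("undefined name: {}", rhs));
                continue;
            };
            if self.context.insert(name.to_string(), ty.clone()).is_some() {
                warnings.push(format!("{} is redefined", name));
            }
            hir.push(TypedDef { name: name.to_string(), ty });
        }
        if errors.is_empty() {
            Ok(CompleteArtifact { hir, warnings })
        } else {
            // A partial HIR is still returned alongside the errors.
            Err(IncompleteArtifact { hir, errors, warnings })
        }
    }
}
```

Calling lower twice on the same Lowerer lets the second call resolve names defined in the first. That is the sense in which such a structure is reused rather than thrown away after one run.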
4. Check side-effects
src: erg_compiler/effectcheck.rs
5. Check ownerships
src: erg_compiler/ownercheck.rs
6. Optimize HIR
- Eliminate dead code (unused variables, imports, etc.)
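As a rough illustration of removing unused definitions, the sketch below keeps only the definitions reachable from a set of root names. It is not the optimizer in erg_compiler: Def, its uses field, and eliminate_dead_defs are hypothetical, and a worklist-based reachability pass is simply one common way to do this.

```rust
use std::collections::HashSet;

/// Hypothetical definition record: a name and the names it refers to.
#[derive(Debug, Clone, PartialEq)]
pub struct Def {
    pub name: String,
    pub uses: Vec<String>,
}

/// Keep only the definitions reachable from `roots`, preserving source order.
pub fn eliminate_dead_defs(defs: &[Def], roots: &[&str]) -> Vec<Def> {
    let mut live: HashSet<String> = roots.iter().map(|r| r.to_string()).collect();
    let mut worklist: Vec<String> = live.iter().cloned().collect();
    // Propagate liveness through the `uses` edges.
    while let Some(name) = worklist.pop() {
        if let Some(def) = defs.iter().find(|d| d.name == name) {
            for used in &def.uses {
                if live.insert(used.clone()) {
                    worklist.push(used.clone());
                }
            }
        }
    }
    defs.iter().filter(|d| live.contains(&d.name)).cloned().collect()
}
```

A definition that only refers to live names, but is not itself referenced from a root, is still dropped.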
7. Link
- Load all modules, resolve dependencies, and combine into a single HIR
8. Desugar HIR
src: erg_compiler/desugar_hir.rs
Convert parts that are not consistent with Python syntax
- Convert class member variables to functions