mirror of
https://github.com/erg-lang/erg.git
synced 2025-09-28 20:14:45 +00:00
# Architecture of `ergc`

## 1. Scan an Erg script (.er) and generate a `TokenStream`

src: [erg_parser/lex.rs](../../../crates/erg_parser/lex.rs)

* `Lexer` generates a `TokenStream`, a sequence of `Token`s; since `Lexer` is an iterator, a `TokenStream` can also be built with `Lexer::collect()`.
* `Lexer` is constructed with `Lexer::new` or `Lexer::from_str`; `Lexer::new` reads the code from a file or a command-line option.
* `Lexer` can generate tokens sequentially as an iterator; to get a `TokenStream` all at once, use `Lexer::lex`.
* `Lexer` reports errors as `LexError`s, but a `LexError` does not carry enough information to display itself. To display an error, use `LexerRunner` to convert it.
* `LexerRunner` can also be used to run `Lexer` standalone; `Lexer` is just an iterator and does not implement the `Runnable` trait.
* `Runnable` is implemented by `LexerRunner`, `ParserRunner`, `Compiler`, and `DummyVM`.
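
The iterator-style design described above can be illustrated with a minimal sketch. `ToyLexer`, `Token`, `LexError`, and `lex_all` below are hypothetical stand-ins written for this example, not the real `erg_parser` types or signatures; the sketch only shows how a lexer that implements `Iterator` can be consumed token by token or collected all at once.

```rust
// Toy stand-ins for illustration; these are NOT the real `erg_parser` types.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Ident(String),
    Int(i64),
    Symbol(char),
}

#[derive(Debug, Clone, PartialEq)]
pub struct LexError {
    pub pos: usize, // index (in chars) of the offending character
    pub ch: char,
}

pub struct ToyLexer {
    chars: Vec<char>,
    pos: usize,
}

impl ToyLexer {
    pub fn new(src: &str) -> Self {
        ToyLexer { chars: src.chars().collect(), pos: 0 }
    }

    // Advance while `pred` holds and return the consumed text.
    fn eat_while(&mut self, pred: impl Fn(char) -> bool) -> String {
        let start = self.pos;
        while self.pos < self.chars.len() && pred(self.chars[self.pos]) {
            self.pos += 1;
        }
        self.chars[start..self.pos].iter().collect()
    }
}

impl Iterator for ToyLexer {
    type Item = Result<Token, LexError>;

    fn next(&mut self) -> Option<Self::Item> {
        self.eat_while(char::is_whitespace);
        let start = self.pos;
        let c = *self.chars.get(start)?; // end of input ends the iteration
        if c.is_ascii_digit() {
            let text = self.eat_while(|c| c.is_ascii_digit());
            // A literal that does not fit in i64 is reported as an error.
            let parsed = text.parse::<i64>();
            return Some(parsed.map(Token::Int).map_err(|_| LexError { pos: start, ch: c }));
        }
        if c.is_alphabetic() || c == '_' {
            let text = self.eat_while(|c| c.is_alphanumeric() || c == '_');
            return Some(Ok(Token::Ident(text)));
        }
        self.pos += 1;
        if "=+-*/().".contains(c) {
            Some(Ok(Token::Symbol(c)))
        } else {
            Some(Err(LexError { pos: start, ch: c }))
        }
    }
}

/// Lex the whole input at once, stopping at the first error.
pub fn lex_all(src: &str) -> Result<Vec<Token>, LexError> {
    ToyLexer::new(src).collect()
}
```

Collecting into `Result<Vec<_>, _>` is one way to get "all tokens or the first error" from an iterator; a separate runner type could then turn the bare error into a user-facing message.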

## 2. Convert `TokenStream` -> `AST`

src: [erg_parser/parse.rs](../../../crates/erg_parser/parse.rs)

* `Parser`, like `Lexer`, has two constructors, `Parser::new` and `Parser::from_str`; `Parser::parse` returns the `AST`.
* `AST` stands for "Abstract Syntax Tree". It is a wrapper type around `Vec<Expr>`.

### 2.1 Desugaring `AST`

src: [erg_parser/desugar.rs](../../../crates/erg_parser/desugar.rs)

* expand nested vars (`Desugarer::desugar_nest_vars_pattern`)
* desugar multiple pattern definition syntax (`Desugarer::desugar_multiple_pattern_def`)

### 2.2 Reordering & Linking `AST`

src: [erg_compiler/link_ast.rs](../../../crates/erg_compiler/link_ast.rs)

* link class methods to class definitions
  * method definitions are allowed outside of the class definition file
  * the current implementation is incomplete: methods are linked only within the same file
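
As a rough illustration of the linking step, the sketch below attaches standalone method blocks to the class definition of the same name within one module. The `Item` enum and `link_methods` function are hypothetical and far simpler than the real AST; method blocks whose class is not defined in the same module are left untouched, mirroring the same-file limitation noted above.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical, simplified top-level items; not the real Erg AST.
#[derive(Debug, Clone, PartialEq)]
pub enum Item {
    ClassDef { name: String, methods: Vec<String> },
    Methods { class: String, names: Vec<String> },
    Other(String),
}

/// Move every `Methods` block into the `ClassDef` with the same name,
/// provided that class is defined in this module.
pub fn link_methods(items: Vec<Item>) -> Vec<Item> {
    // Pass 1: collect the names of classes defined in this module.
    let defined: HashSet<String> = items
        .iter()
        .filter_map(|item| match item {
            Item::ClassDef { name, .. } => Some(name.clone()),
            _ => None,
        })
        .collect();

    // Pass 2: pull out method blocks that have a matching class.
    let mut pending: HashMap<String, Vec<String>> = HashMap::new();
    let mut rest = Vec::new();
    for item in items {
        match item {
            Item::Methods { class, names } if defined.contains(&class) => {
                pending.entry(class).or_default().extend(names);
            }
            other => rest.push(other),
        }
    }

    // Pass 3: merge the collected methods into their class definitions.
    for item in rest.iter_mut() {
        if let Item::ClassDef { name, methods } = item {
            if let Some(extra) = pending.remove(name.as_str()) {
                methods.extend(extra);
            }
        }
    }
    rest
}
```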

## 3. Convert `AST` -> `HIR`

(main) src: [erg_compiler/lower.rs](../../../crates/erg_compiler/lower.rs)

### 3.1 Name Resolution

In the current implementation, name resolution is done during type checking.

* All ASTs (including imported modules) are scanned for name resolution before type inference.
* In addition to performing cycle checking and reordering, a context is created for type inference (however, most of the information on variables registered in this context is not yet finalized).
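
Cycle checking and reordering can be pictured as a depth-first topological sort over "definition uses name" edges. The sketch below is only an illustration under that assumption: the `reorder` and `visit` functions and the tuple-based input are hypothetical, and the actual algorithm in `ergc` may differ (for example, this toy rejects a definition that refers to itself, which a real compiler would handle separately for recursive functions).

```rust
use std::collections::HashMap;

enum Mark {
    Visiting,
    Done,
}

/// Order definitions so that each one comes after the definitions it uses.
/// `defs` pairs a defined name with the names its body refers to.
/// Returns an error message if the definitions depend on each other cyclically.
pub fn reorder<'a>(defs: &'a [(&'a str, Vec<&'a str>)]) -> Result<Vec<String>, String> {
    let deps: HashMap<&'a str, &'a [&'a str]> =
        defs.iter().map(|(name, uses)| (*name, uses.as_slice())).collect();
    let mut marks: HashMap<&'a str, Mark> = HashMap::new();
    let mut order = Vec::new();
    for (name, _) in defs {
        visit(*name, &deps, &mut marks, &mut order)?;
    }
    Ok(order)
}

fn visit<'a>(
    name: &'a str,
    deps: &HashMap<&'a str, &'a [&'a str]>,
    marks: &mut HashMap<&'a str, Mark>,
    order: &mut Vec<String>,
) -> Result<(), String> {
    match marks.get(name) {
        Some(Mark::Done) => return Ok(()),
        Some(Mark::Visiting) => return Err(format!("cyclic definition involving `{name}`")),
        None => {}
    }
    // Names without a definition in `defs` (e.g. built-ins) are skipped.
    let Some(&uses) = deps.get(name) else {
        return Ok(());
    };
    marks.insert(name, Mark::Visiting);
    for &used in uses {
        visit(used, deps, marks, order)?;
    }
    marks.insert(name, Mark::Done);
    order.push(name.to_string());
    Ok(())
}
```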

### 3.2 Import resolution

* When `import` is called, a new thread is created to analyze the imported module.
* The `JoinHandle` is stored in `SharedCompilerResource` and is joined when the module is needed.
* Unused modules may never be joined, but currently all such modules are analyzed anyway.
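
The spawn-now, join-later pattern described above can be sketched with the standard library alone. `ModuleTable`, `ModuleInfo`, and `Slot` are hypothetical names for this example and do not reflect the layout of `SharedCompilerResource`; the "analysis" here is just a line count standing in for real work.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

/// Toy analysis result for one module (here: just its line count).
#[derive(Debug, Clone, PartialEq)]
pub struct ModuleInfo {
    pub name: String,
    pub lines: usize,
}

enum Slot {
    Pending(JoinHandle<ModuleInfo>),
    Ready(ModuleInfo),
}

/// Hypothetical table of imported modules.
#[derive(Default)]
pub struct ModuleTable {
    slots: HashMap<String, Slot>,
}

impl ModuleTable {
    /// Called when an `import` is seen: start the analysis on a new thread
    /// and return immediately, keeping only the `JoinHandle`.
    pub fn import(&mut self, name: &str, source: String) {
        if self.slots.contains_key(name) {
            return; // already imported
        }
        let owned = name.to_string();
        let handle = thread::spawn(move || ModuleInfo {
            lines: source.lines().count(),
            name: owned,
        });
        self.slots.insert(name.to_string(), Slot::Pending(handle));
    }

    /// Called when the module is actually needed: join the thread if the
    /// analysis has not been collected yet, then cache the result.
    pub fn get(&mut self, name: &str) -> Option<ModuleInfo> {
        let info = match self.slots.remove(name)? {
            Slot::Pending(handle) => handle.join().ok()?,
            Slot::Ready(info) => info,
        };
        self.slots.insert(name.to_string(), Slot::Ready(info.clone()));
        Some(info)
    }
}
```

In this sketch a module that is imported but never requested is still analyzed on its own thread; its handle is simply never joined.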

### 3.3 Type checking & inference

src: [erg_compiler/lower.rs](../../../crates/erg_compiler/lower.rs)

* `HIR` stands for "High-level Intermediate Representation". It holds the type information of every variable.
* `ASTLowerer` can be constructed in the same way as `Parser` and `Lexer`.
* `ASTLowerer::lower` outputs either a `CompleteArtifact` or an `IncompleteArtifact`. Both have `HIR` and `LowerWarnings`; `IncompleteArtifact` also has `LowerErrors`.
* `ASTLowerer` is owned by `Compiler`. Unlike other structures (`Lexer`, `Parser`, etc.), `ASTLowerer` handles code contexts and is not discarded after a single use.
* For type inference algorithms, see [inference.md](./inference.md).
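
The complete/incomplete split can be sketched as follows. Everything here is a toy: `ToyCompleteArtifact`, `ToyIncompleteArtifact`, `Ty`, and the `lower` function are hypothetical, and modelling the two outcomes as a `Result` is an assumption of this example rather than a statement about the signature of `ASTLowerer::lower`. The point is that a partially typed HIR and its warnings survive even when errors occur.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Ty {
    Int,
    Str,
    Unknown,
}

/// Lowering succeeded: typed HIR plus warnings.
#[derive(Debug, PartialEq)]
pub struct ToyCompleteArtifact {
    pub hir: Vec<(String, Ty)>,
    pub warnings: Vec<String>,
}

/// Lowering hit errors: the partial HIR and warnings are still available.
#[derive(Debug, PartialEq)]
pub struct ToyIncompleteArtifact {
    pub hir: Vec<(String, Ty)>,
    pub warnings: Vec<String>,
    pub errors: Vec<String>,
}

/// "Lower" a list of `name = literal` definitions by assigning each a type.
pub fn lower(ast: &[(&str, &str)]) -> Result<ToyCompleteArtifact, ToyIncompleteArtifact> {
    let mut hir: Vec<(String, Ty)> = Vec::new();
    let mut warnings = Vec::new();
    let mut errors = Vec::new();
    for &(name, value) in ast {
        if hir.iter().any(|(n, _)| n.as_str() == name) {
            warnings.push(format!("`{name}` is defined more than once"));
        }
        let ty = if value.parse::<i64>().is_ok() {
            Ty::Int
        } else if value.len() >= 2 && value.starts_with('"') && value.ends_with('"') {
            Ty::Str
        } else {
            // Keep going after an error so later definitions are still checked.
            errors.push(format!("cannot infer the type of `{name}`"));
            Ty::Unknown
        };
        hir.push((name.to_string(), ty));
    }
    if errors.is_empty() {
        Ok(ToyCompleteArtifact { hir, warnings })
    } else {
        Err(ToyIncompleteArtifact { hir, warnings, errors })
    }
}
```

Keeping the HIR in the error case is what lets tooling such as an editor integration keep working on code that does not yet type-check.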

## 4. Check side-effects

src: [erg_compiler/effectcheck.rs](../../../crates/erg_compiler/effectcheck.rs)

## 5. Check ownerships

src: [erg_compiler/ownercheck.rs](../../../crates/erg_compiler/ownercheck.rs)

## 6. Optimize `HIR`

src: [erg_compiler/optimize.rs](../../../crates/erg_compiler/optimize.rs)

* Eliminate dead code (unused variables, imports, etc.)
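
A minimal sketch of unused-variable elimination, assuming a straight-line program in which each name is defined once, definitions come before their uses, and the right-hand side of a definition has no side effects. The `Stmt` type and the `def`, `effect`, and `eliminate_dead_defs` helpers are hypothetical and much simpler than the real HIR.

```rust
use std::collections::HashSet;

// Hypothetical, simplified statements; not the real HIR.
#[derive(Debug, Clone, PartialEq)]
pub enum Stmt {
    /// `name = <expr>`, where `uses` lists the names read by the expression.
    Def { name: String, uses: Vec<String> },
    /// A statement that must be kept (e.g. a call with side effects).
    Effect { uses: Vec<String> },
}

pub fn def(name: &str, uses: &[&str]) -> Stmt {
    Stmt::Def {
        name: name.to_string(),
        uses: uses.iter().map(|s| s.to_string()).collect(),
    }
}

pub fn effect(uses: &[&str]) -> Stmt {
    Stmt::Effect { uses: uses.iter().map(|s| s.to_string()).collect() }
}

/// Drop definitions whose value is never (transitively) needed by an effect.
pub fn eliminate_dead_defs(stmts: Vec<Stmt>) -> Vec<Stmt> {
    let mut live: HashSet<String> = HashSet::new();
    let mut kept = Vec::new();
    // Walk backwards so every use is seen before the definition it refers to.
    for stmt in stmts.into_iter().rev() {
        match &stmt {
            Stmt::Effect { uses } => live.extend(uses.iter().cloned()),
            Stmt::Def { name, uses } => {
                if !live.contains(name) {
                    continue; // nothing live reads this variable: dead
                }
                live.extend(uses.iter().cloned());
            }
        }
        kept.push(stmt);
    }
    kept.reverse();
    kept
}
```

Because liveness propagates backwards, a variable that is only read by another dead variable is removed in the same pass.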

## 7. Link

src: [erg_compiler/link_hir.rs](../../../crates/erg_compiler/link_hir.rs)

* Load all modules, resolve dependencies, and combine into a single HIR

## 8. Desugar `HIR`

src: [erg_compiler/desugar_hir.rs](../../../crates/erg_compiler/desugar_hir.rs)

Convert parts that are not consistent with Python syntax:

* Convert class member variables to functions

## 9. Generate Bytecode (`CodeObj`) from `HIR`

src: [erg_compiler/codegen.rs](../../../crates/erg_compiler/codegen.rs)