Presently, while generalizing type variables, we check variables introduced at a scope for redundancy (that is, whether they are not the root of some unified set of variables). If a variable is redundant, its rank is not adjusted. I believe the current logic to be the following:

- Each root of a unification tree will be introduced at some point, exactly once. Its point of introduction will determine the rank of the tree it's the root of.
- If a variable is redundant, all of its redundant usages must be at the same rank (assuming let generalization proceeds correctly), so there is no need to adjust their rank as well.
- As such, there is no need to adjust the rank of redundant variables, as a performance optimization.

I believe this to be a hold-over from the original version of the solver derived from the elm-compiler. In our implementation, however, rank adjustment is very cheap (thanks to SoA, ranks are likely in the cache lines already anyway because we just adjusted variables at this point).

However, there is a larger problem here: ranks must be adjusted for redundant variables as we begin to support weakened type variables. The motivating case is

```
\x -> when x is
    _x -> Green
```

We would like this code generalized as `* -> [Green]*`. `when` expressions have each branch solved via let-bindings; in particular, for the singleton branch we introduce `_x` of the appropriate type and solve the body as `[Green]*`. Today, `[Green]*` would be generalized in the context of the inner scope that binds `_x`, which means it is generalized in the body `\x -> ...` as a whole. However, with weakening, we do not want this behavior! In particular, we do not want to actually generalize `_x` in the context of the branch body. Doing so means you could write things like

```
main = \{} -> when Red is
    x ->
        y : [Red]
        y = x

        z : [Red, Green]
        z = x

        {y, z}
```

which is exactly the kind of spurious generalization that the weakening design is trying to avoid.

So, we want to introduce `[Green]*` at the rank of the body `\x -> ...`; let's call this `rank_body`, and let's say `[Green]*` is introduced as `branch_var`. Let's say the return type variable is `ret_var`. Now we must be careful. If after unification `ret_var ~ branch_var` we have that `branch_var` becomes the root, then despite `ret_var` (and `branch_var`) being at `rank_body` (which is also the rank that will be promoted to generalization), the tree given by `branch_var` won't be generalized, because `ret_var` will be seen as redundant! In fact it is redundant, because `branch_var` was introduced previously, but that doesn't matter: we want the variable to be generalized at the level of the outer let-binding `main = \{} -> ...`.

This problem is not unique to when-branches; for example, we can observe the same symptom with

```
main = \{} ->
    x = Green
    x
```

where we'd like `x` to not be generalized inside the body of `main`, but have it be generalized relative to the body of `main` (that is, `main` should have signature `{} -> [Green]*`, but you cannot use `x` itself polymorphically inside the body of `main`).

As such, the easiest solution as far as I can see, in the presence of weakening, is to allow rank-adjustment and generalization of redundant variables if they are permitted to be generalized relative to a lower scope. This should preserve soundness; the main soundness concern in rank-based let generalization is making sure that something like

```
\x ->
    y = \z -> x z
    y
```

has type `(a -> b) -> (a -> b)` and not e.g. `(a -> b) -> (c -> d)` due to `x` being instantiated at a higher rank in `y = ...` than it actually is. Note that this change cannot affect this case at all, since we are still doing the rank-adjustment pass at higher ranks, unifying lower-ranked variables to their minimum relative rank, and introduction only happens in the lower-ranked scopes.
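The `ret_var ~ branch_var` situation described above can be sketched with a toy union-find store. This is only an illustration under loud assumptions: `Store`, `fresh`, `unify`, `generalize`, and the use of rank `0` as the "generalized" marker are hypothetical names and choices made for this sketch, not the actual types or functions in the solver. It shows that when the pool for a scope contains only the redundant `ret_var`, skipping redundant variables leaves the merged set ungeneralized, while processing them through their root generalizes it.

```rust
// Toy model only: every name below is hypothetical and chosen for this
// illustration; it does not mirror the real solver's data structures.

/// Assumption for this sketch: rank 0 marks a generalized variable.
const GENERALIZED: u32 = 0;

struct Store {
    parent: Vec<usize>, // union-find parent links; a root points to itself
    rank: Vec<u32>,     // rank per variable; only the root's rank is meaningful
}

impl Store {
    fn new() -> Self {
        Store { parent: Vec::new(), rank: Vec::new() }
    }

    /// Introduce a fresh variable at `rank`.
    fn fresh(&mut self, rank: u32) -> usize {
        let var = self.parent.len();
        self.parent.push(var);
        self.rank.push(rank);
        var
    }

    /// Follow parent links to the root of `var`'s set.
    fn root(&self, mut var: usize) -> usize {
        while self.parent[var] != var {
            var = self.parent[var];
        }
        var
    }

    /// A variable is redundant when it is not the root of its set.
    fn is_redundant(&self, var: usize) -> bool {
        self.parent[var] != var
    }

    /// Merge the set of `from` into the set of `into`; the surviving root
    /// keeps the minimum rank of the two sets.
    fn unify(&mut self, from: usize, into: usize) {
        let (a, b) = (self.root(from), self.root(into));
        if a != b {
            self.rank[b] = self.rank[a].min(self.rank[b]);
            self.parent[a] = b;
        }
    }

    fn rank_of(&self, var: usize) -> u32 {
        self.rank[self.root(var)]
    }

    /// Generalize the variables in `pool` (those introduced at `scope_rank`).
    /// With `skip_redundant`, non-root variables are ignored entirely.
    fn generalize(&mut self, pool: &[usize], scope_rank: u32, skip_redundant: bool) {
        for &var in pool {
            if skip_redundant && self.is_redundant(var) {
                continue;
            }
            let root = self.root(var);
            if self.rank[root] == scope_rank {
                self.rank[root] = GENERALIZED;
            }
        }
    }
}

/// Returns whether the merged `ret_var ~ branch_var` set ends up generalized.
fn ret_is_generalized(skip_redundant: bool) -> bool {
    let rank_body = 2;
    let mut store = Store::new();
    let branch_var = store.fresh(rank_body); // introduced first, for the branch
    let ret_var = store.fresh(rank_body); // introduced in the body's pool
    store.unify(ret_var, branch_var); // branch_var becomes the root
    store.generalize(&[ret_var], rank_body, skip_redundant);
    store.rank_of(ret_var) == GENERALIZED
}
```

In this sketch, `ret_is_generalized(true)` models the current behavior of skipping redundant variables, and `ret_is_generalized(false)` models the proposed behavior of also considering them.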
Roc Internals
Roc has different rust crates for various binaries and libraries. Their roles are briefly described below. If you'd like to learn more, have any questions, or suspect something is out of date, please start a discussion on the Roc Zulip!
You can use `cargo doc` to generate docs for a specific package, e.g. `cargo doc --package roc_ast --open`.
ast/
- roc_ast
Code to represent the Abstract Syntax Tree as used by the editor. In contrast to the compiler, the types in this AST do not keep track of the location of the matching code in the source file.
cli/
- roc_cli
The `roc` binary that brings together all functionality in the Roc toolset.
cli_utils/
- cli_utils
Provides shared code for cli tests and benchmarks.
code_markup/
- roc_code_markup
A markup language to display Roc code in the editor.
compiler/
Compiles `.roc` files and combines them with their platform into an executable binary. See compiler/README.md for more information.
TODO explain what "compiler frontend" is
TODO explain what "compiler backend" is
The compiler includes the following sub-crates:
- `roc_alias_analysis` Performs analysis and optimizations to remove unneeded reference counts at runtime, and supports in-place mutation.
- `arena_pool` An implementation of an arena allocator designed for the compiler's workloads.
- `roc_build` Responsible for coordinating building and linking of a Roc app with its host.
- `roc_builtins` provides the Roc functions and modules that are implicitly imported into every module. See README.md for more information.
- `roc_can` Canonicalize a roc abstract syntax tree, resolving symbols, re-ordering definitions, and preparing a module for type inference.
- `roc_collections` Domain-specific collections created for the needs of the compiler.
- `roc_constrain` Responsible for building the set of constraints that are used during type inference of a program, and for gathering context needed for pleasant error messages when a type error occurs.
- `roc_debug_flags` Environment variables that can be toggled to aid debugging of the compiler itself.
- `roc_derive` provides auto-derivers for builtin abilities like `Hash` and `Decode`.
- `roc_exhaustive` provides exhaustiveness checking for Roc.
- `roc_fmt` The roc code formatter.
- `roc_gen_dev` provides the compiler backend to generate Roc binaries fast, for a nice developer experience. See README.md for more information.
- `roc_gen_llvm` provides the LLVM backend to generate Roc binaries. Used to generate a binary with the fastest possible execution speed.
- `roc_gen_wasm` provides the WASM backend to generate Roc binaries. See README.md for more information.
- `roc_ident` Implements data structures used for efficiently representing small strings, like identifiers.
- `roc_intern` provides generic interners for concurrent and single-thread use cases.
- `roc_late_solve` provides type unification and solving primitives from the perspective of the compiler backend.
- `roc_load` Used to load a .roc file and coordinate the compiler pipeline, including parsing, type checking, and code generation.
- `roc_load_internal` The internal implementation of roc_load, separate from roc_load to support caching.
- `roc_module` Implements data structures used for efficiently representing unique modules and identifiers in Roc programs.
- `roc_mono` Roc's main intermediate representation (IR), which is responsible for monomorphization, defunctionalization, inserting ref-count instructions, and transforming a Roc program into a form that is easy to consume by a backend.
- `roc_parse` Implements the Roc parser, which transforms a textual representation of a Roc program to an abstract syntax tree.
- `roc_problem` provides types to describe problems that can occur when compiling `.roc` code.
- `roc_region` Data structures for storing source-code-location information, used heavily for contextual error messages.
- `roc_target` provides types and helpers for compiler targets such as `default_x86_64`.
- `roc_serialize` provides helpers for serializing and deserializing to/from bytes.
- `roc_solve` The entry point of Roc's type inference system. Implements type inference and specialization of abilities.
- `roc_solve_problem` provides types to describe problems that can occur during solving.
- `roc_str` provides `Roc` styled collection reference counting. See README.md for more information.
- `test_derive` Tests Roc's auto-derivers.
- `test_gen` contains all of Roc's code generation tests. See README.md for more information.
- `test_mono` Tests Roc's generation of the mono intermediate representation.
- `test_mono_macros` Macros for use in `test_mono`.
- `roc_types` Various representations and utilities for dealing with types in the Roc compiler.
- `roc_unify` Implements Roc's unification algorithm, the heartstone of Roc's type inference.
docs/
- roc_docs
Generates HTML documentation from Roc files. Used for roc-lang.org/builtins/Num.
docs_cli/
- roc_docs_cli library and roc-docs binary
Provides a binary that is only used for static build servers.
editor/
- roc_editor
Roc's editor. See README.md for more information.
error_macros/
- roc_error_macros
Provides macros for consistent reporting of errors in Roc's rust code.
glue/
- roc_glue
The `roc_glue` crate generates code needed for platform hosts to communicate with Roc apps. This tool is not necessary for writing a platform in another language; however, it's a great convenience! Currently supports Rust platforms, and the plan is to support any language via a plugin model.
highlight/
- roc_highlight
Provides syntax highlighting for the editor by transforming a string to markup nodes.
linker/
- roc_linker
Surgical linker that links platforms to Roc applications. We created our own linker for performance, since regular linkers add complexity that is not needed for linking Roc apps. Because we want `roc` to manage the build system and final linking of the executable, it is significantly less practical to use a regular linker. See README.md for more information.
repl_cli/
- roc_repl_cli
Command Line Interface (CLI) functionality for the Read-Evaluate-Print-Loop (REPL).
repl_eval/
- roc_repl_eval
Provides the functionality for the REPL to evaluate Roc expressions.
repl_expect/
- roc_repl_expect
Supports evaluating `expect` and printing contextual information when they fail.
repl_test/
- repl_test
Tests the roc REPL.
repl_wasm/
- roc_repl_wasm
Provides a build of the REPL for the Roc website using WebAssembly. See README.md for more information.
reporting/
- roc_reporting
Responsible for generating warning and error messages.
roc_std/
- roc_std
Provides Rust representations of Roc data structures.
test_utils/
- roc_test_utils
Provides testing utility functions for use throughout the Rust code base.
tracing/
- roc_tracing
Provides tracing utility functions for various executable entry points.
utils/
- roc_utils
Provides utility functions used all over the code base.
vendor/
These are files that were originally obtained somewhere else (e.g. crates.io) but which we needed to fork for some Roc-specific reason. See README.md for more information.
wasi-libc-sys/
- wasi_libc_sys
Provides a Rust wrapper for the WebAssembly test platform built on libc and is primarily used for testing purposes.
Building a Roc Application
Below is a simplified diagram to illustrate how a Roc application and host are combined to build an executable file.
Roc Compiler Stages
Below is a simplified diagram to illustrate the different stages of the Roc Compiler.