Roc Interpreter Design: Type-Carrying Architecture
Table of Contents
- Design Goals
- The Fundamental Challenge
- Current Design Problems
- Proposed Architecture: Type-Carrying Interpreter
- Implementation Details
- Examples and Walkthroughs
- Performance Considerations
- Critical Design Issue: Cross-Module Type Variables
- Migration Path
- Alternative Approaches Considered
- Conclusion
Design Goals
The Roc interpreter must satisfy several competing requirements:
- Unboxed Values: For performance, we need compact runtime representations. An `i16` should be 2 bytes on the stack, not a pointer to a heap-allocated object.
- Runtime Polymorphism: Support polymorphic functions without compile-time monomorphization. The function `id = \x -> x` should have one runtime representation that works for all types.
- Cross-Module Polymorphism: Polymorphic functions must work correctly across module boundaries, preserving type relationships.
- Lazy Evaluation: The interpreter should do work lazily, deferring type specialization until necessary (unlike ahead-of-time monomorphization for optimized builds).
- Zero-Cost for Monomorphic Code: Code that doesn't use polymorphism shouldn't pay any runtime penalty.
The Fundamental Challenge
The core tension in our design comes from wanting both:
- Concrete, unboxed values for performance
- Abstract type information for polymorphism
Consider this example:
# Module A
id : a -> a
id = \x -> x
# Module B
result = A.id(42) # Should return 42 : i64
The challenge: How does the interpreter know that A.id should return an i64 when called with 42? The function itself is polymorphic - it works for any type. We need to track that "the return type equals the argument type" at runtime.
Current Design Problems
The current interpreter converts types to "layouts" (memory representations) too early:
// Current design
pub const Layout = union(enum) {
i64_layout,
f64_layout,
record_layout: RecordData,
// ... etc
};
pub const StackValue = struct {
ptr: ?*anyopaque,
layout: Layout, // <-- Problem: too concrete!
};
Why Layouts Are Too Concrete
When we convert a type variable to a layout, we lose critical information:
# Type information:
# id : Var(1) -> Var(1)
# (same variable used twice = input type equals output type)
# After conversion to layouts:
# id : ? -> ?
# (lost the connection between input and output!)
Layouts tell us "how many bytes" and "what memory representation" but not "what type relationships exist". This is why cross-module polymorphic functions fail - we can't propagate type constraints through layouts.
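To see concretely what layouts throw away, here is a minimal Python model (illustrative only, not the real implementation): when the same type variable object appears in both the parameter and return position, constraining one constrains the other - a link that a pair of layouts cannot express.

```python
class TypeVar:
    """A unification variable: either unbound or linked to a concrete type."""
    def __init__(self):
        self.link = None  # None = unbound; str = concrete; TypeVar = alias

    def resolve(self):
        t = self
        while isinstance(t.link, TypeVar):  # follow alias chains
            t = t.link
        return t.link  # concrete type name, or None if still unbound

    def unify(self, concrete):
        self.link = concrete

# id : Var(1) -> Var(1) — the SAME variable object appears twice
var1 = TypeVar()
id_type = (var1, var1)  # (param type, return type)

# Calling id(42 : i64): unify the parameter with i64 ...
id_type[0].unify("i64")
# ... and the return type resolves to i64 automatically,
# because param and return are the same variable.
print(id_type[1].resolve())  # i64
```

A pair of layouts `(8 bytes, 8 bytes)` could never express "whatever goes in comes out"; the shared variable does.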
The Band-Aid Approach
The current interpreter tries to work around this with ad-hoc solutions:
- `TypeScope` for tracking some type variable mappings
- `LayoutWitness` for carrying extra type information
- Special-case handling for certain patterns
These band-aids don't compose well and fail for complex cross-module scenarios.
Proposed Architecture: Type-Carrying Interpreter
Core Concept: Parallel Stacks
Instead of each value carrying a layout, we maintain two parallel stacks:
pub const Interpreter = struct {
// Stack 1: Raw, unboxed values (just bytes)
value_stack: std.ArrayList(u8),
// Start offset of each logical value within value_stack
value_offsets: std.ArrayList(usize),
// Stack 2: Type information (as type variables)
type_stack: std.ArrayList(types.Var),
// Runtime type store (separate from compile-time types)
runtime_types: *types.Store,
// ... other fields
};
Key Invariant: For every value on the value stack, there's a corresponding type variable at the same logical index on the type stack.
How Values and Types Stay Synchronized
// Pushing a value:
fn pushValue(self: *Interpreter, bytes: []const u8, type_var: types.Var) !void {
// Values have different byte sizes, so the raw lengths of the two stacks
// can't be compared; instead we record each logical value's start offset.
try self.value_offsets.append(self.value_stack.items.len);
try self.value_stack.appendSlice(bytes);
try self.type_stack.append(type_var);
// Invariant: exactly one type var per logical value
assert(self.value_offsets.items.len == self.type_stack.items.len);
}
// Getting a value's type (by logical index):
fn getValueType(self: *Interpreter, value_index: usize) types.Var {
return self.type_stack.items[value_index];
}
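The same discipline, sketched as a small Python model (names like `push_value` are illustrative): the value stack grows by a variable number of bytes per push, while the type stack grows by exactly one entry, so logical indices are tracked via per-value offsets.

```python
class ParallelStacks:
    def __init__(self):
        self.value_bytes = bytearray()  # raw, unboxed bytes
        self.value_offsets = []         # start offset of each logical value
        self.type_stack = []            # one type var per logical value

    def push_value(self, data: bytes, type_var: int):
        self.value_offsets.append(len(self.value_bytes))
        self.value_bytes.extend(data)
        self.type_stack.append(type_var)
        # Invariant: one type var per logical value
        assert len(self.value_offsets) == len(self.type_stack)

    def get_value_type(self, index: int) -> int:
        return self.type_stack[index]

stacks = ParallelStacks()
stacks.push_value((42).to_bytes(8, "little"), type_var=50)  # 42 : Var(50), i64
stacks.push_value((7).to_bytes(2, "little"), type_var=51)   # 7 : Var(51), i16
print(stacks.get_value_type(0), stacks.get_value_type(1))  # 50 51
```

Note that the second value occupies only 2 bytes, yet both values have exactly one type-stack entry each.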
Type Variables Instead of Layouts
The crucial difference is that we keep type variables (which preserve relationships) instead of layouts (which don't):
// OLD: Lost polymorphic relationships
const OldStackValue = struct {
ptr: ?*anyopaque,
layout: Layout, // "this is 8 bytes, interpreted as i64"
};
// NEW: Preserves polymorphic relationships
const NewRuntimeValue = struct {
ptr: ?*anyopaque,
type_var: types.Var, // "this is type variable #42, which might be related to other type variables"
};
Implementation Details
The Runtime Type Store
Each interpreter instance has its own runtime type store, separate from the compile-time type store:
pub const RuntimeTypeContext = struct {
// The runtime type store - separate from compile-time types
types: *types.Store,
// Stack of type scopes for nested polymorphic contexts
scopes: []TypeScope,
// Maps polymorphic type variables to their concrete instantiations
instantiation_map: types.VarMap,
// Cache of type variable -> layout conversions
layout_cache: std.AutoHashMap(types.Var, Layout),
};
When We Convert Types to Layouts
We only convert from type variables to layouts when we actually need memory information:
fn getLayoutWhenNeeded(self: *Interpreter, type_var: types.Var) !Layout {
// Check cache first
if (self.layout_cache.get(type_var)) |cached_layout| {
return cached_layout;
}
// Resolve the type variable
const resolved = self.runtime_types.resolveVar(type_var);
// Convert to layout
const layout = try self.typeToLayout(resolved.desc.content);
// Cache for next time
try self.layout_cache.put(type_var, layout);
return layout;
}
Operations that need layouts:
- Memory allocation: How many bytes to reserve?
- Field access: What offset for this field?
- Value copying: How many bytes to copy?
- FFI calls: C functions need concrete types
Operations that DON'T need layouts:
- Function calls: Just pass type variables through
- Local bindings: Just reference stack positions
- Returns: Type variable goes back to caller
- Pattern matching: Compare type variables directly
Polymorphic Function Calls
Here's how a polymorphic function call works in detail:
fn handlePolymorphicCall(self: *Interpreter,
func_type_var: types.Var,
arg_type_vars: []types.Var) !types.Var {
// 1. Get the function's type
const func_resolved = self.runtime_types.resolveVar(func_type_var);
const func_type = func_resolved.desc.content.structure.func;
// 2. Create a fresh instantiation of the polymorphic type vars,
// so this call site can't leak constraints into other call sites
const fresh = try self.instantiatePolyVars(func_type);
// 3. Unify argument types with the fresh parameter types
for (arg_type_vars, fresh.param_vars) |arg_var, param_var| {
try self.unifyTypes(param_var, arg_var);
}
// 4. The fresh return type is now correctly constrained!
return fresh.return_var;
}
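The call protocol - instantiate fresh, unify, read back the return type - can be sketched as a small executable Python model (the `Store` class and its methods are illustrative, not the real API):

```python
class Store:
    """A tiny type store: vars are ints; int contents act as redirects."""
    def __init__(self):
        self.content = []  # None = unbound flex var; str = concrete; int = redirect

    def fresh(self):
        self.content.append(None)
        return len(self.content) - 1

    def concrete(self, name):
        self.content.append(name)
        return len(self.content) - 1

    def resolve(self, var):
        while isinstance(self.content[var], int):  # follow redirects
            var = self.content[var]
        return var

    def unify(self, a, b):
        a, b = self.resolve(a), self.resolve(b)
        if a == b:
            return
        ca, cb = self.content[a], self.content[b]
        if ca is None:
            self.content[a] = b  # redirect the unbound var
        elif cb is None:
            self.content[b] = a
        elif ca != cb:
            raise TypeError(f"cannot unify {ca} with {cb}")

def call_polymorphic(store, param_vars, return_var, arg_vars):
    # Unify each argument's type with the matching parameter variable;
    # the return variable is constrained through shared variables.
    for param, arg in zip(param_vars, arg_vars):
        store.unify(param, arg)
    return return_var

store = Store()
a = store.fresh()  # id : a -> a, with `a` instantiated fresh for this call
ret = call_polymorphic(store, [a], a, [store.concrete("i64")])
print(store.content[store.resolve(ret)])  # i64
```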
Type Unification at Runtime
We reuse the compile-time unification algorithm, but on the runtime type store:
fn unifyTypes(self: *Interpreter, var1: types.Var, var2: types.Var) !void {
// Use the same unification algorithm as compile-time
const result = try unify.unifyWithContext(
self.module_env,
self.runtime_types, // Runtime types, not compile-time!
self.unify_scratch,
var1,
var2,
false
);
if (!result.isOk()) {
return error.RuntimeTypeError;
}
}
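The failure path matters as much as the success path: when unification fails, the interpreter surfaces a runtime type error instead of silently corrupting values. A hypothetical Python sketch of the conflict check:

```python
def unify_concrete(subst, var, type_name):
    # Bind an unbound variable, or reject a conflicting binding.
    bound = subst.get(var)
    if bound is None:
        subst[var] = type_name  # first binding wins
    elif bound != type_name:
        raise TypeError(f"Var({var}) is {bound}, cannot also be {type_name}")

subst = {}
unify_concrete(subst, 1, "i64")      # Var(1) := i64
try:
    unify_concrete(subst, 1, "Str")  # conflicting binding
except TypeError as e:
    print("runtime type error:", e)
```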
Examples and Walkthroughs
Example 1: Simple Polymorphic Identity
id = \x -> x
result = id(42)
Step-by-step execution:
1. Define `id`:
   - Type: `Var(1) -> Var(1)` (input and output share the same type variable)
   - No execution yet (lazy)
2. Call `id(42)`:
   - Push `42` onto the value stack (as raw bytes: `0x0000002A`)
   - Push `Var(50)` onto the type stack (where `Var(50) = i64`)
   - Call `id`
3. Inside `id`:
   - Parameter expects `Var(1)`
   - Unify: `Var(1) := Var(50)` (now `Var(1)` means `i64`)
   - Body just returns the parameter
   - Return type is `Var(1)`, which now resolves to `i64`
4. Return:
   - Value stack: still has `0x0000002A`
   - Type stack: `Var(1)` (which resolves to `i64`)
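The walkthrough above can be replayed as a tiny executable model (a hypothetical sketch; a plain dict stands in for the runtime type store, with ints as redirects to other vars):

```python
subst = {}  # runtime substitution: var number -> concrete type or another var

def resolve(v):
    while isinstance(subst.get(v), int):  # follow redirects
        v = subst[v]
    return subst.get(v)

# Step 2: push 42 with a fresh Var(50) bound to i64
value_stack = [(42).to_bytes(4, "little")]  # raw bytes
type_stack = [50]                           # parallel type var
subst[50] = "i64"

# Step 3: inside id, the parameter is Var(1); unify Var(1) := Var(50)
subst[1] = 50

# Step 4: the return type Var(1) now resolves to i64,
# while the value stack still holds the untouched raw bytes
print(resolve(1))  # i64
```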
Example 2: Cross-Module Polymorphism
# Module A
getFirst : {first: a, second: b} -> a
getFirst = \record -> record.first
# Module B
point = {first: 42, second: "hello"}
result = A.getFirst(point) # Should return 42
Step-by-step execution:
1. Module B creates `point`:
   - Value stack: `[42, ptr_to_"hello"]`
   - Type stack: `[Var(100), Var(101)]`, where:
     - `Var(100) = i64`
     - `Var(101) = Str`
   - Record type: `Var(102) = {first: Var(100), second: Var(101)}`
2. Call `A.getFirst(point)`:
   - Push the record onto the value stack
   - Push `Var(102)` onto the type stack
3. Inside `A.getFirst`:
   - Parameter expects `Var(200) = {first: Var(201), second: Var(202)}`
   - Unify `Var(200)` with `Var(102)`:
     - `Var(201) := Var(100)` (first field is `i64`)
     - `Var(202) := Var(101)` (second field is `Str`)
   - Return type is `Var(201)`, which resolves to `i64`
4. Field access `record.first`:
   - Only NOW do we need a layout (for the field offset)
   - Convert `Var(102)` to a layout: `{first: offset=0, second: offset=8}`
   - Read 8 bytes at offset 0
   - Return with type `Var(201)` (resolves to `i64`)
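The record unification in step 3 can be replayed in a tiny Python model (variable numbers follow the walkthrough; the dict encoding, with ints as redirects, is illustrative):

```python
subst = {}  # runtime substitution: var number -> concrete type or another var

def resolve(v):
    # Follow redirects until we reach a concrete type (or an unbound var)
    while isinstance(subst.get(v), int):
        v = subst[v]
    return subst.get(v)

# Module B's point: {first: Var(100) = i64, second: Var(101) = Str}
subst[100] = "i64"
subst[101] = "Str"
# Unify parameter record {first: Var(201), second: Var(202)}
# with argument record {first: Var(100), second: Var(101)}:
subst[201] = 100  # Var(201) := Var(100)
subst[202] = 101  # Var(202) := Var(101)
# getFirst's return type Var(201) now resolves to i64
print(resolve(201), resolve(202))  # i64 Str
```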
Example 3: Higher-Order Functions
# Module A
apply : (a -> b), a -> b
apply = \f, x -> f(x)
# Module B
double = \n -> n * 2
result = A.apply(double, 21) # Should return 42
Type flow:
1. `double` has type `Var(300) -> Var(301)`, with the constraint `Var(300) = Var(301) = Num *`
2. `A.apply` expects `(Var(400) -> Var(401)), Var(400) -> Var(401)`
3. Unify at the call site:
   - `(Var(400) -> Var(401))` with `(Var(300) -> Var(301))`
   - `Var(400)` with `i64` (from the literal `21`)
4. Constraints propagate:
   - `Var(400) = Var(300) = i64`
   - `Var(401) = Var(301) = i64`
5. The return type `Var(401)` resolves to `i64`
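The propagation in the type flow above can be replayed in a small Python model (variable numbers follow the list; the dict encoding, with ints as redirects, is illustrative):

```python
subst = {}  # runtime substitution: var number -> concrete type or another var

def resolve(v):
    while isinstance(subst.get(v), int):  # follow redirect chains
        v = subst[v]
    return subst.get(v)

# double : Var(300) -> Var(301), with Var(301) := Var(300)
subst[301] = 300
# Unify apply's function parameter (Var(400) -> Var(401)) with double's type:
subst[400] = 300   # Var(400) := Var(300)
subst[401] = 301   # Var(401) := Var(301)
# Unify Var(400) with i64 (from the literal 21): binds the root, Var(300)
subst[300] = "i64"
# The return type Var(401) now resolves through 301 -> 300 -> i64
print(resolve(401))  # i64
```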
Performance Considerations
Memory Overhead
The type-carrying design adds:
- Type stack: one `Var` (4 bytes) per value on the stack
- Runtime type store: grows with program complexity
- Layout cache: bounded by the number of unique types
For typical programs, this overhead is minimal compared to the values themselves.
Performance Optimizations
1. Type Erasure for Primitives:
   // Don't create new type vars for known primitives
   if (isLiteralI64(expr)) {
       return self.cached_i64_var; // Reuse the same var
   }
2. Inline Monomorphic Calls:
   // Fast path for non-polymorphic functions
   if (!func_type.isPolymorphic()) {
       return self.callMonomorphic(func, args); // Skip unification
   }
3. Layout Cache:
   - Cache type variable → layout conversions
   - Most types repeat, so the cache hit rate is high
4. Lazy Unification:
   - Only unify when types actually interact
   - Many type variables never need resolution
Comparison with Current Design
| Aspect | Current Design | Type-Carrying Design |
|---|---|---|
| Value representation | Unboxed ✓ | Unboxed ✓ |
| Memory per value | ptr + layout | ptr + type var |
| Polymorphic calls | Often broken ✗ | Always correct ✓ |
| Cross-module | Fails ✗ | Works ✓ |
| Type resolution | At call boundaries | On-demand |
| Complexity | Ad-hoc workarounds | Systematic approach |
Critical Design Issue: Cross-Module Type Variables
The Problem
Type variables are just indices into a type store. When Module B calls a function from Module A:
- Module A's function has type `Var(42) -> Var(42)` in Module A's type store
- Module B can't just use `Var(42)` - that refers to something different in Module B's store!
Solution: Single Runtime Type Store
The key insight is that the runtime has ONE unified type store, not per-module stores:
pub const Interpreter = struct {
// ONE runtime type store for ALL modules
runtime_types: *types.Store,
// Maps (module_id, compile_time_var) -> runtime_var
var_translation: std.AutoHashMap(VarKey, types.Var),
};
const VarKey = struct {
module_id: ModuleId,
compile_var: types.Var,
};
Type Variable Translation
When we encounter a type from any module, we translate it to the runtime store:
fn translateTypeVar(self: *Interpreter,
module: *ModuleEnv,
compile_var: types.Var) !types.Var {
const key = VarKey{
.module_id = module.id,
.compile_var = compile_var
};
// Check if we've already translated this var
if (self.var_translation.get(key)) |runtime_var| {
return runtime_var;
}
// Copy the type structure from module's store to runtime store
const compile_resolved = module.types.resolveVar(compile_var);
const runtime_var = try self.copyTypeToRuntime(module, compile_resolved.desc.content);
// Cache the translation
try self.var_translation.put(key, runtime_var);
return runtime_var;
}
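The translation cache can be modeled in a few lines of Python (all names hypothetical). The key point: the cache is keyed on (module, var), so the same numeric var from two different modules maps to two distinct runtime vars, while repeated translations of the same var hit the cache.

```python
class RuntimeStore:
    def __init__(self):
        self.vars = []         # ONE runtime store for ALL modules
        self.translation = {}  # (module_id, compile_var) -> runtime_var

    def fresh(self, content=None):
        self.vars.append(content)
        return len(self.vars) - 1

    def translate(self, module_id, compile_var, compile_content):
        key = (module_id, compile_var)
        if key in self.translation:  # already translated: reuse
            return self.translation[key]
        runtime_var = self.fresh(compile_content)
        self.translation[key] = runtime_var
        return runtime_var

rt = RuntimeStore()
# Var(42) means different things in modules A and B:
a42 = rt.translate("A", 42, "flex")  # A's Var(42) -> a fresh runtime var
b42 = rt.translate("B", 42, "i64")   # B's Var(42) -> a DIFFERENT runtime var
print(a42, b42, rt.translate("A", 42, "flex"))  # 0 1 0 (third call is a cache hit)
```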
Deep Copying Type Structure
When translating types, we must recursively translate all referenced type variables:
fn copyTypeToRuntime(self: *Interpreter,
module: *ModuleEnv,
content: types.Content) !types.Var {
switch (content) {
.structure => |s| switch (s) {
.func => |func| {
// Recursively translate parameter and return types
const runtime_params = try self.translateVarList(module, func.params);
const runtime_return = try self.translateTypeVar(module, func.return_var);
// Create new function type in runtime store
return self.runtime_types.mkFunc(runtime_params, runtime_return);
},
.record => |record| {
// Translate all field types
var runtime_fields = ArrayList(RecordField).init(self.allocator);
for (record.fields) |field| {
const runtime_field_type = try self.translateTypeVar(module, field.type_var);
try runtime_fields.append(.{
.name = field.name,
.type_var = runtime_field_type,
});
}
return self.runtime_types.mkRecord(runtime_fields.items);
},
// ... handle other type structures
},
.flex_var => {
// Create a fresh type variable in runtime store
return self.runtime_types.fresh();
},
// ... handle other content types
}
}
Example: Cross-Module Function Call
# Module A (compile-time type store)
id : Var(42) -> Var(42) # in Module A's store
# Module B calling A.id(100)
What happens:
1. Module B prepares to call `A.id`:
   // Translate A.id's type into the runtime store
   const a_id_type = module_a.getFunctionType("id"); // Var(42) -> Var(42)
   const runtime_type = translateTypeVar(module_a, a_id_type);
   // Returns fresh Var(1000) -> Var(1000) in the runtime store
2. Push the argument with its runtime type:
   value_stack.push(100); // The actual value
   type_stack.push(Var(2000)); // Fresh var in the runtime store for i64
3. Unification happens in the runtime store:
   // Unify Var(1000) (parameter) with Var(2000) (argument)
   runtime_types.unify(Var(1000), Var(2000));
   // Now Var(1000) = i64 in the runtime store
4. The return type is correctly resolved:
   // Return type is Var(1000), which now resolves to i64
Alternative Design: Module-Local Translation
Another approach would be to keep per-module type stores but maintain translation tables:
pub const InterModuleTypeMap = struct {
// Maps (from_module, from_var, to_module) -> to_var
translations: HashMap(TranslationKey, types.Var),
fn translateVar(self: *@This(),
from: *ModuleEnv,
to: *ModuleEnv,
type_var: types.Var) !types.Var {
// ... translation logic
}
};
Why we prefer the single runtime store:
- Simpler conceptually (one source of truth)
- No need to translate between every module pair
- Unification naturally works across all modules
- Easier to debug (can dump one store to see all types)
Performance Implications
The translation overhead is minimal:
- Translation happens once per type (cached)
- Most types are monomorphic (direct mapping)
- Polymorphic types are rare (small translation table)
The var_translation cache ensures we don't repeatedly translate the same types.
Migration Path
Phase 1: Add Type Stack
- Add a `type_stack` alongside the existing `value_stack`
- Maintain both layouts and type vars temporarily
- Verify the type stack stays synchronized with the value stack
Phase 2: Runtime Type Store
- Create separate runtime type store
- Populate it during execution
- Add unification for polymorphic calls
Phase 3: Replace Layouts with Type Vars
- Change `StackValue` to use type vars instead of layouts
- Convert to layouts only when needed
- Remove `LayoutWitness` and other workarounds
Phase 4: Optimization
- Add layout cache
- Implement fast paths for monomorphic code
- Profile and optimize hot paths
Alternative Approaches Considered
1. Full Monomorphization
Idea: Generate specialized versions of polymorphic functions for each type.
Pros:
- Simple implementation
- Fast execution
Cons:
- Code bloat
- Slow compilation
- Not lazy (against our design goals)
2. Boxing Everything
Idea: Make all values pointers to heap objects with type tags.
Pros:
- Simple polymorphism (like OCaml)
- Type info always available
Cons:
- Performance hit from boxing
- Heap allocation pressure
- Against our "unboxed values" goal
3. Monomorphization Cache
Idea: Keep current design but cache (function, arg_types) → return_type.
Pros:
- Minimal changes to current code
- Could work for simple cases
Cons:
- Still doesn't handle complex type relationships
- Cache misses still fail
- Feels like another band-aid
Conclusion
The type-carrying interpreter design solves our polymorphism problems systematically:
- Values remain unboxed for performance
- Type relationships are preserved via type variables
- Cross-module polymorphism works through runtime unification
- Layouts are computed on-demand only when needed
- The design is principled rather than ad-hoc
This architecture mirrors how advanced functional language runtimes work internally (OCaml, Haskell) but adapted for our unboxed value requirement. While more complex than the current design, it's complex in a systematic way that actually solves the problem rather than working around it.
The key insight: Keep type information as long as possible, convert to concrete representations only when absolutely necessary.