mirror of https://github.com/roc-lang/roc.git synced 2025-12-23 08:48:03 +00:00

Richard Feldman 0320d798bd

Interpreter2: closures w/o captures; support calling closure values; add polymorphism tests using Zig multiline strings (Milestone 1).

2025-09-24 12:34:11 -04:00

19 KiB

Raw Blame History

Roc Interpreter Design: Type-Carrying Architecture

Design Goals
The Fundamental Challenge
Current Design Problems
Proposed Architecture: Type-Carrying Interpreter
Implementation Details
Examples and Walkthroughs
Performance Considerations
Migration Path

Design Goals

The Roc interpreter must satisfy several competing requirements:

Unboxed Values: For performance, we need compact runtime representations. An i16 should be 2 bytes on the stack, not a pointer to a heap-allocated object.
Runtime Polymorphism: Support polymorphic functions without compile-time monomorphization. The function id = \x -> x should have one runtime representation that works for all types.
Cross-Module Polymorphism: Polymorphic functions must work correctly across module boundaries, preserving type relationships.
Lazy Evaluation: The interpreter should do work lazily, deferring type specialization until necessary (unlike ahead-of-time monomorphization for optimized builds).
Zero-Cost for Monomorphic Code: Code that doesn't use polymorphism shouldn't pay any runtime penalty.

The Fundamental Challenge

The core tension in our design comes from wanting both:

Concrete, unboxed values for performance
Abstract type information for polymorphism

Consider this example:

# Module A
id : a -> a
id = \x -> x

# Module B
result = A.id(42)  # Should return 42 : i64

The challenge: How does the interpreter know that A.id should return an i64 when called with 42? The function itself is polymorphic - it works for any type. We need to track that "the return type equals the argument type" at runtime.

Current Design Problems

The current interpreter converts types to "layouts" (memory representations) too early:

// Current design
pub const Layout = union(enum) {
    i64_layout,
    f64_layout,
    record_layout: RecordData,
    // ... etc
};

pub const StackValue = struct {
    ptr: ?*anyopaque,
    layout: Layout,  // <-- Problem: too concrete!
};

Why Layouts Are Too Concrete

When we convert a type variable to a layout, we lose critical information:

# Type information:
# id : Var(1) -> Var(1)
# (same variable used twice = input type equals output type)

# After conversion to layouts:
# id : ? -> ?
# (lost the connection between input and output!)

Layouts tell us "how many bytes" and "what memory representation" but not "what type relationships exist". This is why cross-module polymorphic functions fail - we can't propagate type constraints through layouts.

The Band-Aid Approach

The current interpreter tries to work around this with ad-hoc solutions:

TypeScope for tracking some type variable mappings
LayoutWitness for carrying extra type information
Special-case handling for certain patterns

These band-aids don't compose well and fail for complex cross-module scenarios.

Proposed Architecture: Type-Carrying Interpreter

Core Concept: Parallel Stacks

Instead of each value carrying a layout, we maintain two parallel stacks:

pub const Interpreter = struct {
    // Stack 1: Raw, unboxed values (just bytes)
    value_stack: []u8,

    // Stack 2: Type information (as type variables)
    type_stack: []types.Var,

    // Runtime type store (separate from compile-time types)
    runtime_types: *types.Store,

    // ... other fields
};

Key Invariant: For every value on the value stack, there's a corresponding type variable at the same logical index on the type stack.

How Values and Types Stay Synchronized

// Pushing a value:
fn pushValue(self: *Interpreter, bytes: []const u8, type_var: types.Var) !void {
    const value_index = self.value_stack.len;
    const type_index = self.type_stack.len;

    // These indices must stay synchronized!
    assert(value_index == type_index);

    self.value_stack.appendSlice(bytes);
    self.type_stack.append(type_var);
}

// Getting a value's type:
fn getValueType(self: *Interpreter, value_index: usize) types.Var {
    return self.type_stack[value_index];
}

Type Variables Instead of Layouts

The crucial difference is that we keep type variables (which preserve relationships) instead of layouts (which don't):

// OLD: Lost polymorphic relationships
struct OldStackValue {
    ptr: ?*anyopaque,
    layout: Layout,  // "this is 8 bytes, interpreted as i64"
}

// NEW: Preserves polymorphic relationships
struct NewRuntimeValue {
    ptr: ?*anyopaque,
    type_var: types.Var,  // "this is type variable #42, which might be related to other type variables"
}

Implementation Details

The Runtime Type Store

Each interpreter instance has its own runtime type store, separate from the compile-time type store:

pub const RuntimeTypeContext = struct {
    // The runtime type store - separate from compile-time types
    types: *types.Store,

    // Stack of type scopes for nested polymorphic contexts
    scopes: []TypeScope,

    // Maps polymorphic type variables to their concrete instantiations
    instantiation_map: types.VarMap,

    // Cache of type variable -> layout conversions
    layout_cache: std.AutoHashMap(types.Var, Layout),
};

When We Convert Types to Layouts

We only convert from type variables to layouts when we actually need memory information:

fn getLayoutWhenNeeded(self: *Interpreter, type_var: types.Var) !Layout {
    // Check cache first
    if (self.layout_cache.get(type_var)) |cached_layout| {
        return cached_layout;
    }

    // Resolve the type variable
    const resolved = self.runtime_types.resolveVar(type_var);

    // Convert to layout
    const layout = try self.typeToLayout(resolved.desc.content);

    // Cache for next time
    try self.layout_cache.put(type_var, layout);

    return layout;
}

Operations that need layouts:

Memory allocation: How many bytes to reserve?
Field access: What offset for this field?
Value copying: How many bytes to copy?
FFI calls: C functions need concrete types

Operations that DON'T need layouts:

Function calls: Just pass type variables through
Local bindings: Just reference stack positions
Returns: Type variable goes back to caller
Pattern matching: Compare type variables directly

Polymorphic Function Calls

Here's how a polymorphic function call works in detail:

fn handlePolymorphicCall(self: *Interpreter,
                        func_type_var: types.Var,
                        arg_type_vars: []types.Var) !types.Var {
    // 1. Get the function's type
    const func_resolved = self.runtime_types.resolveVar(func_type_var);
    const func_type = func_resolved.desc.content.structure.func;

    // 2. Create fresh type variables for this instantiation
    const fresh_vars = try self.instantiatePolyVars(func_type);

    // 3. Unify argument types with parameter types
    for (arg_type_vars, func_type.param_vars) |arg_var, param_var| {
        try self.unifyTypes(param_var, arg_var);
    }

    // 4. The return type is now correctly constrained!
    return func_type.return_var;
}

Type Unification at Runtime

We reuse the compile-time unification algorithm, but on the runtime type store:

fn unifyTypes(self: *Interpreter, var1: types.Var, var2: types.Var) !void {
    // Use the same unification algorithm as compile-time
    const result = try unify.unifyWithContext(
        self.module_env,
        self.runtime_types,  // Runtime types, not compile-time!
        self.unify_scratch,
        var1,
        var2,
        false
    );

    if (!result.isOk()) {
        return error.RuntimeTypeError;
    }
}

Examples and Walkthroughs

Example 1: Simple Polymorphic Identity

id = \x -> x
result = id(42)

Step-by-step execution:

Define id:
- Type: Var(1) -> Var(1) (input and output share same type variable)
- No execution yet (lazy)
Call id(42):
- Push 42 onto value stack (as raw bytes: 0x0000002A)
- Push Var(50) onto type stack (where Var(50) = i64)
- Call id
Inside id:
- Parameter expects Var(1)
- Unify: Var(1) := Var(50) (now Var(1) means i64)
- Body just returns the parameter
- Return type is Var(1), which now resolves to i64
Return:
- Value stack: still has 0x0000002A
- Type stack: Var(1) (which resolves to i64)

Example 2: Cross-Module Polymorphism

# Module A
getFirst : {first: a, second: b} -> a
getFirst = \record -> record.first

# Module B
point = {first: 42, second: "hello"}
result = A.getFirst(point)  # Should return 42

Step-by-step execution:

Module B creates point:
- Value stack: [42, ptr_to_"hello"]
- Type stack: [Var(100), Var(101)] where:
  - Var(100) = i64
  - Var(101) = Str
- Record type: Var(102) = {first: Var(100), second: Var(101)}
Call A.getFirst(point):
- Push record onto stack
- Push Var(102) onto type stack
Inside A.getFirst:
- Parameter expects Var(200) = {first: Var(201), second: Var(202)}
- Unify Var(200) with Var(102):
  - Var(201) := Var(100) (first field is i64)
  - Var(202) := Var(101) (second field is Str)
- Return type is Var(201), which resolves to i64
Field access record.first:
- Only NOW do we need a layout (for field offset)
- Convert Var(102) to layout: {first: offset=0, second: offset=8}
- Read 8 bytes at offset 0
- Return with type Var(201) (resolves to i64)

Example 3: Higher-Order Functions

# Module A
apply : (a -> b), a -> b
apply = \f, x -> f(x)

# Module B
double = \n -> n * 2
result = A.apply(double, 21)  # Should return 42

Type flow:

double has type: Var(300) -> Var(301) with constraint Var(300) = Var(301) = Num *
A.apply expects: (Var(400) -> Var(401)), Var(400) -> Var(401)
Unify at call site:
- (Var(400) -> Var(401)) with (Var(300) -> Var(301))
- Var(400) with i64 (from literal 21)
Constraints propagate:
- Var(400) = Var(300) = i64
- Var(401) = Var(301) = i64
Return type Var(401) resolves to i64

Performance Considerations

Memory Overhead

The type-carrying design adds:

Type stack: One Var (4 bytes) per value on the stack
Runtime type store: Grows with program complexity
Layout cache: Bounded by number of unique types

For typical programs, this overhead is minimal compared to the values themselves.

Performance Optimizations

Type Erasure for Primitives:

// Don't create new type vars for known primitives
if (isLiteralI64(expr)) {
    return self.cached_i64_var;  // Reuse same var
}

Inline Monomorphic Calls:

// Fast path for non-polymorphic functions
if (!func_type.isPolymorphic()) {
    return self.callMonomorphic(func, args);  // Skip unification
}

Layout Cache:
- Cache type variable → layout conversions
- Most types repeat, so cache hit rate is high
Lazy Unification:
- Only unify when types actually interact
- Many type variables never need resolution

Comparison with Current Design

Aspect	Current Design	Type-Carrying Design
Value representation	Unboxed ✓	Unboxed ✓
Memory per value	ptr + layout	ptr + type var
Polymorphic calls	Often broken ✗	Always correct ✓
Cross-module	Fails ✗	Works ✓
Type resolution	At call boundaries	On-demand
Complexity	Ad-hoc workarounds	Systematic approach

Critical Design Issue: Cross-Module Type Variables

The Problem

Type variables are just indices into a type store. When Module B calls a function from Module A:

Module A's function has type Var(42) -> Var(42) in Module A's type store
Module B can't just use Var(42) - that refers to something different in Module B's store!

Solution: Single Runtime Type Store

The key insight is that the runtime has ONE unified type store, not per-module stores:

pub const Interpreter = struct {
    // ONE runtime type store for ALL modules
    runtime_types: *types.Store,

    // Maps (module_id, compile_time_var) -> runtime_var
    var_translation: std.AutoHashMap(VarKey, types.Var),
};

const VarKey = struct {
    module_id: ModuleId,
    compile_var: types.Var,
};

Type Variable Translation

When we encounter a type from any module, we translate it to the runtime store:

fn translateTypeVar(self: *Interpreter,
                   module: *ModuleEnv,
                   compile_var: types.Var) !types.Var {
    const key = VarKey{
        .module_id = module.id,
        .compile_var = compile_var
    };

    // Check if we've already translated this var
    if (self.var_translation.get(key)) |runtime_var| {
        return runtime_var;
    }

    // Copy the type structure from module's store to runtime store
    const compile_resolved = module.types.resolveVar(compile_var);
    const runtime_var = try self.copyTypeToRuntime(compile_resolved.desc.content);

    // Cache the translation
    try self.var_translation.put(key, runtime_var);

    return runtime_var;
}

Deep Copying Type Structure

When translating types, we must recursively translate all referenced type variables:

fn copyTypeToRuntime(self: *Interpreter,
                     module: *ModuleEnv,
                     content: types.Content) !types.Var {
    switch (content) {
        .structure => |s| switch (s) {
            .func => |func| {
                // Recursively translate parameter and return types
                const runtime_params = try self.translateVarList(module, func.params);
                const runtime_return = try self.translateTypeVar(module, func.return_var);

                // Create new function type in runtime store
                return self.runtime_types.mkFunc(runtime_params, runtime_return);
            },
            .record => |record| {
                // Translate all field types
                var runtime_fields = ArrayList(RecordField).init(self.allocator);
                for (record.fields) |field| {
                    const runtime_field_type = try self.translateTypeVar(module, field.type_var);
                    try runtime_fields.append(.{
                        .name = field.name,
                        .type_var = runtime_field_type,
                    });
                }
                return self.runtime_types.mkRecord(runtime_fields.items);
            },
            // ... handle other type structures
        },
        .flex_var => {
            // Create a fresh type variable in runtime store
            return self.runtime_types.fresh();
        },
        // ... handle other content types
    }
}

Example: Cross-Module Function Call

# Module A (compile-time type store)
id : Var(42) -> Var(42)  # in Module A's store

# Module B calling A.id(100)

What happens:

Module B prepares to call A.id:

// Translate A.id's type to runtime store
const a_id_type = module_a.getFunctionType("id");  // Var(42) -> Var(42)
const runtime_type = translateTypeVar(module_a, a_id_type);
// Returns fresh Var(1000) -> Var(1000) in runtime store

Push argument with its runtime type:

value_stack.push(100);  // The actual value
type_stack.push(Var(2000));  // Fresh var in runtime store for i64

Unification happens in runtime store:

// Unify Var(1000) (parameter) with Var(2000) (argument)
runtime_types.unify(Var(1000), Var(2000));
// Now Var(1000) = i64 in runtime store

Return type is correctly resolved:

// Return type is Var(1000), which now resolves to i64

Alternative Design: Module-Local Translation

Another approach would be to keep per-module type stores but maintain translation tables:

pub const InterModuleTypeMap = struct {
    // Maps (from_module, from_var, to_module) -> to_var
    translations: HashMap(TranslationKey, types.Var),

    fn translateVar(self: *@This(),
                   from: *ModuleEnv,
                   to: *ModuleEnv,
                   var: types.Var) !types.Var {
        // ... translation logic
    }
};

Why we prefer the single runtime store:

Simpler conceptually (one source of truth)
No need to translate between every module pair
Unification naturally works across all modules
Easier to debug (can dump one store to see all types)

Performance Implications

The translation overhead is minimal:

Translation happens once per type (cached)
Most types are monomorphic (direct mapping)
Polymorphic types are rare (small translation table)

The var_translation cache ensures we don't repeatedly translate the same types.

Migration Path

Phase 1: Add Type Stack

Add type_stack alongside existing value_stack
Maintain both layouts and type vars temporarily
Verify type stack stays synchronized with value stack

Phase 2: Runtime Type Store

Create separate runtime type store
Populate it during execution
Add unification for polymorphic calls

Phase 3: Replace Layouts with Type Vars

Change StackValue to use type vars instead of layouts
Convert to layouts only when needed
Remove LayoutWitness and other workarounds

Phase 4: Optimization

Add layout cache
Implement fast paths for monomorphic code
Profile and optimize hot paths

Alternative Approaches Considered

1. Full Monomorphization

Idea: Generate specialized versions of polymorphic functions for each type.

Pros:

Simple implementation
Fast execution

Cons:

Code bloat
Slow compilation
Not lazy (against our design goals)

2. Boxing Everything

Idea: Make all values pointers to heap objects with type tags.

Pros:

Simple polymorphism (like OCaml)
Type info always available

Cons:

Performance hit from boxing
Heap allocation pressure
Against our "unboxed values" goal

3. Monomorphization Cache

Idea: Keep current design but cache (function, arg_types) → return_type.

Pros:

Minimal changes to current code
Could work for simple cases

Cons:

Still doesn't handle complex type relationships
Cache misses still fail
Feels like another band-aid

Conclusion

The type-carrying interpreter design solves our polymorphism problems systematically:

Values remain unboxed for performance
Type relationships are preserved via type variables
Cross-module polymorphism works through runtime unification
Layouts are computed on-demand only when needed
The design is principled rather than ad-hoc

This architecture mirrors how advanced functional language runtimes work internally (OCaml, Haskell) but adapted for our unboxed value requirement. While more complex than the current design, it's complex in a systematic way that actually solves the problem rather than working around it.

The key insight: Keep type information as long as possible, convert to concrete representations only when absolutely necessary.

19 KiB Raw Blame History

Roc Interpreter Design: Type-Carrying Architecture

Table of Contents

Design Goals

The Fundamental Challenge

Current Design Problems

Why Layouts Are Too Concrete

The Band-Aid Approach

Proposed Architecture: Type-Carrying Interpreter

Core Concept: Parallel Stacks

How Values and Types Stay Synchronized

Type Variables Instead of Layouts

Implementation Details

The Runtime Type Store

When We Convert Types to Layouts

Polymorphic Function Calls

Type Unification at Runtime

Examples and Walkthroughs

Example 1: Simple Polymorphic Identity

Example 2: Cross-Module Polymorphism

Example 3: Higher-Order Functions

Performance Considerations

Memory Overhead

Performance Optimizations

Comparison with Current Design

Critical Design Issue: Cross-Module Type Variables

The Problem

Solution: Single Runtime Type Store

Type Variable Translation

Deep Copying Type Structure

Example: Cross-Module Function Call

Alternative Design: Module-Local Translation

Performance Implications

Migration Path

Phase 1: Add Type Stack

Phase 2: Runtime Type Store

Phase 3: Replace Layouts with Type Vars

Phase 4: Optimization

Alternative Approaches Considered

1. Full Monomorphization

2. Boxing Everything

3. Monomorphization Cache

Conclusion

19 KiB

Raw Blame History