Merge pull request #8508 from roc-lang/more-fixxes

Fix Iterator Invalidation in Type Instantiation
This commit is contained in:
Luke Boswell 2025-12-01 17:32:52 +11:00 committed by GitHub
commit 4a48ec9718
GPG key ID: B5690EEEBB952194
18 changed files with 3092 additions and 328 deletions


@ -522,6 +522,50 @@ Another approach, manual memory management, would allow you to produce the faste
Reference counting implementation:
- Old compiler: [Mono folder](crates/compiler/mono/src) (search ref)
## Borrow
A **borrowing** function reads its argument without affecting its reference count.
The caller retains ownership and can continue using the value after the call.
Example builtins that borrow: `strEqual`, `listLen`, `strContains`
See [src/builtins/OWNERSHIP.md](src/builtins/OWNERSHIP.md) for detailed ownership semantics.
## Consume
A **consuming** function takes ownership of its argument. The caller transfers
ownership to the callee and must not use the argument after the call. The function
is responsible for cleanup (decref when done).
Example builtins that consume: `strConcat`, `listConcat`, `strJoinWith`
See [src/builtins/OWNERSHIP.md](src/builtins/OWNERSHIP.md) for detailed ownership semantics.
## Copy-on-Write
A variant of consuming where the function may return the same allocation if the
input is unique (reference count == 1). If the input is shared, the function
decrefs the original and allocates a new copy.
Example builtins: `strWithAsciiUppercased`, `strTrim`, `listAppend`
## Seamless Slice
A memory optimization where the result shares underlying data with the input
via a slice that holds a reference to the original allocation.
There are two variants:
1. **Borrowing seamless slice**: The builtin borrows the input and calls `incref`
to share the allocation. The interpreter should decref the input after the call.
Example: `strToUtf8`
2. **Consuming seamless slice**: The builtin consumes the input and the slice
inherits the reference (no incref). The interpreter should NOT decref.
Example: `strTrim` (when it creates an offset slice)
See [src/builtins/OWNERSHIP.md](src/builtins/OWNERSHIP.md) for detailed ownership semantics.
## Mutate in place
TODO

src/builtins/OWNERSHIP.md (new file, 225 lines)

@ -0,0 +1,225 @@
# Ownership Semantics in Roc Builtins
This document defines the canonical terminology for ownership semantics in Roc's builtin functions.
Understanding these patterns is critical for correctly implementing and calling builtins.
## Core Invariant
**refcount = number of live references to the data**
When refcount reaches 0, memory is freed.
Basic operations:
1. **Create**: allocate with refcount = 1
2. **Share**: increment refcount (data is shared, both references valid)
3. **Release**: decrement refcount (if 0, free memory)
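The three operations above can be traced in a toy model (a Python sketch for illustration only; `Allocation` and its fields are stand-ins, not the actual Roc representation):

```python
# Toy model of the core invariant: refcount == number of live references.
# `Allocation` is a stand-in for a heap cell, not the Roc runtime's layout.

class Allocation:
    def __init__(self, data):
        self.refcount = 1        # Create: allocate with refcount = 1
        self.data = data
        self.freed = False

    def incref(self):            # Share: both references remain valid
        self.refcount += 1

    def decref(self):            # Release: free when the count reaches 0
        self.refcount -= 1
        if self.refcount == 0:
            self.freed = True    # stands in for returning memory to the allocator

a = Allocation("hello")
a.incref()                       # two live references
a.decref()                       # one reference remains; data still alive
assert not a.freed
a.decref()                       # last reference released; memory freed
assert a.freed
```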
---
## Argument Handling (2 patterns)
These describe how a function treats its input arguments.
### Borrow
A **borrowing** function reads its arguments without affecting their refcount.
- Caller retains ownership
- No refcount change at call boundary
- Caller can still use argument after call
**Examples**: `strEqual`, `listLen`, `strContains`, `countUtf8Bytes`
### Consume
A **consuming** function takes ownership of its argument.
- Caller transfers ownership to callee
- Caller must not use argument after call (logically moved)
- Function is responsible for cleanup (decref when done)
**Examples**: `strConcat`, `listConcat`, `strJoinWith`
---
## Result Patterns (3 types)
These describe the relationship between a function's result and its arguments.
### Independent
Result is a new allocation, unrelated to arguments.
- Normal ownership: caller owns result
- Caller must decref when done
**Example**: `strConcat` returns newly allocated combined string
### Copy-on-Write (Same-if-unique)
Result may be the same allocation as an argument.
- Consumes the input argument
- If `isUnique()`: mutates in place, returns same pointer
- If shared: decrefs argument internally, allocates new, returns new pointer
- **Critical**: `result.ptr` may equal `arg.ptr`
**Examples**: `strWithAsciiUppercased`, `strTrim`, `listAppend`
**Interpreter handling**: Don't decref the argument in either case:
- If `result.bytes == arg.bytes`: ownership passed through to the result
- If different: the builtin already decref'd internally
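The same-if-unique behavior can be sketched as follows (a Python model of the pattern; `Allocation` and `with_ascii_uppercased` here are illustrative stand-ins, not the real implementations):

```python
# Model of a copy-on-write builtin: consumes its argument, returns the same
# allocation when unique, otherwise decrefs it and allocates a fresh copy.

class Allocation:
    def __init__(self, data):
        self.refcount = 1
        self.data = data

def with_ascii_uppercased(s):
    # Consumes `s`: the caller must not use or decref it after this call.
    if s.refcount == 1:                    # unique: mutate in place
        s.data = s.data.upper()
        return s                           # result.ptr == arg.ptr
    s.refcount -= 1                        # shared: release our reference...
    return Allocation(s.data.upper())      # ...and return a new allocation

unique = Allocation("abc")
result = with_ascii_uppercased(unique)
assert result is unique                    # same allocation passed through

shared = Allocation("abc")
shared.refcount = 2                        # simulate a second live reference
result2 = with_ascii_uppercased(shared)
assert result2 is not shared               # new allocation
assert shared.refcount == 1                # builtin already decref'd for us
```

In both branches the caller does nothing after the call, which is exactly the interpreter rule for consumed arguments.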
### Seamless Slice
Result shares underlying data with argument via seamless slice.
- Borrows argument (caller keeps ownership)
- Builtin calls `incref` internally to share the allocation
- Result points into argument's memory
- `SEAMLESS_SLICE_BIT` marks the slice in length field
**Examples**: `strToUtf8`, `substringUnsafe`, `listSublist`
**Interpreter handling**: Decref the argument after call (builtin only incref'd for sharing;
the original binding's reference must still be released)
---
## Complete Taxonomy
The key insight is that seamless slices can be created by either borrowing OR consuming functions:
| Pattern | Arg Handling | Result Type | Interpreter After Call |
|---------|--------------|-------------|------------------------|
| Pure borrow | Borrow | Independent | Decref arg |
| Borrowing seamless slice | Borrow | Slice (incref'd) | Decref arg |
| Pure consume | Consume | Independent | **Don't decref** |
| Copy-on-write | Consume | Same-if-unique | **Don't decref** |
| Consuming seamless slice | Consume | Slice (inherited) | **Don't decref** |
**Simple rule**: Decref if and only if the builtin **borrows**. Never decref for **consume**.
### Why This Matters
For **borrowing** seamless slice (e.g., `strToUtf8`):
- Builtin calls `incref` to share the allocation
- Caller still has their reference
- Interpreter must decref (release the borrowed copy)
For **consuming** seamless slice (e.g., `strTrim` with offset):
- Builtin does NOT call `incref`
- Slice inherits the caller's reference
- Interpreter must NOT decref (ownership transferred)
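The refcount arithmetic for the two variants can be traced explicitly (a sketch; `rc` stands for the refcount of the shared allocation, starting at 1 for the caller's binding):

```python
def borrowing_seamless_slice(rc):
    # e.g. strToUtf8: the builtin increfs, then the interpreter decrefs the arg
    rc += 1      # incref inside the builtin: the slice shares the allocation
    rc -= 1      # interpreter releases the original binding's reference
    return rc

def consuming_seamless_slice(rc):
    # e.g. strTrim's offset slice: the slice inherits the caller's reference,
    # so there is no incref in the builtin and no decref in the interpreter
    return rc

# Starting from one live reference (the caller's binding), both variants end
# with exactly one reference, now owned by the slice.
assert borrowing_seamless_slice(1) == 1
assert consuming_seamless_slice(1) == 1
```

Pairing the wrong variant with the wrong interpreter action would leave the count at 0 (use-after-free) or 2 (leak).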
### Copy-on-Write Detail
For copy-on-write builtins like `strWithAsciiUppercased`:
- If input is **unique**: mutates in place, returns same pointer
- If input is **shared**: builtin decrefs internally, allocates new
In BOTH cases, the interpreter should NOT decref:
- Unique case: ownership passed through to result
- Shared case: builtin already handled the decref
The previous heuristic (`result.bytes == arg.bytes`) was incomplete: it missed
the shared case where the builtin decrefs internally.
---
## Interpreter Contract
The interpreter uses ownership metadata per builtin:
| Argument Type | Interpreter Action After Call |
|---------------|-------------------------------|
| **Borrow** | Decref argument (release our copy) |
| **Consume** | Don't decref (ownership transferred to builtin) |
This requires each low-level op to declare whether each argument is borrowed or consumed.
See `src/check/lower_ops.zig` for the ownership metadata.
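A minimal sketch of that contract (a Python model; the op names and metadata table here are illustrative, and the real metadata is the `getArgOwnership` switch in Zig):

```python
# Per-op ownership metadata, mirroring the shape of getArgOwnership.
ARG_OWNERSHIP = {
    "str_contains": ("borrow", "borrow"),
    "str_concat":   ("consume", "borrow"),
    "str_trim":     ("consume",),
}

def args_to_decref_after_call(op, args):
    """Return the arguments the interpreter must decref after calling `op`.

    Rule: decref if and only if the builtin borrows the argument. For
    consumed arguments, ownership moved into the builtin: never decref.
    """
    return [arg for ownership, arg in zip(ARG_OWNERSHIP[op], args)
            if ownership == "borrow"]

assert args_to_decref_after_call("str_contains", ["s", "n"]) == ["s", "n"]
assert args_to_decref_after_call("str_concat", ["a", "b"]) == ["b"]
assert args_to_decref_after_call("str_trim", ["s"]) == []
```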
---
## Standard Terminology
| Term | Definition |
|------|------------|
| **Borrow** | Function reads argument without affecting refcount. Caller retains ownership. |
| **Consume** | Function takes ownership of argument. Caller loses access. Function handles cleanup. |
| **Copy-on-Write** | Consume variant: if unique, returns same allocation; if shared, decrefs and allocates new. |
| **Seamless Slice** | Result shares underlying data with argument. Borrowing variant increfs internally; consuming variant inherits the caller's reference. |
| **Own** | The entity responsible for eventually calling decref. |
| **Unique** | Refcount == 1. Safe to mutate in place. |
---
## Function Documentation Format
Every builtin should document its ownership semantics:
```zig
/// Brief description of what the function does.
///
/// ## Ownership
/// - `arg1`: **consumes** - caller loses ownership
/// - `arg2`: **borrows** - caller retains ownership
/// - Returns: **independent** / **copy-on-write** / **seamless-slice**
///
/// ## Notes
/// Additional implementation details relevant to callers.
pub fn exampleFunction(...) ReturnType { ... }
```
---
## Function Categories
### str.zig
**Borrow args, Independent result:**
- `strEqual` - borrows both → Bool
- `strContains` - borrows both → Bool
- `strStartsWith` / `strEndsWith` - borrows both → Bool
- `strNumberOfBytes` / `countUtf8Bytes` - borrows → U64
**Consume arg, Copy-on-Write result:**
- `strWithAsciiUppercased` - consumes → Str (same-if-unique)
- `strWithAsciiLowercased` - consumes → Str (same-if-unique)
**Consume arg, Seamless-slice OR Copy-on-Write result:**
- `strTrim` / `strTrimStart` / `strTrimEnd` - consumes → Str
- If unique with no offset needed: shrinks in place (copy-on-write)
- Otherwise: creates consuming seamless slice (inherits reference)
**Consume args, Independent result:**
- `strConcat` - consumes first, borrows second → new Str
- `strJoinWith` - consumes list, borrows separator → new Str
**Borrow arg, Seamless-slice result (incref'd):**
- `strToUtf8` - borrows → List (seamless slice, calls incref)
- `strDropPrefix` / `strDropSuffix` - borrows → Str (seamless slice or incref'd original)
**Borrow arg, Seamless-slice result (no incref - caller must handle):**
- `substringUnsafe` - borrows → Str (seamless slice, NO incref!)
- **WARNING**: Caller is responsible for refcount management
### list.zig
**Borrow args, Independent result:**
- `listLen` - borrows → U64
- `listIsEmpty` - borrows → Bool
- `listGetUnsafe` - borrows → pointer (no ownership transfer)
**Consume args, Copy-on-Write result:**
- `listAppend` - consumes list, borrows element → List (same-if-unique)
- `listPrepend` - consumes list, borrows element → List (same-if-unique)
**Consume args, Independent result:**
- `listConcat` - consumes both → new List
- `listMap` / `listKeepIf` / `listDropIf` - consumes → new List
**Consume arg, Seamless-slice OR Copy-on-Write result:**
- `listSublist` - consumes → List
- If unique at start: shrinks in place (copy-on-write)
- Otherwise: creates consuming seamless slice (inherits reference)
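The branching documented for `listSublist` can be condensed into a decision function (a model of the cases above, not the Zig implementation; the argument names are illustrative):

```python
def sublist_strategy(unique, start, length, total_len):
    # Mirrors the listSublist cases documented above.
    if length == 0 or start >= total_len:
        # empty result: shrink to 0 if unique, else decref and return empty
        return "empty (in place)" if unique else "empty (decref original)"
    if start == 0 and unique:
        return "shrink in place (copy-on-write)"
    return "consuming seamless slice (inherits reference)"

assert sublist_strategy(True, 0, 3, 10) == "shrink in place (copy-on-write)"
assert sublist_strategy(False, 2, 3, 10) == "consuming seamless slice (inherits reference)"
assert sublist_strategy(False, 0, 0, 10) == "empty (decref original)"
```

All three outcomes consume the input, so the interpreter never decrefs the argument regardless of which branch was taken.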


@ -2,6 +2,18 @@
//!
//! Lists use copy-on-write semantics to minimize allocations when shared across contexts.
//! Seamless slice optimization reduces memory overhead for substring operations.
//!
//! ## Ownership Semantics
//!
//! See `OWNERSHIP.md` for the canonical terminology. Functions in this module
//! follow these patterns:
//!
//! - **Borrow**: Function reads argument, caller retains ownership
//! - **Consume**: Function takes ownership, caller loses access
//! - **Copy-on-Write**: Consumes arg; if unique, mutates in place; if shared, allocates new
//! - **Seamless Slice**: Result shares data with arg via incref'd slice
//!
//! Each function documents its ownership semantics in its doc comment.
const std = @import("std");
const utils = @import("utils.zig");
@ -550,7 +562,15 @@ pub fn listAppendUnsafe(
return output;
}
/// Add element to end of list. Will reserve additional space or reallocate if necessary beforehand.
/// List.append - adds an element to the end of a list.
///
/// ## Ownership
/// - `list`: **consumes** - caller loses ownership
/// - `element`: **borrows** - copied into list, caller retains original
/// - Returns: **copy-on-write** - may be same allocation if unique with capacity
///
/// Reserves capacity if needed, then appends element. If the list is unique
/// with sufficient capacity, modifies in place and returns same pointer.
pub fn listAppend(
list: RocList,
alignment: u32,
@ -686,7 +706,16 @@ pub fn shallowClone(
return new_list;
}
/// Add element to beginning of list, shifting existing elements.
/// List.prepend - adds an element to the beginning of a list.
///
/// ## Ownership
/// - `list`: **consumes** - caller loses ownership
/// - `element`: **borrows** - copied into list, caller retains original
/// - Returns: **copy-on-write** - may be same allocation if unique with capacity
///
/// Reserves capacity if needed, shifts existing elements, then inserts element
/// at the front. If the list is unique with sufficient capacity, modifies in
/// place and returns same pointer.
pub fn listPrepend(
list: RocList,
alignment: u32,
@ -782,7 +811,20 @@ pub fn listSwap(
return newList;
}
/// Returns a sublist of the given list
/// List.sublist - returns a sublist of the given list.
///
/// ## Ownership
/// - `list`: **consumes** - caller loses ownership
/// - Returns: **copy-on-write** or **seamless-slice** depending on input
///
/// If list is empty, or sublist range is empty/out-of-bounds:
/// - If unique: shrinks length to 0, returns same allocation
/// - Otherwise: decrefs original, returns empty list
///
/// If sublist starts at index 0 and list is unique:
/// - Shrinks length in place, returns same allocation
///
/// Otherwise: creates a seamless slice pointing into the original allocation.
pub fn listSublist(
list: RocList,
alignment: u32,
@ -1086,12 +1128,16 @@ fn swapElements(
return swap(element_width, element_at_i, element_at_j, copy);
}
/// Concatenates two lists into a new list containing all elements from both lists.
/// List.concat - concatenates two lists into one.
///
/// ## Ownership and Memory Management
/// **IMPORTANT**: This function CONSUMES both input lists (`list_a` and `list_b`).
/// The caller must NOT call `decref` on either input list after calling this function,
/// as this function handles their cleanup internally.
/// ## Ownership
/// - `list_a`: **consumes** - caller loses ownership
/// - `list_b`: **consumes** - caller loses ownership
/// - Returns: **independent** or **copy-on-write** - new allocation or extended list_a
///
/// This function handles cleanup of both input lists internally.
/// If list_a has capacity, may extend it and return (copy-on-write).
/// Otherwise allocates new list containing elements from both.
pub fn listConcat(
list_a: RocList,
list_b: RocList,


@ -4,6 +4,18 @@
//! operations for string manipulation, Unicode handling, formatting, and
//! memory management. It defines the RocStr structure and associated functions
//! that are called from compiled Roc code to handle string operations efficiently.
//!
//! ## Ownership Semantics
//!
//! See `OWNERSHIP.md` for the canonical terminology. Functions in this module
//! follow these patterns:
//!
//! - **Borrow**: Function reads argument, caller retains ownership
//! - **Consume**: Function takes ownership, caller loses access
//! - **Copy-on-Write**: Consumes arg; if unique, mutates in place; if shared, allocates new
//! - **Seamless Slice**: Result shares data with arg via incref'd slice
//!
//! Each function documents its ownership semantics in its doc comment.
const std = @import("std");
const RocList = @import("list.zig").RocList;
@ -18,6 +30,17 @@ const unicode = std.unicode;
const testing = std.testing;
const rcNone = @import("utils.zig").rcNone;
/// Decref function for RocStr elements in a list.
/// Used when decref-ing a List Str - each string element needs to be decreffed.
/// The context parameter is expected to be a *RocOps.
fn strDecref(context: ?*anyopaque, element: ?[*]u8) callconv(.c) void {
if (element) |elem_ptr| {
const str_ptr: *RocStr = @ptrCast(@alignCast(elem_ptr));
const roc_ops: *RocOps = @ptrCast(@alignCast(context.?));
str_ptr.decref(roc_ops);
}
}
const InPlace = enum(u8) {
InPlace,
Clone,
@ -667,7 +690,20 @@ pub fn getCapacity(string: RocStr) callconv(.c) usize {
return string.getCapacity();
}
/// TODO: Document substringUnsafeC.
/// Str.substring - extracts a substring without bounds checking.
///
/// ## Ownership
/// - `string`: **borrows** - caller retains ownership
/// - Returns: **seamless-slice** - shares data with input string
///
/// **IMPORTANT**: This function does NOT call incref. The returned seamless
/// slice shares the input's allocation, but the caller is responsible for
/// ensuring the refcount is correct. This is typically used internally where
/// the caller handles refcount management.
///
/// For small strings: creates a new small string (copy).
/// For heap strings at start=0 with unique refcount: shrinks in place.
/// Otherwise: creates a seamless slice pointing into the original string.
pub fn substringUnsafeC(
string: RocStr,
start_u64: u64,
@ -680,7 +716,7 @@ pub fn substringUnsafeC(
return substringUnsafe(string, start, length, roc_ops);
}
/// TODO
/// See substringUnsafeC for ownership documentation.
pub fn substringUnsafe(
string: RocStr,
start: usize,
@ -747,7 +783,15 @@ pub fn startsWith(string: RocStr, prefix: RocStr) callconv(.c) bool {
return true;
}
/// Str.drop_prefix - Returns string with prefix removed, or original if no match
/// Str.drop_prefix - Returns string with prefix removed, or original if no match.
///
/// ## Ownership
/// - `string`: **borrows** - caller retains ownership
/// - `prefix`: **borrows** - caller retains ownership
/// - Returns: **seamless-slice** - shares data with input string (incref'd)
///
/// If prefix doesn't match, returns the original string with refcount incremented.
/// If prefix matches, returns a seamless slice of the remaining portion.
pub fn strDropPrefix(
string: RocStr,
prefix: RocStr,
@ -765,7 +809,15 @@ pub fn strDropPrefix(
return substringUnsafe(string, prefix_len, new_len, roc_ops);
}
/// Str.drop_suffix - Returns string with suffix removed, or original if no match
/// Str.drop_suffix - Returns string with suffix removed, or original if no match.
///
/// ## Ownership
/// - `string`: **borrows** - caller retains ownership
/// - `suffix`: **borrows** - caller retains ownership
/// - Returns: **seamless-slice** - shares data with input string (incref'd)
///
/// If suffix doesn't match, returns the original string with refcount incremented.
/// If suffix matches, returns a seamless slice of the remaining portion.
pub fn strDropSuffix(
string: RocStr,
suffix: RocStr,
@ -829,7 +881,15 @@ pub fn endsWith(string: RocStr, suffix: RocStr) callconv(.c) bool {
return true;
}
/// Str.concat
/// Str.concat - concatenates two strings.
///
/// ## Ownership
/// - `arg1`: **consumes** - may be reallocated if capacity insufficient
/// - `arg2`: **borrows** - caller retains ownership (not decrefd here)
/// - Returns: **independent** or **copy-on-write** depending on arg1's capacity
///
/// Note: arg1 is owned and may be returned directly if arg2 is empty,
/// or reallocated to accommodate the combined content.
pub fn strConcatC(
arg1: RocStr,
arg2: RocStr,
@ -838,7 +898,7 @@ pub fn strConcatC(
return @call(.always_inline, strConcat, .{ arg1, arg2, roc_ops });
}
/// TODO
/// See strConcatC for ownership documentation.
pub fn strConcat(
arg1: RocStr,
arg2: RocStr,
@ -871,7 +931,12 @@ pub const RocListStr = extern struct {
list_capacity_or_alloc_ptr: usize,
};
/// Str.joinWith
/// Str.joinWith - joins a list of strings with a separator.
///
/// ## Ownership
/// - `list`: **consumes** - elements are borrowed, list is consumed
/// - `separator`: **borrows** - caller retains ownership
/// - Returns: **independent** - new allocation containing joined result
pub fn strJoinWithC(
list: RocList,
separator: RocStr,
@ -883,10 +948,16 @@ pub fn strJoinWithC(
.list_capacity_or_alloc_ptr = list.capacity_or_alloc_ptr,
};
return @call(.always_inline, strJoinWith, .{ roc_list_str, separator, roc_ops });
const result = @call(.always_inline, strJoinWith, .{ roc_list_str, separator, roc_ops });
// Decref the consumed list. Since elements are strings (refcounted), we pass
// elements_refcounted=true and provide strDecref to decref each element.
list.decref(@alignOf(RocStr), @sizeOf(RocStr), true, @ptrCast(roc_ops), &strDecref, roc_ops);
return result;
}
/// TODO
/// See strJoinWithC for ownership documentation.
pub fn strJoinWith(
list: RocListStr,
separator: RocStr,
@ -928,7 +999,18 @@ pub fn strJoinWith(
}
}
/// Str.toUtf8
/// Str.toUtf8 - converts a string to a list of UTF-8 bytes.
///
/// ## Ownership
/// - `arg`: **borrows** - caller retains ownership
/// - Returns: **seamless-slice** - shares underlying data with input string
///
/// For heap strings, the returned list shares the same underlying allocation
/// as the input string. This function calls `incref` on the allocation to
/// account for the new reference. Small strings are copied to a new allocation.
///
/// The caller must decref the argument after call (we borrowed it but added
/// a reference to its data via the returned list).
pub fn strToUtf8C(
arg: RocStr,
roc_ops: *RocOps,
@ -950,6 +1032,9 @@ inline fn strToBytes(
return RocList{ .length = length, .bytes = ptr, .capacity_or_alloc_ptr = length };
} else {
// The returned list shares the same underlying allocation as the string.
// We must incref the allocation since there's now an additional reference to it.
arg.incref(1);
const is_seamless_slice = arg.length & SEAMLESS_SLICE_BIT;
return RocList{ .length = length, .bytes = arg.bytes, .capacity_or_alloc_ptr = arg.capacity_or_alloc_ptr | is_seamless_slice };
}
@ -1050,6 +1135,8 @@ pub fn fromUtf8Lossy(
roc_ops: *RocOps,
) callconv(.c) RocStr {
if (list.len() == 0) {
// Free the empty list since we consume ownership
list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, roc_ops);
return RocStr.empty();
}
@ -1070,6 +1157,10 @@ pub fn fromUtf8Lossy(
end_index += utf8EncodeLossy(c, ptr[end_index..]);
}
str.setLen(end_index);
// Free the input list since we consume ownership
list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, roc_ops);
return str;
}
@ -1283,7 +1374,17 @@ pub fn isWhitespace(codepoint: u21) bool {
};
}
/// TODO: Document strTrim.
/// Str.trim - removes leading and trailing whitespace.
///
/// ## Ownership
/// - `input_string`: **consumes** - caller loses ownership
/// - Returns: **copy-on-write** or **seamless-slice** depending on input
///
/// Behavior depends on input state:
/// - Empty string: returns empty (decrefs input if heap-allocated)
/// - Small string: creates new small string with trimmed bytes
/// - Unique with no leading whitespace: shrinks in place (same allocation)
/// - Otherwise: creates seamless slice pointing to trimmed region
pub fn strTrim(
input_string: RocStr,
roc_ops: *RocOps,
@ -1336,7 +1437,17 @@ pub fn strTrim(
}
}
/// TODO: Document strTrimStart.
/// Str.trim_start - removes leading whitespace.
///
/// ## Ownership
/// - `input_string`: **consumes** - caller loses ownership
/// - Returns: **copy-on-write** or **seamless-slice** depending on input
///
/// Behavior depends on input state:
/// - Empty string: returns empty (decrefs input if heap-allocated)
/// - Small string: creates new small string with trimmed bytes
/// - Unique with no leading whitespace: returns same allocation unchanged
/// - Otherwise: creates seamless slice pointing to trimmed region
pub fn strTrimStart(
input_string: RocStr,
roc_ops: *RocOps,
@ -1388,7 +1499,17 @@ pub fn strTrimStart(
}
}
/// TODO: Document strTrimEnd.
/// Str.trim_end - removes trailing whitespace.
///
/// ## Ownership
/// - `input_string`: **consumes** - caller loses ownership
/// - Returns: **copy-on-write** - may be same allocation if unique
///
/// Behavior depends on input state:
/// - Empty string: returns empty (decrefs input if heap-allocated)
/// - Small string: creates new small string with trimmed bytes
/// - Unique: shrinks length in place (same allocation)
/// - Shared: creates seamless slice pointing to trimmed region
pub fn strTrimEnd(
input_string: RocStr,
roc_ops: *RocOps,
@ -1473,6 +1594,15 @@ fn countTrailingWhitespaceBytes(string: RocStr) usize {
}
/// Str.with_ascii_lowercased
///
/// Returns a string with all ASCII letters converted to lowercase.
///
/// ## Ownership
/// - `string`: **consumes** - caller loses ownership
/// - Returns: **copy-on-write** - may be same allocation if input was unique
///
/// If the input string is unique, modifies in place and returns it.
/// If shared, decrefs the input and allocates a new string.
pub fn strWithAsciiLowercased(
string: RocStr,
roc_ops: *RocOps,
@ -1492,6 +1622,15 @@ pub fn strWithAsciiLowercased(
}
/// Str.with_ascii_uppercased
///
/// Returns a string with all ASCII letters converted to uppercase.
///
/// ## Ownership
/// - `string`: **consumes** - caller loses ownership
/// - Returns: **copy-on-write** - may be same allocation if input was unique
///
/// If the input string is unique, modifies in place and returns it.
/// If shared, decrefs the input and allocates a new string.
pub fn strWithAsciiUppercased(
string: RocStr,
roc_ops: *RocOps,
@ -2550,8 +2689,8 @@ test "fromUtf8Lossy: ascii, emoji" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var list = RocList.fromSlice(u8, "r💖c", false, test_env.getOps());
defer list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, test_env.getOps());
const list = RocList.fromSlice(u8, "r💖c", false, test_env.getOps());
// fromUtf8Lossy consumes ownership of the list - no manual decref needed
const res = fromUtf8Lossy(list, test_env.getOps());
defer res.decref(test_env.getOps());
@ -2761,8 +2900,8 @@ test "fromUtf8Lossy: invalid start byte" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var list = RocList.fromSlice(u8, "r\x80c", false, test_env.getOps());
defer list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, test_env.getOps());
const list = RocList.fromSlice(u8, "r\x80c", false, test_env.getOps());
// fromUtf8Lossy consumes ownership of the list - no manual decref needed
const res = fromUtf8Lossy(list, test_env.getOps());
defer res.decref(test_env.getOps());
@ -2775,8 +2914,8 @@ test "fromUtf8Lossy: overlong encoding" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var list = RocList.fromSlice(u8, "r\xF0\x9F\x92\x96\x80c", false, test_env.getOps());
defer list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, test_env.getOps());
const list = RocList.fromSlice(u8, "r\xF0\x9F\x92\x96\x80c", false, test_env.getOps());
// fromUtf8Lossy consumes ownership of the list - no manual decref needed
const res = fromUtf8Lossy(list, test_env.getOps());
defer res.decref(test_env.getOps());
@ -2789,8 +2928,8 @@ test "fromUtf8Lossy: expected continuation" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var list = RocList.fromSlice(u8, "r\xCFc", false, test_env.getOps());
defer list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, test_env.getOps());
const list = RocList.fromSlice(u8, "r\xCFc", false, test_env.getOps());
// fromUtf8Lossy consumes ownership of the list - no manual decref needed
const res = fromUtf8Lossy(list, test_env.getOps());
defer res.decref(test_env.getOps());
@ -2803,8 +2942,8 @@ test "fromUtf8Lossy: unexpected end" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var list = RocList.fromSlice(u8, "r\xCF", false, test_env.getOps());
defer list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, test_env.getOps());
const list = RocList.fromSlice(u8, "r\xCF", false, test_env.getOps());
// fromUtf8Lossy consumes ownership of the list - no manual decref needed
const res = fromUtf8Lossy(list, test_env.getOps());
defer res.decref(test_env.getOps());
@ -2822,8 +2961,8 @@ test "fromUtf8Lossy: encodes surrogate" {
// becomes 0b1110_1101 0b10_1000_00 0b10_11_1101
// 1110_wwww 10_xxxx_yy 10_yy_zzzz
// 0xED 0x90 0xBD
var list = RocList.fromSlice(u8, "r\xED\xA0\xBDc", false, test_env.getOps());
defer list.decref(@alignOf(u8), @sizeOf(u8), false, null, &rcNone, test_env.getOps());
const list = RocList.fromSlice(u8, "r\xED\xA0\xBDc", false, test_env.getOps());
// fromUtf8Lossy consumes ownership of the list - no manual decref needed
const res = fromUtf8Lossy(list, test_env.getOps());
defer res.decref(test_env.getOps());


@ -202,6 +202,16 @@ pub const Dec = fn (?[*]u8) callconv(.c) void;
/// - It makes the "constant" check very efficient
/// - It's safe since normal refcounts should never reach 0 while still being referenced
pub const REFCOUNT_STATIC_DATA: isize = 0;
/// Sentinel value written to freed refcount slots to detect use-after-free.
/// When memory is freed in debug mode, the refcount slot is poisoned with this value.
/// Any subsequent attempt to incref/decref this memory will trigger a panic.
/// This value is only used in debug builds and has zero overhead in release.
/// Uses a recognizable pattern that works on both 32-bit and 64-bit platforms.
const POISON_VALUE: isize = @bitCast(if (@sizeOf(usize) == 8)
@as(usize, 0xDEADBEEFDEADBEEF)
else
@as(usize, 0xDEADBEEF));
/// No-op reference count decrement function.
/// Used as a callback when elements don't contain refcounted data or in testing scenarios
/// where reference counting operations should be skipped. Matches the `Dec` function type
@ -241,6 +251,17 @@ pub fn increfRcPtrC(ptr_to_refcount: *isize, amount: isize) callconv(.c) void {
// Ensure that the refcount is not whole program lifetime.
const refcount: isize = ptr_to_refcount.*;
// Debug-only assertions to catch refcount bugs early.
if (builtin.mode == .Debug) {
if (refcount == POISON_VALUE) {
@panic("Use-after-free: incref on already-freed memory");
}
if (refcount <= 0 and !rcConstant(refcount)) {
@panic("Invalid incref: incrementing non-positive refcount");
}
}
if (!rcConstant(refcount)) {
// Note: we assume that a refcount will never overflow.
// As such, we do not need to cap incrementing.
@ -392,6 +413,13 @@ inline fn free_ptr_to_refcount(
roc_ops: *RocOps,
) void {
if (RC_TYPE == .none) return;
// Debug-only: Poison the refcount slot before freeing to detect use-after-free.
// Any subsequent access to this refcount will see POISON_VALUE and panic.
if (builtin.mode == .Debug) {
refcount_ptr[0] = POISON_VALUE;
}
const ptr_width = @sizeOf(usize);
const required_space: usize = if (elements_refcounted) (2 * ptr_width) else ptr_width;
const extra_bytes = @max(required_space, element_alignment);
@ -423,6 +451,18 @@ inline fn decref_ptr_to_refcount(
// Ensure that the refcount is not whole program lifetime.
const refcount: isize = refcount_ptr[0];
// Debug-only assertions to catch refcount bugs early.
// These compile out completely in release builds.
if (builtin.mode == .Debug) {
if (refcount == POISON_VALUE) {
@panic("Use-after-free: decref on already-freed memory");
}
if (refcount <= 0 and !rcConstant(refcount)) {
@panic("Refcount underflow: decrementing non-positive refcount");
}
}
if (!rcConstant(refcount)) {
switch (RC_TYPE) {
.normal => {
@ -493,6 +533,26 @@ pub inline fn rcConstant(refcount: isize) bool {
}
}
/// Debug-only assertion that a data pointer has a valid refcount.
/// Panics if the refcount is poisoned (use-after-free) or invalid (underflow).
/// Compiles to nothing in release builds - zero overhead.
///
/// Use this at key points in slice-creating or refcount-manipulating functions
/// to catch bugs early during development.
pub inline fn assertValidRefcount(data_ptr: ?[*]u8) void {
if (builtin.mode != .Debug) return;
if (data_ptr) |ptr| {
const rc_ptr: [*]isize = @ptrCast(@alignCast(ptr - @sizeOf(usize)));
const rc = rc_ptr[0];
if (rc == POISON_VALUE) {
@panic("assertValidRefcount: Use-after-free detected");
}
if (rc <= 0 and !rcConstant(rc)) {
@panic("assertValidRefcount: Invalid refcount (underflow or corruption)");
}
}
}
// We follow roughly the [fbvector](https://github.com/facebook/folly/blob/main/folly/docs/FBVector.md) when it comes to growing a RocList.
// Here is [their growth strategy](https://github.com/facebook/folly/blob/3e0525988fd444201b19b76b390a5927c15cb697/folly/FBVector.h#L1128) for push_back:
//


@ -780,6 +780,322 @@ pub const Expr = union(enum) {
dec_to_f32_wrap, // Dec -> F32 (lossy narrowing)
dec_to_f32_try_unsafe, // Dec -> { success: Bool, val: F32 }
dec_to_f64, // Dec -> F64 (lossy conversion)
/// Ownership semantics for each argument of a low-level operation.
/// See src/builtins/OWNERSHIP.md for detailed documentation.
pub const ArgOwnership = enum {
/// Function reads argument without affecting refcount. Caller retains ownership.
/// Interpreter should decref after call.
borrow,
/// Function takes ownership of argument. Caller loses access.
/// Interpreter should NOT decref after call.
consume,
};
/// Returns the ownership semantics for each argument of this low-level operation.
/// The returned slice has one entry per argument.
///
/// Important: DO NOT ADD an else branch to this switch statement;
/// we expect a compile error if a new case is added and its ownership semantics are not defined here.
pub fn getArgOwnership(self: LowLevel) []const ArgOwnership {
return switch (self) {
// String operations - borrowing (read-only)
.str_is_empty, .str_is_eq, .str_contains, .str_starts_with, .str_ends_with, .str_count_utf8_bytes, .str_caseless_ascii_equals => &.{ .borrow, .borrow },
// String operations - consuming (take ownership)
.str_concat => &.{ .consume, .borrow }, // first consumed, second borrowed
.str_trim, .str_trim_start, .str_trim_end => &.{.consume},
.str_with_ascii_lowercased, .str_with_ascii_uppercased => &.{.consume},
.str_repeat => &.{ .borrow, .borrow }, // string borrowed, count is value type
.str_with_prefix => &.{ .consume, .borrow },
.str_with_capacity => &.{.borrow}, // capacity is value type
.str_reserve => &.{ .consume, .borrow },
.str_release_excess_capacity => &.{.consume},
.str_join_with => &.{ .consume, .borrow }, // list consumed, separator borrowed
.str_split_on => &.{ .consume, .borrow },
// String operations - borrowing with seamless slice result (incref internally)
.str_to_utf8 => &.{.borrow},
.str_drop_prefix, .str_drop_suffix => &.{ .borrow, .borrow },
// String parsing - list consumed
.str_from_utf8, .str_from_utf8_lossy => &.{.consume},
// Numeric to_str - value types (no ownership)
.u8_to_str, .i8_to_str, .u16_to_str, .i16_to_str, .u32_to_str, .i32_to_str, .u64_to_str, .i64_to_str, .u128_to_str, .i128_to_str, .dec_to_str, .f32_to_str, .f64_to_str => &.{.borrow},
// List operations - borrowing
.list_len, .list_is_empty, .list_get_unsafe => &.{.borrow},
// List operations - consuming
.list_concat => &.{ .consume, .consume },
.list_with_capacity => &.{.borrow}, // capacity is value type
.list_sort_with => &.{.consume},
.list_append => &.{ .consume, .borrow }, // list consumed, element borrowed
// Bool operations - value types
.bool_is_eq => &.{ .borrow, .borrow },
// Numeric operations - all value types (no heap allocation)
.num_is_zero, .num_is_negative, .num_is_positive, .num_negate => &.{.borrow},
.num_is_eq, .num_is_gt, .num_is_gte, .num_is_lt, .num_is_lte, .num_plus, .num_minus, .num_times, .num_div_by, .num_div_trunc_by, .num_rem_by => &.{ .borrow, .borrow },
// Numeric parsing - list borrowed for digits
.num_from_int_digits => &.{.borrow},
.num_from_dec_digits => &.{ .borrow, .borrow },
.num_from_numeral => &.{.borrow},
// All numeric conversions are value types (no heap allocation).
// Explicitly listed to get compile errors when new LowLevel variants are added.
.u8_to_i8_wrap,
.u8_to_i8_try,
.u8_to_i16,
.u8_to_i32,
.u8_to_i64,
.u8_to_i128,
.u8_to_u16,
.u8_to_u32,
.u8_to_u64,
.u8_to_u128,
.u8_to_f32,
.u8_to_f64,
.u8_to_dec,
.i8_to_i16,
.i8_to_i32,
.i8_to_i64,
.i8_to_i128,
.i8_to_u8_wrap,
.i8_to_u8_try,
.i8_to_u16_wrap,
.i8_to_u16_try,
.i8_to_u32_wrap,
.i8_to_u32_try,
.i8_to_u64_wrap,
.i8_to_u64_try,
.i8_to_u128_wrap,
.i8_to_u128_try,
.i8_to_f32,
.i8_to_f64,
.i8_to_dec,
.u16_to_i8_wrap,
.u16_to_i8_try,
.u16_to_i16_wrap,
.u16_to_i16_try,
.u16_to_i32,
.u16_to_i64,
.u16_to_i128,
.u16_to_u8_wrap,
.u16_to_u8_try,
.u16_to_u32,
.u16_to_u64,
.u16_to_u128,
.u16_to_f32,
.u16_to_f64,
.u16_to_dec,
.i16_to_i8_wrap,
.i16_to_i8_try,
.i16_to_i32,
.i16_to_i64,
.i16_to_i128,
.i16_to_u8_wrap,
.i16_to_u8_try,
.i16_to_u16_wrap,
.i16_to_u16_try,
.i16_to_u32_wrap,
.i16_to_u32_try,
.i16_to_u64_wrap,
.i16_to_u64_try,
.i16_to_u128_wrap,
.i16_to_u128_try,
.i16_to_f32,
.i16_to_f64,
.i16_to_dec,
.u32_to_i8_wrap,
.u32_to_i8_try,
.u32_to_i16_wrap,
.u32_to_i16_try,
.u32_to_i32_wrap,
.u32_to_i32_try,
.u32_to_i64,
.u32_to_i128,
.u32_to_u8_wrap,
.u32_to_u8_try,
.u32_to_u16_wrap,
.u32_to_u16_try,
.u32_to_u64,
.u32_to_u128,
.u32_to_f32,
.u32_to_f64,
.u32_to_dec,
.i32_to_i8_wrap,
.i32_to_i8_try,
.i32_to_i16_wrap,
.i32_to_i16_try,
.i32_to_i64,
.i32_to_i128,
.i32_to_u8_wrap,
.i32_to_u8_try,
.i32_to_u16_wrap,
.i32_to_u16_try,
.i32_to_u32_wrap,
.i32_to_u32_try,
.i32_to_u64_wrap,
.i32_to_u64_try,
.i32_to_u128_wrap,
.i32_to_u128_try,
.i32_to_f32,
.i32_to_f64,
.i32_to_dec,
.u64_to_i8_wrap,
.u64_to_i8_try,
.u64_to_i16_wrap,
.u64_to_i16_try,
.u64_to_i32_wrap,
.u64_to_i32_try,
.u64_to_i64_wrap,
.u64_to_i64_try,
.u64_to_i128,
.u64_to_u8_wrap,
.u64_to_u8_try,
.u64_to_u16_wrap,
.u64_to_u16_try,
.u64_to_u32_wrap,
.u64_to_u32_try,
.u64_to_u128,
.u64_to_f32,
.u64_to_f64,
.u64_to_dec,
.i64_to_i8_wrap,
.i64_to_i8_try,
.i64_to_i16_wrap,
.i64_to_i16_try,
.i64_to_i32_wrap,
.i64_to_i32_try,
.i64_to_i128,
.i64_to_u8_wrap,
.i64_to_u8_try,
.i64_to_u16_wrap,
.i64_to_u16_try,
.i64_to_u32_wrap,
.i64_to_u32_try,
.i64_to_u64_wrap,
.i64_to_u64_try,
.i64_to_u128_wrap,
.i64_to_u128_try,
.i64_to_f32,
.i64_to_f64,
.i64_to_dec,
.u128_to_i8_wrap,
.u128_to_i8_try,
.u128_to_i16_wrap,
.u128_to_i16_try,
.u128_to_i32_wrap,
.u128_to_i32_try,
.u128_to_i64_wrap,
.u128_to_i64_try,
.u128_to_i128_wrap,
.u128_to_i128_try,
.u128_to_u8_wrap,
.u128_to_u8_try,
.u128_to_u16_wrap,
.u128_to_u16_try,
.u128_to_u32_wrap,
.u128_to_u32_try,
.u128_to_u64_wrap,
.u128_to_u64_try,
.u128_to_f32,
.u128_to_f64,
.u128_to_dec_try_unsafe,
.i128_to_i8_wrap,
.i128_to_i8_try,
.i128_to_i16_wrap,
.i128_to_i16_try,
.i128_to_i32_wrap,
.i128_to_i32_try,
.i128_to_i64_wrap,
.i128_to_i64_try,
.i128_to_u8_wrap,
.i128_to_u8_try,
.i128_to_u16_wrap,
.i128_to_u16_try,
.i128_to_u32_wrap,
.i128_to_u32_try,
.i128_to_u64_wrap,
.i128_to_u64_try,
.i128_to_u128_wrap,
.i128_to_u128_try,
.i128_to_f32,
.i128_to_f64,
.i128_to_dec_try_unsafe,
.f32_to_i8_trunc,
.f32_to_i8_try_unsafe,
.f32_to_i16_trunc,
.f32_to_i16_try_unsafe,
.f32_to_i32_trunc,
.f32_to_i32_try_unsafe,
.f32_to_i64_trunc,
.f32_to_i64_try_unsafe,
.f32_to_i128_trunc,
.f32_to_i128_try_unsafe,
.f32_to_u8_trunc,
.f32_to_u8_try_unsafe,
.f32_to_u16_trunc,
.f32_to_u16_try_unsafe,
.f32_to_u32_trunc,
.f32_to_u32_try_unsafe,
.f32_to_u64_trunc,
.f32_to_u64_try_unsafe,
.f32_to_u128_trunc,
.f32_to_u128_try_unsafe,
.f32_to_f64,
.f64_to_i8_trunc,
.f64_to_i8_try_unsafe,
.f64_to_i16_trunc,
.f64_to_i16_try_unsafe,
.f64_to_i32_trunc,
.f64_to_i32_try_unsafe,
.f64_to_i64_trunc,
.f64_to_i64_try_unsafe,
.f64_to_i128_trunc,
.f64_to_i128_try_unsafe,
.f64_to_u8_trunc,
.f64_to_u8_try_unsafe,
.f64_to_u16_trunc,
.f64_to_u16_try_unsafe,
.f64_to_u32_trunc,
.f64_to_u32_try_unsafe,
.f64_to_u64_trunc,
.f64_to_u64_try_unsafe,
.f64_to_u128_trunc,
.f64_to_u128_try_unsafe,
.f64_to_f32_wrap,
.f64_to_f32_try_unsafe,
.dec_to_i8_trunc,
.dec_to_i8_try_unsafe,
.dec_to_i16_trunc,
.dec_to_i16_try_unsafe,
.dec_to_i32_trunc,
.dec_to_i32_try_unsafe,
.dec_to_i64_trunc,
.dec_to_i64_try_unsafe,
.dec_to_i128_trunc,
.dec_to_i128_try_unsafe,
.dec_to_u8_trunc,
.dec_to_u8_try_unsafe,
.dec_to_u16_trunc,
.dec_to_u16_try_unsafe,
.dec_to_u32_trunc,
.dec_to_u32_try_unsafe,
.dec_to_u64_trunc,
.dec_to_u64_try_unsafe,
.dec_to_u128_trunc,
.dec_to_u128_try_unsafe,
.dec_to_f32_wrap,
.dec_to_f32_try_unsafe,
.dec_to_f64,
=> &.{.borrow},
};
}
};
pub const Idx = enum(u32) { _ };
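One way an interpreter call site might consume this ownership table is sketched below. This is a hedged illustration: `releaseArgs`, its parameters, and the surrounding call loop are hypothetical, not the actual interpreter code; only `LowLevel.getArgOwnership` and the `borrow`/`consume` semantics come from the diff above.

```zig
// Hypothetical sketch: after invoking a low-level builtin, decref only
// the arguments the builtin borrowed; consumed arguments were already
// taken over (and cleaned up) by the builtin itself.
fn releaseArgs(op: LowLevel, args: []StackValue, layout_cache: *LayoutStore, ops: *RocOps) void {
    const ownership = op.getArgOwnership();
    std.debug.assert(ownership.len == args.len);
    for (ownership, args) |own, arg| {
        switch (own) {
            .borrow => arg.decref(layout_cache, ops), // caller still owns it
            .consume => {}, // builtin took ownership; do not touch
        }
    }
}
```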

View file

@ -67,6 +67,22 @@ pub fn copyToPtr(self: StackValue, layout_cache: *LayoutStore, dest_ptr: *anyopa
const src_str: *const RocStr = @ptrCast(@alignCast(self.ptr.?));
const dest_str: *RocStr = @ptrCast(@alignCast(dest_ptr));
dest_str.* = src_str.*;
if (comptime trace_refcount) {
if (!src_str.isSmallStr()) {
const alloc_ptr = src_str.getAllocationPtr();
const rc_before: isize = if (alloc_ptr) |ptr| blk: {
if (@intFromPtr(ptr) % 8 != 0) break :blk -999;
const isizes: [*]isize = @ptrCast(@alignCast(ptr));
break :blk (isizes - 1)[0];
} else 0;
traceRefcount("INCREF str (copyToPtr) ptr=0x{x} len={} rc={} slice={}", .{
@intFromPtr(alloc_ptr),
src_str.len(),
rc_before,
@intFromBool(src_str.isSeamlessSlice()),
});
}
}
src_str.incref(1);
return;
},
@ -146,10 +162,32 @@ pub fn copyToPtr(self: StackValue, layout_cache: *LayoutStore, dest_ptr: *anyopa
const src_list: *const builtins.list.RocList = @ptrCast(@alignCast(self.ptr.?));
const dest_list: *builtins.list.RocList = @ptrCast(@alignCast(dest_ptr));
dest_list.* = src_list.*;
// Incref the list data if it's not empty
if (src_list.bytes) |bytes| {
builtins.utils.increfDataPtrC(bytes, 1);
const elem_layout = layout_cache.getLayout(self.layout.data.list);
const elements_refcounted = elem_layout.isRefcounted();
// Incref the list allocation. For seamless slices, this is the parent allocation,
// not the bytes pointer (which points within the parent allocation).
// We use getAllocationDataPtr() which correctly handles both regular lists
// and seamless slices (where capacity_or_alloc_ptr stores the parent pointer).
if (src_list.getAllocationDataPtr()) |alloc_ptr| {
if (comptime trace_refcount) {
const rc_before: isize = blk: {
if (@intFromPtr(alloc_ptr) % 8 != 0) break :blk -999;
const isizes: [*]isize = @ptrCast(@alignCast(alloc_ptr));
break :blk (isizes - 1)[0];
};
traceRefcount("INCREF list (copyToPtr) ptr=0x{x} len={} rc={} slice={} elems_rc={}", .{
@intFromPtr(alloc_ptr),
src_list.len(),
rc_before,
@intFromBool(src_list.isSeamlessSlice()),
@intFromBool(elements_refcounted),
});
}
builtins.utils.increfDataPtrC(alloc_ptr, 1);
}
storeListElementCount(dest_list, elements_refcounted);
return;
}
@ -162,6 +200,181 @@ pub fn copyToPtr(self: StackValue, layout_cache: *LayoutStore, dest_ptr: *anyopa
return;
}
if (self.layout.tag == .record) {
// Copy raw bytes first, then recursively incref all fields
// We call incref on ALL fields (not just isRefcounted()) because:
// - For directly refcounted types (str, list, box): increfs them
// - For nested records/tuples: recursively handles their contents
// - For scalars: incref is a no-op
// This is symmetric with decref which also processes all fields.
std.debug.assert(self.ptr != null);
const src = @as([*]u8, @ptrCast(self.ptr.?))[0..result_size];
const dst = @as([*]u8, @ptrCast(dest_ptr))[0..result_size];
@memcpy(dst, src);
const record_data = layout_cache.getRecordData(self.layout.data.record.idx);
if (record_data.fields.count == 0) return;
const field_layouts = layout_cache.record_fields.sliceRange(record_data.getFields());
const base_ptr = @as([*]u8, @ptrCast(self.ptr.?));
var field_index: usize = 0;
while (field_index < field_layouts.len) : (field_index += 1) {
const field_info = field_layouts.get(field_index);
const field_layout = layout_cache.getLayout(field_info.layout);
const field_offset = layout_cache.getRecordFieldOffset(self.layout.data.record.idx, @intCast(field_index));
const field_ptr = @as(*anyopaque, @ptrCast(base_ptr + field_offset));
const field_value = StackValue{
.layout = field_layout,
.ptr = field_ptr,
.is_initialized = true,
};
field_value.incref(layout_cache);
}
return;
}
if (self.layout.tag == .tuple) {
// Copy raw bytes first, then recursively incref all elements
// We call incref on ALL elements (not just isRefcounted()) because:
// - For directly refcounted types (str, list, box): increfs them
// - For nested records/tuples: recursively handles their contents
// - For scalars: incref is a no-op
// This is symmetric with decref which also processes all elements.
std.debug.assert(self.ptr != null);
const src = @as([*]u8, @ptrCast(self.ptr.?))[0..result_size];
const dst = @as([*]u8, @ptrCast(dest_ptr))[0..result_size];
@memcpy(dst, src);
const tuple_data = layout_cache.getTupleData(self.layout.data.tuple.idx);
if (tuple_data.fields.count == 0) return;
const element_layouts = layout_cache.tuple_fields.sliceRange(tuple_data.getFields());
const base_ptr = @as([*]u8, @ptrCast(self.ptr.?));
var elem_index: usize = 0;
while (elem_index < element_layouts.len) : (elem_index += 1) {
const elem_info = element_layouts.get(elem_index);
const elem_layout = layout_cache.getLayout(elem_info.layout);
const elem_offset = layout_cache.getTupleElementOffset(self.layout.data.tuple.idx, @intCast(elem_index));
const elem_ptr = @as(*anyopaque, @ptrCast(base_ptr + elem_offset));
const elem_value = StackValue{
.layout = elem_layout,
.ptr = elem_ptr,
.is_initialized = true,
};
elem_value.incref(layout_cache);
}
return;
}
if (self.layout.tag == .closure) {
// Copy the closure header and captures, then incref captured values.
// Closures store captures in a record immediately after the header.
std.debug.assert(self.ptr != null);
const src = @as([*]u8, @ptrCast(self.ptr.?))[0..result_size];
const dst = @as([*]u8, @ptrCast(dest_ptr))[0..result_size];
@memcpy(dst, src);
// Get the closure header to find the captures layout
const closure = self.asClosure();
const captures_layout = layout_cache.getLayout(closure.captures_layout_idx);
// Only incref if there are actual captures (record with fields)
if (captures_layout.tag == .record) {
const record_data = layout_cache.getRecordData(captures_layout.data.record.idx);
if (record_data.fields.count > 0) {
if (comptime trace_refcount) {
traceRefcount("INCREF closure captures ptr=0x{x} fields={}", .{
@intFromPtr(self.ptr),
record_data.fields.count,
});
}
// Calculate the offset to the captures record (after header, with alignment)
const header_size = @sizeOf(layout_mod.Closure);
const cap_align = captures_layout.alignment(layout_cache.targetUsize());
const aligned_off = std.mem.alignForward(usize, header_size, @intCast(cap_align.toByteUnits()));
const base_ptr: [*]u8 = @ptrCast(@alignCast(self.ptr.?));
const rec_ptr: [*]u8 = @ptrCast(base_ptr + aligned_off);
// Iterate over each field in the captures record and incref all fields.
// We call incref on ALL fields (not just isRefcounted()) because:
// - For directly refcounted types (str, list, box): increfs them
// - For nested records/tuples: recursively handles their contents
// - For scalars: incref is a no-op
// This is symmetric with decref.
const field_layouts = layout_cache.record_fields.sliceRange(record_data.getFields());
var field_index: usize = 0;
while (field_index < field_layouts.len) : (field_index += 1) {
const field_info = field_layouts.get(field_index);
const field_layout = layout_cache.getLayout(field_info.layout);
const field_offset = layout_cache.getRecordFieldOffset(captures_layout.data.record.idx, @intCast(field_index));
const field_ptr = @as(*anyopaque, @ptrCast(rec_ptr + field_offset));
const field_value = StackValue{
.layout = field_layout,
.ptr = field_ptr,
.is_initialized = true,
};
field_value.incref(layout_cache);
}
}
}
return;
}
if (self.layout.tag == .tag_union) {
// Copy raw bytes first, then incref only the active variant's payload
std.debug.assert(self.ptr != null);
const src = @as([*]u8, @ptrCast(self.ptr.?))[0..result_size];
const dst = @as([*]u8, @ptrCast(dest_ptr))[0..result_size];
@memcpy(dst, src);
const tu_data = layout_cache.getTagUnionData(self.layout.data.tag_union.idx);
const base_ptr = @as([*]u8, @ptrCast(self.ptr.?));
// Read discriminant to determine active variant
const disc_ptr = base_ptr + tu_data.discriminant_offset;
const discriminant: u32 = switch (tu_data.discriminant_size) {
1 => @as(*const u8, @ptrCast(disc_ptr)).*,
2 => @as(*const u16, @ptrCast(@alignCast(disc_ptr))).*,
4 => @as(*const u32, @ptrCast(@alignCast(disc_ptr))).*,
else => unreachable,
};
// Get the active variant's payload layout
const variants = layout_cache.getTagUnionVariants(tu_data);
if (discriminant >= variants.len) return; // Invalid discriminant, skip
const variant_layout = layout_cache.getLayout(variants.get(discriminant).payload_layout);
if (comptime trace_refcount) {
traceRefcount("INCREF tag_union (copyToPtr) disc={} variant_layout.tag={}", .{
discriminant,
@intFromEnum(variant_layout.tag),
});
}
// Incref only the active variant's payload (at offset 0)
const payload_value = StackValue{
.layout = variant_layout,
.ptr = @as(*anyopaque, @ptrCast(base_ptr)),
.is_initialized = true,
};
payload_value.incref(layout_cache);
return;
}
std.debug.assert(self.ptr != null);
const src = @as([*]u8, @ptrCast(self.ptr.?))[0..result_size];
const dst = @as([*]u8, @ptrCast(dest_ptr))[0..result_size];
@ -568,6 +781,68 @@ pub const TupleAccessor = struct {
}
};
/// Create a TagUnionAccessor for safe tag union access
pub fn asTagUnion(self: StackValue, layout_cache: *LayoutStore) !TagUnionAccessor {
std.debug.assert(self.is_initialized);
std.debug.assert(self.ptr != null);
std.debug.assert(self.layout.tag == .tag_union);
const tu_data = layout_cache.getTagUnionData(self.layout.data.tag_union.idx);
return TagUnionAccessor{
.base_value = self,
.layout_cache = layout_cache,
.tu_data = tu_data.*,
};
}
/// Safe accessor for tag union values
pub const TagUnionAccessor = struct {
base_value: StackValue,
layout_cache: *LayoutStore,
tu_data: layout_mod.TagUnionData,
/// Read the discriminant (tag index) from the tag union
pub fn getDiscriminant(self: TagUnionAccessor) usize {
const base_ptr: [*]u8 = @ptrCast(self.base_value.ptr.?);
const disc_ptr = base_ptr + self.tu_data.discriminant_offset;
return switch (self.tu_data.discriminant_size) {
1 => @as(*const u8, @ptrCast(disc_ptr)).*,
2 => @as(*const u16, @ptrCast(@alignCast(disc_ptr))).*,
4 => @as(*const u32, @ptrCast(@alignCast(disc_ptr))).*,
8 => @intCast(@as(*const u64, @ptrCast(@alignCast(disc_ptr))).*),
else => 0,
};
}
/// Get the layout for a specific variant by discriminant
pub fn getVariantLayout(self: *const TagUnionAccessor, discriminant: usize) Layout {
const variants = self.layout_cache.getTagUnionVariants(&self.tu_data);
if (discriminant >= variants.len) {
return Layout.zst();
}
const variant = variants.get(discriminant);
return self.layout_cache.getLayout(variant.payload_layout);
}
/// Get a StackValue for the payload at offset 0
pub fn getPayload(self: TagUnionAccessor, payload_layout: Layout) StackValue {
// Payload is always at offset 0 in our tag union layout
return StackValue{
.layout = payload_layout,
.ptr = self.base_value.ptr,
.is_initialized = true,
};
}
/// Get discriminant and payload layout together
pub fn getVariant(self: *const TagUnionAccessor) struct { discriminant: usize, payload_layout: Layout } {
const discriminant = self.getDiscriminant();
const payload_layout = self.getVariantLayout(discriminant);
return .{ .discriminant = discriminant, .payload_layout = payload_layout };
}
};
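A hedged usage sketch of the accessor API above (the surrounding `value`, `layout_cache`, and `ops` bindings are assumed caller context, not code from this diff):

```zig
// Hypothetical usage: inspect a tag union value and decref its active payload.
const acc = try value.asTagUnion(layout_cache);
const variant = acc.getVariant(); // discriminant + payload layout together
const payload = acc.getPayload(variant.payload_layout); // payload sits at offset 0
payload.decref(layout_cache, ops);
```

This is the same read-discriminant-then-touch-only-the-active-variant pattern that `incref`/`decref` open-code elsewhere in this PR; the accessor centralizes it.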
/// Create a ListAccessor for safe list element access
pub fn asList(self: StackValue, layout_cache: *LayoutStore, element_layout: Layout) !ListAccessor {
std.debug.assert(self.is_initialized);
@ -827,6 +1102,22 @@ pub fn copyTo(self: StackValue, dest: StackValue, layout_cache: *LayoutStore) vo
const src_str: *const RocStr = @ptrCast(@alignCast(self.ptr.?));
const dest_str: *RocStr = @ptrCast(@alignCast(dest.ptr.?));
dest_str.* = src_str.*;
if (comptime trace_refcount) {
if (!src_str.isSmallStr()) {
const alloc_ptr = src_str.getAllocationPtr();
const rc_before: isize = if (alloc_ptr) |ptr| blk: {
if (@intFromPtr(ptr) % 8 != 0) break :blk -999;
const isizes: [*]isize = @ptrCast(@alignCast(ptr));
break :blk (isizes - 1)[0];
} else 0;
traceRefcount("INCREF str (copyTo) ptr=0x{x} len={} rc={} slice={}", .{
@intFromPtr(alloc_ptr),
src_str.len(),
rc_before,
@intFromBool(src_str.isSeamlessSlice()),
});
}
}
dest_str.incref(1);
return;
}
@ -916,8 +1207,9 @@ pub fn copyWithoutRefcount(self: StackValue, dest: StackValue, layout_cache: *La
}
}
/// Increment reference count for refcounted types
pub fn incref(self: StackValue) void {
/// Increment reference count for refcounted types.
/// Must be symmetric with decref - handles records and tuples by recursively incref'ing fields.
pub fn incref(self: StackValue, layout_cache: *LayoutStore) void {
if (comptime trace_refcount) {
traceRefcount("INCREF layout.tag={} ptr=0x{x}", .{ @intFromEnum(self.layout.tag), @intFromPtr(self.ptr) });
}
@ -975,6 +1267,94 @@ pub fn incref(self: StackValue) void {
}
return;
}
// Handle records by recursively incref'ing each field (symmetric with decref)
if (self.layout.tag == .record) {
if (self.ptr == null) return;
const record_data = layout_cache.getRecordData(self.layout.data.record.idx);
if (record_data.fields.count == 0) return;
const field_layouts = layout_cache.record_fields.sliceRange(record_data.getFields());
const base_ptr = @as([*]u8, @ptrCast(self.ptr.?));
var field_index: usize = 0;
while (field_index < field_layouts.len) : (field_index += 1) {
const field_info = field_layouts.get(field_index);
const field_layout = layout_cache.getLayout(field_info.layout);
const field_offset = layout_cache.getRecordFieldOffset(self.layout.data.record.idx, @intCast(field_index));
const field_ptr = @as(*anyopaque, @ptrCast(base_ptr + field_offset));
const field_value = StackValue{
.layout = field_layout,
.ptr = field_ptr,
.is_initialized = true,
};
field_value.incref(layout_cache);
}
return;
}
// Handle tuples by recursively incref'ing each element (symmetric with decref)
if (self.layout.tag == .tuple) {
if (self.ptr == null) return;
const tuple_data = layout_cache.getTupleData(self.layout.data.tuple.idx);
if (tuple_data.fields.count == 0) return;
const element_layouts = layout_cache.tuple_fields.sliceRange(tuple_data.getFields());
const base_ptr = @as([*]u8, @ptrCast(self.ptr.?));
var elem_index: usize = 0;
while (elem_index < element_layouts.len) : (elem_index += 1) {
const elem_info = element_layouts.get(elem_index);
const elem_layout = layout_cache.getLayout(elem_info.layout);
const elem_offset = layout_cache.getTupleElementOffset(self.layout.data.tuple.idx, @intCast(elem_index));
const elem_ptr = @as(*anyopaque, @ptrCast(base_ptr + elem_offset));
const elem_value = StackValue{
.layout = elem_layout,
.ptr = elem_ptr,
.is_initialized = true,
};
elem_value.incref(layout_cache);
}
return;
}
// Handle tag unions by reading discriminant and incref'ing only the active variant's payload
if (self.layout.tag == .tag_union) {
if (self.ptr == null) return;
const tu_data = layout_cache.getTagUnionData(self.layout.data.tag_union.idx);
const base_ptr = @as([*]u8, @ptrCast(self.ptr.?));
// Read discriminant to determine active variant
const disc_ptr = base_ptr + tu_data.discriminant_offset;
const discriminant: u32 = switch (tu_data.discriminant_size) {
1 => @as(*const u8, @ptrCast(disc_ptr)).*,
2 => @as(*const u16, @ptrCast(@alignCast(disc_ptr))).*,
4 => @as(*const u32, @ptrCast(@alignCast(disc_ptr))).*,
else => unreachable,
};
// Get the active variant's payload layout
const variants = layout_cache.getTagUnionVariants(tu_data);
if (discriminant >= variants.len) return; // Invalid discriminant, skip
const variant_layout = layout_cache.getLayout(variants.get(discriminant).payload_layout);
// Incref only the active variant's payload (at offset 0)
const payload_value = StackValue{
.layout = variant_layout,
.ptr = @as(*anyopaque, @ptrCast(base_ptr)),
.is_initialized = true,
};
if (comptime trace_refcount) {
traceRefcount("INCREF tag_union disc={} variant_layout.tag={}", .{ discriminant, @intFromEnum(variant_layout.tag) });
}
payload_value.incref(layout_cache);
return;
}
}
/// Trace helper for refcount operations. Only active when built with -Dtrace-refcount=true.
@ -1056,7 +1436,10 @@ pub fn decref(self: StackValue, layout_cache: *LayoutStore, ops: *RocOps) void {
});
}
if (elements_refcounted and list_value.isUnique()) {
// Always decref elements when unique, not just when isRefcounted().
// Records/tuples containing refcounted values also need their fields decref'd.
// Decref for non-refcounted types (like plain integers) is a no-op.
if (list_value.isUnique()) {
if (list_value.getAllocationDataPtr()) |source| {
const count: usize = if (list_value.isSeamlessSlice()) blk: {
const ptr = @as([*]usize, @ptrCast(@alignCast(source))) - 2;
@ -1240,6 +1623,44 @@ pub fn decref(self: StackValue, layout_cache: *LayoutStore, ops: *RocOps) void {
}
return;
},
.tag_union => {
if (self.ptr == null) return;
const tu_data = layout_cache.getTagUnionData(self.layout.data.tag_union.idx);
const base_ptr = @as([*]u8, @ptrCast(self.ptr.?));
// Read discriminant to determine active variant
const disc_ptr = base_ptr + tu_data.discriminant_offset;
const discriminant: u32 = switch (tu_data.discriminant_size) {
1 => @as(*const u8, @ptrCast(disc_ptr)).*,
2 => @as(*const u16, @ptrCast(@alignCast(disc_ptr))).*,
4 => @as(*const u32, @ptrCast(@alignCast(disc_ptr))).*,
else => unreachable,
};
// Get the active variant's payload layout
const variants = layout_cache.getTagUnionVariants(tu_data);
if (discriminant >= variants.len) return; // Invalid discriminant, skip
const variant_layout = layout_cache.getLayout(variants.get(discriminant).payload_layout);
if (comptime trace_refcount) {
traceRefcount("DECREF tag_union ptr=0x{x} disc={} variant_layout.tag={}", .{
@intFromPtr(self.ptr),
discriminant,
@intFromEnum(variant_layout.tag),
});
}
// Decref only the active variant's payload (at offset 0)
const payload_value = StackValue{
.layout = variant_layout,
.ptr = @as(*anyopaque, @ptrCast(base_ptr)),
.is_initialized = true,
};
payload_value.decref(layout_cache, ops);
return;
},
else => {},
}

View file

@ -1376,6 +1376,12 @@ pub const ComptimeEvaluator = struct {
return true; // Ok
}
return true; // Unknown format, optimistically allow
} else if (result.layout.tag == .tag_union) {
// Tag union layout: payload at offset 0, discriminant at discriminant_offset
// For Try types from num.from_numeral, the interpreter should have stored
// the error message in last_error_message, which was already checked above.
// If we reach here without a last_error_message, assume it's Ok.
return true;
}
return true; // Unknown format, optimistically allow

File diff suppressed because it is too large

View file

@ -1,6 +1,7 @@
//! Helpers for rendering interpreter values back into readable Roc syntax.
const std = @import("std");
const builtin = @import("builtin");
const types = @import("types");
const can = @import("can");
const layout = @import("layout");
@ -241,6 +242,96 @@ pub fn renderValueRocWithType(ctx: *RenderCtx, value: StackValue, rt_var: types.
}
return out.toOwnedSlice();
}
} else if (value.layout.tag == .tag_union) {
// Tag union with new proper layout: payload at offset 0, discriminant at discriminant_offset
const tu_data = ctx.layout_store.getTagUnionData(value.layout.data.tag_union.idx);
if (value.ptr) |ptr| {
const base_ptr: [*]u8 = @ptrCast(ptr);
const disc_ptr = base_ptr + tu_data.discriminant_offset;
// Read discriminant based on its size
const discriminant: usize = switch (tu_data.discriminant_size) {
1 => @as(*const u8, @ptrCast(disc_ptr)).*,
2 => @as(*const u16, @ptrCast(@alignCast(disc_ptr))).*,
4 => @as(*const u32, @ptrCast(@alignCast(disc_ptr))).*,
8 => @intCast(@as(*const u64, @ptrCast(@alignCast(disc_ptr))).*),
else => 0,
};
tag_index = discriminant;
have_tag = true;
}
if (have_tag and tag_index < tags.len) {
const tag_name = ctx.env.getIdent(tags.items(.name)[tag_index]);
var out = std.array_list.AlignedManaged(u8, null).init(gpa);
errdefer out.deinit();
try out.appendSlice(tag_name);
const args_range = tags.items(.args)[tag_index];
const arg_vars = ctx.runtime_types.sliceVars(toVarRange(args_range));
if (arg_vars.len > 0) {
try out.append('(');
// Payload is at offset 0
const payload_ptr: *anyopaque = @ptrCast(value.ptr.?);
if (arg_vars.len == 1) {
const arg_var = arg_vars[0];
const layout_idx = try ctx.layout_store.addTypeVar(arg_var, ctx.type_scope);
const arg_layout = ctx.layout_store.getLayout(layout_idx);
const payload_value = StackValue{
.layout = arg_layout,
.ptr = payload_ptr,
.is_initialized = true,
};
const rendered = try renderValueRocWithType(ctx, payload_value, arg_var);
defer gpa.free(rendered);
try out.appendSlice(rendered);
} else {
// Multiple payloads: create a tuple layout from arg types
var elem_layouts = try ctx.allocator.alloc(layout.Layout, arg_vars.len);
defer ctx.allocator.free(elem_layouts);
var i: usize = 0;
while (i < arg_vars.len) : (i += 1) {
const idx = try ctx.layout_store.addTypeVar(arg_vars[i], ctx.type_scope);
elem_layouts[i] = ctx.layout_store.getLayout(idx);
}
const tuple_idx = try ctx.layout_store.putTuple(elem_layouts);
const tuple_layout = ctx.layout_store.getLayout(tuple_idx);
const tuple_size = ctx.layout_store.layoutSize(tuple_layout);
if (tuple_size == 0) {
var j: usize = 0;
while (j < arg_vars.len) : (j += 1) {
const rendered = try renderValueRocWithType(
ctx,
StackValue{
.layout = elem_layouts[j],
.ptr = null,
.is_initialized = true,
},
arg_vars[j],
);
defer gpa.free(rendered);
try out.appendSlice(rendered);
if (j + 1 < arg_vars.len) try out.appendSlice(", ");
}
} else {
const tuple_value = StackValue{
.layout = tuple_layout,
.ptr = payload_ptr,
.is_initialized = true,
};
var tup_acc = try tuple_value.asTuple(ctx.layout_store);
var j: usize = 0;
while (j < arg_vars.len) : (j += 1) {
const sorted_idx = tup_acc.findElementIndexByOriginal(j) orelse return error.TypeMismatch;
const elem_value = try tup_acc.getElement(sorted_idx);
const rendered = try renderValueRocWithType(ctx, elem_value, arg_vars[j]);
defer gpa.free(rendered);
try out.appendSlice(rendered);
if (j + 1 < arg_vars.len) try out.appendSlice(", ");
}
}
}
try out.append(')');
}
return out.toOwnedSlice();
}
}
},
.nominal_type => |nominal| {
@ -487,6 +578,27 @@ pub fn renderValueRoc(ctx: *RenderCtx, value: StackValue) ![]u8 {
try out.appendSlice(" }");
return out.toOwnedSlice();
}
if (value.layout.tag == .tag_union) {
// Layout-only fallback for tag_union: show discriminant and raw payload
const tu_data = ctx.layout_store.getTagUnionData(value.layout.data.tag_union.idx);
var out = std.array_list.AlignedManaged(u8, null).init(gpa);
errdefer out.deinit();
if (value.ptr) |ptr| {
const base_ptr: [*]u8 = @ptrCast(ptr);
const disc_ptr = base_ptr + tu_data.discriminant_offset;
const discriminant: usize = switch (tu_data.discriminant_size) {
1 => @as(*const u8, @ptrCast(disc_ptr)).*,
2 => @as(*const u16, @ptrCast(@alignCast(disc_ptr))).*,
4 => @as(*const u32, @ptrCast(@alignCast(disc_ptr))).*,
8 => @intCast(@as(*const u64, @ptrCast(@alignCast(disc_ptr))).*),
else => 0,
};
try std.fmt.format(out.writer(), "<tag_union variant={d}>", .{discriminant});
} else {
try out.appendSlice("<tag_union>");
}
return out.toOwnedSlice();
}
return try std.fmt.allocPrint(gpa, "<unsupported>", .{});
}

View file

@ -302,8 +302,10 @@ test "interpreter: F64 division" {
, 0.5, .no_trace);
}
test "interpreter: literal True renders True" {
const roc_src = "True";
test "interpreter: literal tag renders as tag name" {
// Use a custom tag instead of True; True is a Bool tag, which requires
// proper builtin module resolution to get the nominal type
const roc_src = "MyTag";
const resources = try helpers.parseAndCanonicalizeExpr(std.testing.allocator, roc_src);
defer helpers.cleanupParseAndCanonical(std.testing.allocator, resources);
@ -318,7 +320,7 @@ test "interpreter: literal True renders True" {
const rt_var = try interp2.translateTypeVar(resources.module_env, can.ModuleEnv.varFrom(resources.expr_idx));
const rendered = try interp2.renderValueRocWithType(result, rt_var);
defer std.testing.allocator.free(rendered);
try std.testing.expectEqualStrings("True", rendered);
try std.testing.expectEqualStrings("MyTag", rendered);
}
test "interpreter: True == False yields False" {

View file

@ -26,6 +26,7 @@ pub const LayoutTag = enum(u4) {
tuple,
closure,
zst, // Zero-sized type (empty records, empty tuples, phantom types, etc.)
tag_union, // Tag union with variant-specific layouts for proper refcounting
};
/// The Layout untagged union should take up this many bits in memory.
@ -137,6 +138,7 @@ pub const LayoutUnion = packed union {
tuple: TupleLayout,
closure: ClosureLayout,
zst: void,
tag_union: TagUnionLayout,
};
/// Record field layout
@ -236,6 +238,51 @@ pub const TupleData = struct {
}
};
/// Tag union layout - stores alignment and index to full data in Store
/// This preserves variant information needed for correct reference counting.
pub const TagUnionLayout = packed struct {
/// Alignment of the tag union
alignment: std.mem.Alignment,
/// Index into the Store's tag union data
idx: TagUnionIdx,
};
/// Index into the Store's tag union data
pub const TagUnionIdx = packed struct {
int_idx: @Type(.{
.int = .{
.signedness = .unsigned,
// Same bit budget as RecordIdx/TupleIdx
.bits = layout_bit_size - @bitSizeOf(LayoutTag) - @bitSizeOf(std.mem.Alignment),
},
}),
};
/// Tag union data stored in the layout Store
pub const TagUnionData = struct {
/// Size of the tag union, in bytes (max payload + discriminant, aligned)
size: u32,
/// Offset of the discriminant within the union (usually after payload)
discriminant_offset: u16,
/// Size of the discriminant in bytes (1, 2, or 4)
discriminant_size: u8,
/// Range of variants in the tag_union_variants list
variants: collections.NonEmptyRange,
pub fn getVariants(self: TagUnionData) TagUnionVariant.SafeMultiList.Range {
return self.variants.toRange(TagUnionVariant.SafeMultiList.Idx);
}
};
/// Per-variant information for tag unions
pub const TagUnionVariant = struct {
/// The layout of this variant's payload
payload_layout: Idx,
/// A SafeMultiList for storing tag union variants
pub const SafeMultiList = collections.SafeMultiList(TagUnionVariant);
};
/// Roc's version of alignment that is limited to a max alignment of 16B to save bits.
pub const RocAlignment = enum(u3) {
@"1" = 0,
@ -317,6 +364,7 @@ pub const Layout = packed struct {
.list, .list_of_zst => target_usize.alignment(),
.record => self.data.record.alignment,
.tuple => self.data.tuple.alignment,
.tag_union => self.data.tag_union.alignment,
.closure => target_usize.alignment(),
.zst => std.mem.Alignment.@"1",
};
@ -399,6 +447,11 @@ pub const Layout = packed struct {
return Layout{ .data = .{ .zst = {} }, .tag = .zst };
}
/// tag union layout with the given alignment and tag union metadata
pub fn tagUnion(tu_alignment: std.mem.Alignment, tu_idx: TagUnionIdx) Layout {
return Layout{ .data = .{ .tag_union = .{ .alignment = tu_alignment, .idx = tu_idx } }, .tag = .tag_union };
}
/// Check if a layout represents a heap-allocated type that needs refcounting
pub fn isRefcounted(self: Layout) bool {
return switch (self.tag) {

View file

@ -30,6 +30,10 @@ pub const TupleFieldLayout = @import("layout.zig").TupleFieldLayout;
pub const TupleLayout = @import("layout.zig").TupleLayout;
pub const TupleIdx = @import("layout.zig").TupleIdx;
pub const TupleData = @import("layout.zig").TupleData;
pub const TagUnionLayout = @import("layout.zig").TagUnionLayout;
pub const TagUnionIdx = @import("layout.zig").TagUnionIdx;
pub const TagUnionData = @import("layout.zig").TagUnionData;
pub const TagUnionVariant = @import("layout.zig").TagUnionVariant;
pub const ClosureLayout = @import("layout.zig").ClosureLayout;
pub const RocAlignment = @import("layout.zig").RocAlignment;
pub const SizeAlign = @import("layout.zig").SizeAlign;

View file

@ -29,6 +29,10 @@ const RecordIdx = layout_mod.RecordIdx;
const TupleField = layout_mod.TupleField;
const TupleData = layout_mod.TupleData;
const TupleIdx = layout_mod.TupleIdx;
const TagUnionVariant = layout_mod.TagUnionVariant;
const TagUnionData = layout_mod.TagUnionData;
const TagUnionIdx = layout_mod.TagUnionIdx;
const TagUnionLayout = layout_mod.TagUnionLayout;
const SizeAlign = layout_mod.SizeAlign;
const Work = work.Work;
@ -55,6 +59,8 @@ pub const Store = struct {
record_data: collections.SafeList(RecordData),
tuple_fields: TupleField.SafeMultiList,
tuple_data: collections.SafeList(TupleData),
tag_union_variants: TagUnionVariant.SafeMultiList,
tag_union_data: collections.SafeList(TagUnionData),
// Cache to avoid duplicate work
layouts_by_var: collections.ArrayListMap(Var, Idx),
@ -169,6 +175,8 @@ pub const Store = struct {
.record_data = try collections.SafeList(RecordData).initCapacity(env.gpa, 256),
.tuple_fields = try TupleField.SafeMultiList.initCapacity(env.gpa, 256),
.tuple_data = try collections.SafeList(TupleData).initCapacity(env.gpa, 256),
.tag_union_variants = try TagUnionVariant.SafeMultiList.initCapacity(env.gpa, 64),
.tag_union_data = try collections.SafeList(TagUnionData).initCapacity(env.gpa, 64),
.layouts_by_var = layouts_by_var,
.work = try Work.initCapacity(env.gpa, 32),
.builtin_str_ident = builtin_str_ident,
@ -198,6 +206,8 @@ pub const Store = struct {
self.record_data.deinit(self.env.gpa);
self.tuple_fields.deinit(self.env.gpa);
self.tuple_data.deinit(self.env.gpa);
self.tag_union_variants.deinit(self.env.gpa);
self.tag_union_data.deinit(self.env.gpa);
self.layouts_by_var.deinit(self.env.gpa);
self.work.deinit(self.env.gpa);
}
@ -398,6 +408,14 @@ pub const Store = struct {
return self.tuple_data.get(@enumFromInt(idx.int_idx));
}
pub fn getTagUnionData(self: *const Self, idx: TagUnionIdx) *const TagUnionData {
return self.tag_union_data.get(@enumFromInt(idx.int_idx));
}
pub fn getTagUnionVariants(self: *const Self, data: *const TagUnionData) TagUnionVariant.SafeMultiList.Slice {
return self.tag_union_variants.sliceRange(data.getVariants());
}
pub fn getRecordFieldOffset(self: *const Self, record_idx: RecordIdx, field_index_in_sorted_fields: u32) u32 {
const target_usize = self.targetUsize();
const record_data = self.getRecordData(record_idx);
@ -511,10 +529,12 @@ pub const Store = struct {
/// Get or create a zero-sized type layout
pub fn ensureZstLayout(self: *Self) !Idx {
// Check if we already have a ZST layout
for (0..self.layouts.len()) |i| {
const layout = self.layouts.get(i);
const len: u32 = @intCast(self.layouts.len());
for (0..len) |i| {
const idx: Idx = @enumFromInt(i);
const layout = self.getLayout(idx);
if (layout.tag == .zst) {
return @enumFromInt(i);
return idx;
}
}
@ -550,6 +570,7 @@ pub const Store = struct {
const captures_size = self.layoutSize(captures_layout);
return aligned_captures_offset + captures_size;
},
.tag_union => self.tag_union_data.get(@enumFromInt(layout.data.tag_union.idx.int_idx)).size,
.zst => 0, // Zero-sized types have size 0
};
}
@ -1209,95 +1230,113 @@ pub const Store = struct {
}
// Complex tag union with payloads
// Strategy: represent as a record { payload: MaxPayload, tag: Discriminant }
// to ensure the payload receives its required alignment and the discriminant
// can live in any trailing padding that remains.
var max_payload_layout: ?Layout = null;
// Create a proper tag_union layout that preserves all variant layouts
// for correct reference counting at runtime.
var max_payload_size: u32 = 0;
var max_payload_alignment: std.mem.Alignment = std.mem.Alignment.@"1";
var max_payload_alignment_any: std.mem.Alignment = std.mem.Alignment.@"1";
// Helper to update max with a candidate layout
const updateMax = struct {
fn go(
store: *Self,
candidate: Layout,
curr_size: *u32,
curr_alignment: *std.mem.Alignment,
out_layout: *?Layout,
max_alignment_any: *std.mem.Alignment,
) void {
const size = store.layoutSize(candidate);
const alignment = candidate.alignment(store.targetUsize());
max_alignment_any.* = max_alignment_any.*.max(alignment);
if (size > curr_size.* or (size == curr_size.* and alignment.toByteUnits() > curr_alignment.*.toByteUnits())) {
curr_size.* = size;
curr_alignment.* = alignment;
out_layout.* = candidate;
}
}
}.go;
// Sort tags alphabetically by name to match interpreter's appendUnionTags ordering.
// This ensures discriminant values are consistent between evaluation and layout.
// TODO: Consider sorting tags in the type store instead for better performance,
// which would eliminate the need for sorting here and in appendUnionTags.
const tags_names = tags_slice.items(.name)[pending_tags_top..];
const tags_args_slice = tags_slice.items(.args)[pending_tags_top..];
// For each tag, compute its payload layout: () => ZST, 1 arg => layout, >1 => tuple of arg layouts
var temp_scope = TypeScope.init(self.env.gpa);
defer temp_scope.deinit();
// Create temporary array of tags for sorting
var sorted_tags = try self.env.gpa.alloc(types.Tag, num_tags);
defer self.env.gpa.free(sorted_tags);
for (tags_names, tags_args_slice, 0..) |name, args, i| {
sorted_tags[i] = .{ .name = name, .args = args };
}
for (tags_slice.items(.args), 0..) |tag_args, tag_idx| {
_ = tag_idx;
// Sort alphabetically by tag name
std.mem.sort(types.Tag, sorted_tags, self.env.getIdentStore(), types.Tag.sortByNameAsc);
// Phase 1: Compute all variant layouts first.
// This must happen BEFORE we record variants_start, because computing layouts
// for nested tag unions will recursively append to tag_union_variants.
var variant_layout_indices = try self.env.gpa.alloc(Idx, num_tags);
defer self.env.gpa.free(variant_layout_indices);
for (sorted_tags, 0..) |tag, variant_i| {
const tag_args = tag.args;
const args_slice = self.types_store.sliceVars(tag_args);
if (args_slice.len == 0) {
// No payload arguments
continue;
} else if (args_slice.len == 1) {
const arg_var = args_slice[0];
const arg_layout_idx = try self.addTypeVar(arg_var, &temp_scope);
const layout_val = self.getLayout(arg_layout_idx);
updateMax(self, layout_val, &max_payload_size, &max_payload_alignment, &max_payload_layout, &max_payload_alignment_any);
} else {
// Build tuple layout from argument layouts (including ZSTs)
variant_layout_indices[variant_i] = if (args_slice.len == 0)
// No payload - use ZST
try self.ensureZstLayout()
else if (args_slice.len == 1)
// Single arg - use its layout
// Use type_scope to look up rigid var mappings
try self.addTypeVar(args_slice[0], type_scope)
else blk: {
// Multiple args - build tuple layout
var elem_layouts = try self.env.gpa.alloc(Layout, args_slice.len);
defer self.env.gpa.free(elem_layouts);
for (args_slice, 0..) |v, i| {
const elem_idx = try self.addTypeVar(v, &temp_scope);
// Use type_scope to look up rigid var mappings
const elem_idx = try self.addTypeVar(v, type_scope);
elem_layouts[i] = self.getLayout(elem_idx);
}
const tuple_idx = try self.putTuple(elem_layouts);
const tuple_layout = self.getLayout(tuple_idx);
updateMax(self, tuple_layout, &max_payload_size, &max_payload_alignment, &max_payload_layout, &max_payload_alignment_any);
}
break :blk try self.putTuple(elem_layouts);
};
}
// Use a tuple instead of a record to avoid needing field name identifiers
// Tag unions are represented as (payload, tag) where:
// - payload is the largest payload layout (or empty record if no payloads)
// - tag is the discriminant
const payload_layout = max_payload_layout orelse blk: {
const empty_idx = try self.ensureEmptyRecordLayout();
break :blk self.getLayout(empty_idx);
};
// Phase 2: Now that all nested layouts are created, record variants_start
// and append our variant layouts. This ensures our variants are contiguous.
const variants_start: u32 = @intCast(self.tag_union_variants.len());
var element_layouts = [_]Layout{
payload_layout,
discriminant_layout,
};
const tuple_idx = try self.putTuple(&element_layouts);
// Apply maximum payload alignment if needed
if (max_payload_alignment_any.toByteUnits() > 1) {
const desired_alignment = max_payload_alignment_any;
var tuple_layout = self.getLayout(tuple_idx);
const current_alignment = tuple_layout.alignment(self.targetUsize());
if (desired_alignment.toByteUnits() > current_alignment.toByteUnits()) {
std.debug.assert(tuple_layout.tag == .tuple);
const tuple_data_idx = tuple_layout.data.tuple.idx;
const new_layout = Layout.tuple(desired_alignment, tuple_data_idx);
self.layouts.set(@enumFromInt(@intFromEnum(tuple_idx)), new_layout);
for (variant_layout_indices) |variant_layout_idx| {
const variant_layout = self.getLayout(variant_layout_idx);
const variant_size = self.layoutSize(variant_layout);
const variant_alignment = variant_layout.alignment(self.targetUsize());
if (variant_size > max_payload_size) {
max_payload_size = variant_size;
}
max_payload_alignment = max_payload_alignment.max(variant_alignment);
// Store variant layout for runtime refcounting
_ = try self.tag_union_variants.append(self.env.gpa, .{
.payload_layout = variant_layout_idx,
});
}
// Calculate discriminant info
const discriminant_size: u8 = if (num_tags <= 256) 1 else if (num_tags <= 65536) 2 else 4;
const discriminant_alignment: std.mem.Alignment = switch (discriminant_size) {
1 => .@"1",
2 => .@"2",
4 => .@"4",
else => unreachable,
};
// Calculate total size: payload at offset 0, discriminant at aligned offset after payload
const payload_end = max_payload_size;
const discriminant_offset: u16 = @intCast(std.mem.alignForward(u32, payload_end, @intCast(discriminant_alignment.toByteUnits())));
const total_size_unaligned = discriminant_offset + discriminant_size;
// Align total size to the tag union's alignment
const tag_union_alignment = max_payload_alignment.max(discriminant_alignment);
const total_size = std.mem.alignForward(u32, total_size_unaligned, @intCast(tag_union_alignment.toByteUnits()));
// Store TagUnionData
const tag_union_data_idx: u32 = @intCast(self.tag_union_data.len());
_ = try self.tag_union_data.append(self.env.gpa, .{
.size = total_size,
.discriminant_offset = discriminant_offset,
.discriminant_size = discriminant_size,
.variants = .{
.start = variants_start,
.count = @intCast(num_tags),
},
});
// Create and store tag_union layout
const tag_union_layout = Layout.tagUnion(tag_union_alignment, .{ .int_idx = @intCast(tag_union_data_idx) });
const tag_union_idx = try self.insertLayout(tag_union_layout);
// Break to fall through to pending container processing instead of returning directly
break :flat_type self.getLayout(tuple_idx);
break :flat_type self.getLayout(tag_union_idx);
},
.record_unbound => |fields| {
// For record_unbound, we need to gather fields directly since it has no Record struct
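The discriminant sizing and offset arithmetic in this hunk (payload at offset 0, discriminant in the trailing bytes, total rounded up to the union's alignment) can be sketched in Python. The helper names are ours, and we assume the discriminant's alignment equals its size, mirroring the `discriminant_alignment` switch above:

```python
# Hypothetical helper mirroring the tag-union size arithmetic in the diff.
# Assumption (ours): the discriminant's alignment equals its size, matching
# the `discriminant_alignment` switch (1 -> 1, 2 -> 2, 4 -> 4).

def align_forward(n: int, alignment: int) -> int:
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment

def tag_union_layout(max_payload_size: int, max_payload_align: int, num_tags: int):
    """Return (discriminant_offset, discriminant_size, total_size)."""
    disc_size = 1 if num_tags <= 256 else 2 if num_tags <= 65536 else 4
    disc_align = disc_size  # assumption: alignment == size for u8/u16/u32
    # Payload sits at offset 0; the discriminant lands at the next aligned offset.
    disc_offset = align_forward(max_payload_size, disc_align)
    union_align = max(max_payload_align, disc_align)
    # Total size is rounded up to the union's own alignment.
    total = align_forward(disc_offset + disc_size, union_align)
    return disc_offset, disc_size, total

print(tag_union_layout(5, 4, 3))  # → (5, 1, 8)
```

For example, a union of 3 tags whose largest payload is 5 bytes at 4-byte alignment gets a 1-byte discriminant at offset 5, then rounds the 6-byte total up to 8.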

View file

@ -376,3 +376,354 @@ test "Repl - Str.is_empty works for empty and non-empty strings" {
defer std.testing.allocator.free(non_empty_result);
try testing.expectEqualStrings("False", non_empty_result);
}
test "Repl - List.len(Str.to_utf8(\"hello\")) should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// This expression was leaking memory
const result = try repl.step("List.len(Str.to_utf8(\"hello\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("5", result);
}
test "Repl - Str.to_utf8 returns list that should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// Test Str.to_utf8 directly - the resulting list should be decreffed
const result = try repl.step("Str.to_utf8(\"hello\")");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("[104, 101, 108, 108, 111]", result);
}
test "Repl - multiple Str.to_utf8 calls should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// Test multiple calls in same REPL session
{
const result1 = try repl.step("List.len(Str.to_utf8(\"\"))");
defer std.testing.allocator.free(result1);
try testing.expectEqualStrings("0", result1);
}
{
const result2 = try repl.step("List.len(Str.to_utf8(\"hello\"))");
defer std.testing.allocator.free(result2);
try testing.expectEqualStrings("5", result2);
}
{
const result3 = try repl.step("List.len(Str.to_utf8(\"é\"))");
defer std.testing.allocator.free(result3);
try testing.expectEqualStrings("2", result3);
}
}
test "Repl - list literals should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// Test list literals
{
const result = try repl.step("List.len([1, 2, 3])");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("3", result);
}
{
const result = try repl.step("[1, 2, 3]");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("[1, 2, 3]", result);
}
}
test "Repl - list of strings should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// List of strings - similar to what snapshot tests do
const result = try repl.step("List.len([\"hello\", \"world\", \"test\"])");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("3", result);
}
test "Repl - from_utf8_lossy should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
{
const result = try repl.step("Str.from_utf8_lossy(Str.to_utf8(\"hello\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("\"hello\"", result);
}
}
test "Repl - for loop over list should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// Simple list of strings - test that list literals are properly freed
{
const result = try repl.step("[\"hello\", \"world\", \"test\"]");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("[\"hello\", \"world\", \"test\"]", result);
}
// For loop assignment - matches snapshot pattern
{
const result = try repl.step("count = { var counter_ = 0; for _ in [\"hello\", \"world\", \"test\"] { counter_ = counter_ + 1 }; counter_ }");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("assigned `count`", result);
}
}
test "Repl - list_sort_with should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// Test list_sort_with - matches the snapshot pattern
{
const result = try repl.step("List.len(List.sort_with([3, 1, 2], |a, b| if a < b LT else if a > b GT else EQ))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("3", result);
}
{
const result = try repl.step("List.len(List.sort_with([5, 2, 8, 1, 9], |a, b| if a < b LT else if a > b GT else EQ))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("5", result);
}
}
test "Repl - list fold with concat should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// Test List.fold with List.concat - creates list literals in callback
const result = try repl.step("List.len(List.fold([1, 2, 3], [], |acc, x| List.concat(acc, [x])))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("3", result);
}
test "Repl - all list operations should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// All list operation patterns from snapshots
{
const result = try repl.step("List.len(List.concat([1, 2], [3, 4]))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("4", result);
}
{
const result = try repl.step("List.len(List.concat([], [1, 2, 3]))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("3", result);
}
{
const result = try repl.step("List.len(List.concat([1, 2, 3], []))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("3", result);
}
{
const result = try repl.step("List.contains([1, 2, 3, 4, 5], 3)");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("True", result);
}
{
const result = try repl.step("List.drop_if([1, 2, 3, 4, 5], |x| x > 2)");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("[1, 2]", result);
}
{
const result = try repl.step("List.keep_if([1, 2, 3, 4, 5], |x| x > 2)");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("[3, 4, 5]", result);
}
{
const result = try repl.step("List.keep_if([1, 2, 3], |_| Bool.False)");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("[]", result);
}
{
const result = try repl.step("List.fold_rev([1, 2, 3], 0, |x, acc| acc * 10 + x)");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("321", result);
}
{
const result = try repl.step("List.fold_rev([], 42, |x, acc| x + acc)");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("42", result);
}
}
test "Repl - all for loop snapshots should not leak" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// All the for loop snapshot patterns
{
const result = try repl.step("unchanged = { var value_ = 42; for n in [] { value_ = n }; value_ }");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("assigned `unchanged`", result);
}
{
const result = try repl.step("result = { var allTrue_ = Bool.True; for b in [Bool.True, Bool.True, Bool.False] { if b == Bool.False { allTrue_ = Bool.False } else { {} } }; allTrue_ }");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("assigned `result`", result);
}
{
const result = try repl.step("count = { var counter_ = 0; for _ in [\"hello\", \"world\", \"test\"] { counter_ = counter_ + 1 }; counter_ }");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("assigned `count`", result);
}
{
const result = try repl.step("sum = { var total_ = 0; for n in [1, 2, 3, 4, 5] { total_ = total_ + n }; total_ }");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("assigned `sum`", result);
}
{
const result = try repl.step("product = { var result_ = 0; for i in [1, 2, 3] { for j in [10, 20] { result_ = result_ + (i * j) } }; result_ }");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("assigned `product`", result);
}
}
test "Repl - full list_sort_with snapshot pattern" {
// This mimics exactly what the snapshot validation does
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// All expressions from list_sort_with.md - collected first then freed
var outputs = std.array_list.Managed([]const u8).init(std.testing.allocator);
defer {
for (outputs.items) |item| {
std.testing.allocator.free(item);
}
outputs.deinit();
}
try outputs.append(try repl.step("List.len(List.sort_with([3, 1, 2], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.len(List.sort_with([5, 2, 8, 1, 9], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.len(List.sort_with([], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.len(List.sort_with([42], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.first(List.sort_with([3, 1, 2], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.first(List.sort_with([5, 2, 8, 1, 9], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.first(List.sort_with([5, 4, 3, 2, 1], |a, b| if a > b LT else if a < b GT else EQ))"));
try outputs.append(try repl.step("List.len(List.sort_with([1, 1, 1, 1], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.first(List.sort_with([1, 1, 1, 1], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.len(List.sort_with([2, 1], |a, b| if a < b LT else if a > b GT else EQ))"));
try outputs.append(try repl.step("List.first(List.sort_with([2, 1], |a, b| if a < b LT else if a > b GT else EQ))"));
try testing.expectEqualStrings("3", outputs.items[0]);
try testing.expectEqualStrings("5", outputs.items[1]);
}
test "Repl - full str_to_utf8 snapshot test" {
var test_env = TestEnv.init(std.testing.allocator);
defer test_env.deinit();
var repl = try Repl.init(std.testing.allocator, test_env.get_ops(), test_env.crashContextPtr());
defer repl.deinit();
// All expressions from str_to_utf8.md
{
const result = try repl.step("List.len(Str.to_utf8(\"\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("0", result);
}
{
const result = try repl.step("List.len(Str.to_utf8(\"hello\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("5", result);
}
{
const result = try repl.step("List.len(Str.to_utf8(\"é\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("2", result);
}
{
const result = try repl.step("List.len(Str.to_utf8(\"🎉\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("4", result);
}
{
const result = try repl.step("List.len(Str.to_utf8(\"Hello, World!\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("13", result);
}
{
const result = try repl.step("List.len(Str.to_utf8(\"日本語\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("9", result);
}
{
const result = try repl.step("List.len(Str.to_utf8(\"a é 🎉\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("9", result);
}
{
const result = try repl.step("Str.from_utf8_lossy(Str.to_utf8(\"hello\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("\"hello\"", result);
}
{
const result = try repl.step("Str.from_utf8_lossy(Str.to_utf8(\"\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("\"\"", result);
}
{
const result = try repl.step("Str.from_utf8_lossy(Str.to_utf8(\"🎉 party!\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("\"🎉 party!\"", result);
}
{
const result = try repl.step("Str.from_utf8_lossy(Str.to_utf8(\"abc123\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("\"abc123\"", result);
}
{
const result = try repl.step("List.is_empty(Str.to_utf8(\"\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("True", result);
}
{
const result = try repl.step("List.is_empty(Str.to_utf8(\"x\"))");
defer std.testing.allocator.free(result);
try testing.expectEqualStrings("False", result);
}
}

View file

@ -246,11 +246,14 @@ pub const Instantiator = struct {
}
fn instantiateTuple(self: *Self, tuple: Tuple) std.mem.Allocator.Error!Tuple {
const elems_slice = self.store.sliceVars(tuple.elems);
// Use index-based iteration to avoid iterator invalidation
// (see comment in instantiateFunc for details)
var fresh_elems = std.ArrayList(Var).empty;
defer fresh_elems.deinit(self.store.gpa);
for (elems_slice) |elem_var| {
const elems_start: usize = @intFromEnum(tuple.elems.start);
for (0..tuple.elems.count) |i| {
const elem_var = self.store.vars.items.items[elems_start + i];
const fresh_elem = try self.instantiateVar(elem_var);
try fresh_elems.append(self.store.gpa, fresh_elem);
}
@ -259,11 +262,17 @@ pub const Instantiator = struct {
return Tuple{ .elems = fresh_elems_range };
}
fn instantiateFunc(self: *Self, func: Func) std.mem.Allocator.Error!Func {
const args_slice = self.store.sliceVars(func.args);
// IMPORTANT: We must use index-based iteration here, not slice-based.
// The slice would point into the backing ArrayList, but instantiateVar
// can recursively call appendVars which may reallocate the array,
// invalidating the slice pointer.
var fresh_args = std.ArrayList(Var).empty;
defer fresh_args.deinit(self.store.gpa);
for (args_slice) |arg_var| {
const args_start: usize = @intFromEnum(func.args.start);
for (0..func.args.count) |i| {
// Re-fetch the var on each iteration since the backing array may have moved
const arg_var = self.store.vars.items.items[args_start + i];
const fresh_arg = try self.instantiateVar(arg_var);
try fresh_args.append(self.store.gpa, fresh_arg);
}
@ -325,8 +334,11 @@ pub const Instantiator = struct {
var fresh_args = std.ArrayList(Var).empty;
defer fresh_args.deinit(self.store.gpa);
const args_slice = self.store.sliceVars(tag_args);
for (args_slice) |arg_var| {
// Use index-based iteration to avoid iterator invalidation
// (see comment in instantiateFunc for details)
const args_start: usize = @intFromEnum(tag_args.start);
for (0..tag_args.count) |i| {
const arg_var = self.store.vars.items.items[args_start + i];
const fresh_arg = try self.instantiateVar(arg_var);
try fresh_args.append(self.store.gpa, fresh_arg);
}
@ -358,7 +370,11 @@ pub const Instantiator = struct {
var fresh_constraints = try std.ArrayList(StaticDispatchConstraint).initCapacity(self.store.gpa, constraints.len());
defer fresh_constraints.deinit(self.store.gpa);
for (self.store.sliceStaticDispatchConstraints(constraints)) |constraint| {
// Use index-based iteration to avoid iterator invalidation
// (see comment in instantiateFunc for details)
const constraints_start: usize = @intFromEnum(constraints.start);
for (0..constraints.len()) |i| {
const constraint = self.store.static_dispatch_constraints.items.items[constraints_start + i];
const fresh_constraint = try self.instantiateStaticDispatchConstraint(constraint);
try fresh_constraints.append(self.store.gpa, fresh_constraint);
}
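The hazard these comments describe — a slice pointing into a backing array that `instantiateVar` may reallocate out from under it — has a close analogue in Python's buffer protocol, where a `memoryview` pins a `bytearray` and any resize while the view is alive is refused with `BufferError`. Zig has no such guard (the old slice pointer simply dangles), which is exactly why the fix re-fetches elements by index. A minimal sketch of both halves:

```python
# Python analogue of the slice-invalidation hazard fixed in this diff.
# A memoryview pins a bytearray's buffer; resizing while it is alive raises
# BufferError. Zig silently leaves the old slice dangling instead, hence the
# index-based re-fetch of `self.store.vars.items.items[start + i]`.
buf = bytearray(b"abc")
view = memoryview(buf)  # like a []Var slice into the backing array

resize_refused = False
try:
    buf.append(ord("d"))  # would need a resize; Python refuses
except BufferError:
    resize_refused = True
assert resize_refused

view.release()

# Index-based iteration stays valid: re-fetch from the container each time,
# even while it grows mid-loop (the Zig fix does the same thing).
collected = []
for i in range(3):
    collected.append(buf[i])
    buf.append(buf[i])  # growth during iteration is now safe
print(bytes(collected))  # → b'abc'
```

The same pattern applies to `instantiateTuple`, `instantiateFunc`, and the tag-argument loop above: hold the start index and count, not a slice.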

View file

@ -210,6 +210,8 @@ pub const Store = struct {
pub fn setVarRedirect(self: *Self, target_var: Var, redirect_to: Var) Allocator.Error!void {
std.debug.assert(@intFromEnum(target_var) < self.len());
std.debug.assert(@intFromEnum(redirect_to) < self.len());
// Self-redirects cause infinite loops in resolveVar
std.debug.assert(target_var != redirect_to);
const slot_idx = Self.varToSlotIdx(target_var);
self.slots.set(slot_idx, .{ .redirect = redirect_to });
}
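The new assertion guards `resolveVar`'s redirect-following loop: a slot that redirects to itself can never reach a root, so the resolver would spin forever. A hypothetical Python model of that loop (our encoding: an int slot value is a redirect to another var, anything else is resolved content), with a step limit standing in for the hang:

```python
# Hypothetical model of resolveVar's redirect chain. An int slot value is a
# redirect to another var; anything else is a root. A self-redirect loops
# forever, which is what the new assert in setVarRedirect rules out at
# write time.
def resolve_var(slots, var, limit=10_000):
    steps = 0
    while isinstance(slots[var], int):
        var = slots[var]
        steps += 1
        if steps > limit:
            raise RuntimeError("redirect cycle (e.g. a self-redirect)")
    return var

slots = {0: 1, 1: "flex_var", 2: 2}  # slot 2 redirects to itself
print(resolve_var(slots, 0))  # → 1
```

Asserting at the single write site (`setVarRedirect`) is cheaper than cycle-checking on every resolve.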

View file

@ -0,0 +1,13 @@
# META
~~~ini
description=List.contains checks if element is in list
type=repl
~~~
# SOURCE
~~~roc
» List.contains([1, 2, 3, 4, 5], 3)
~~~
# OUTPUT
True
# PROBLEMS
NIL