This is a bit.. ugly, or at least seems suboptimal, but I can't think of
a better way to do it currently aside from demanding a uniform
representation, which we probably don't want to do.
Another option is something like the defunctionalization we perform
today, except also capturing potential uses of nested functions in the
closure tag of an encompassing lambda. So for example,
```
f = \x -> \y -> 1
```
would now record a lambdaset with the data `[Test.f
[TypeOfInnerClos1]]`, where `TypeOfInnerClos1` is e.g.
`[Test.f.innerClos1 I8, Test.f.innerClos1 I16]`, symbolizing that the
inner closure may be specialized to take an I8 or I16. Then at the time
that we create the capture set for `f`, we create a tag noting what
specialization should be used for the inner closure, and apply the
current defunctionalization algorithm. So effectively, the type of the
inner closure becomes a capture.
I'm not sure if this is any better, or if it has more problems.
@folkertdev any thoughts?
Closes#2322
This code has a shadowing error:
```
b = False
f = \b -> b
f b
```
but prior to this commit, the compiler would hit an internal error
during monomorphization and not even get to report the error. The reason
was that when we entered the closure `\b -> b`, we would try to
introduce the identifier `b` to the scope, see that it shadows an
existing identifier, and not insert the identifier. But this meant that
when checking the body of `\b -> b`, we would think that we captured the
value `b` in the outer scope, but that's incorrect!
The present patch fixes the issue by generating new symbols for
shadowing identifiers, so deeper scopes pick up the correct reference.
This also means in the future we may be able to compile and execute code
with shadows, even though it will still be an error.
Closes#2343
This commit fixes a long-standing bug wherein bindings to polymorphic,
non-function expressions would be lowered at binding site, rather than
being specialized at the call site.
Concretely, consider the program
```
main =
n = 1
idU8 : U8 -> U8
idU8 = \m -> m
idU8 n
```
Prior to this commit, we would lower `n = 1` as part of the IR, and the
`n` at the call site `idU8 n` would reference the lowered definition.
However, at the definition site, `1` has the polymorphic type `Num *` -
it is not until the the call site that we are able to refine the type
bound by `n`, but at that point it's too late. Since the default layout
for `Num *` is a signed 64-bit int, we would generate IR like
```
procedure main():
let App.n : Builtin(Int(I64)) = 1i64;
...
let App.5 : Builtin(Int(U8)) = CallByName Add.idU8 App.n;
ret App.5;
```
But we know `idU8` expects a `u8`; giving it an `i64` is nonsense.
Indeed this would trigger LLVM miscompilations later on.
To remedy this, we now keep a sidecar table that maps symbols to the
polymorphic expression they reference, when they do so. We then
specialize references to symbols on the fly at usage sites, similar to
how we specialize function usages.
Looking at our example, the definition `n = 1` is now never lowered to
the IR directly. We only generate code for `1` at each place `n` is
referenced. As a larger example, you can imagine that
```
main =
n = 1
asU8 : U8 -> U8
asU32 : U32 -> U8
asU8 n + asU32 n
```
is lowered to the moral equivalent of
```
main =
asU8 : U8 -> U8
asU32 : U32 -> U8
asU8 1 + asU32 1
```
Moreover, transient usages of polymorphic expressions are lowered
successfully with this approach. See for example the
`monomorphized_tag_with_polymorphic_arg_and_monomorphic_arg` test in
this commit, which checks that
```
main =
mono : U8
mono = 15
poly = A
wrap = Wrapped poly mono
useWrap1 : [Wrapped [A] U8, Other] -> U8
useWrap1 =
\w -> when w is
Wrapped A n -> n
Other -> 0
useWrap2 : [Wrapped [A, B] U8] -> U8
useWrap2 =
\w -> when w is
Wrapped A n -> n
Wrapped B _ -> 0
useWrap1 wrap * useWrap2 wrap
```
has proper code generated for it, in the presence of the polymorphic
`wrap` which references the polymorphic `poly`.
https://github.com/rtfeldman/roc/pull/2347 had a different approach to
this - polymorphic expressions would be converted to (possibly capturing) thunks.
This has the benefit of reducing code size if there are many polymorphic
usages, but may make the generated code slower and makes integration
with the existing IR implementation harder. In practice I think the
average number of polymorphic usages of an expression will be very
small.
Closes https://github.com/rtfeldman/roc/issues/2336
Closes https://github.com/rtfeldman/roc/issues/2254
Closes https://github.com/rtfeldman/roc/issues/2344
Previously we would expand optional record fields to assignments when
converting record patterns to "when" expressions. This resulted in
incorrect code being generated.
Figuring out what this module was doing, and why, took me a bit less
than half an hour. We should document what's happening for others in the
future so they don't need to follow up on Zulip necessarily.
This work is related to restricting tag union sizes in input positions.
As an example, for something like
```
\x -> when x is
A M -> X
A N -> X
A _ -> X
```
we'd like to infer `[A [M, N]* ]` rather than the `[A, [M, N]* ]*` we
infer today. Notice the difference is that the former type tells us we
only accepts `A`s, but the argument of the `A` can be `M`, `N` or
anything else (hence the `_`).
So what's the idea? It's an encoding of the "must have"/"might have"
design discussed in https://github.com/rtfeldman/roc/issues/1758. Let's
take our example above and walk through unification of each branch.
Suppose `x` starts off as a flex var `t`.
```
\x -> when x is
A M -> X
```
Now we introduce a new kind of constraint called a "presence"
constraint. It says "t has at least [A [M]]". I'll notate this as `t +=
[A [M]]`. When `t` is free as it is here, this is equivalent to `t ~
[A [M]]`.
```
\x -> when x is
...
A N -> X
```
At this branch we introduce the presence constraint `[A [M]] += [A [N]]`.
Notice that there's two tag unions we care about resolving here - one is
the toplevel one that says "I have an `A ...` inside of me", and the
other one is the tag union that's the tyarg to `A`. They are distinct
and at different depths.
For the toplevel one, we first figure out if the number of tags in the
union needs to expand. It does not - we're hoping to resolve the type
`[A [M, N]]`, which only has `A` in the toplevel union. So, we don't
need to do anything extra there, other than the merge the nested tag
unions.
We recurse on the shared tags, and now we have the presence constraint
`[M] += [N]`. At this point it's important to remember that the left and
right hand types are backed by type variables, so this is really
something like `t11 [M] += t12 [N]`, where `[M]` and `[N]` are just what
we know the variables `t11` and `t12` to be at this moment. So how do we
solve for `t11 [M, N]` from here? Well, we can encode this constraint as
a type variable definition and a unification constraint we already know
how to solve:
```
New definition: t11 [M]a (a fresh)
New constraint: a ~ t12 [N]
```
That's it; upon unification, `t11 [M, N]` falls out.
Okay, last step.
```
\x -> when x is
...
A _ -> X
```
We now have `[A [M, N]] += [A a]`, where `a` is a fresh unbound
variable. Again nothing has to happen on the toplevel. We walk down and
find `t11 [M, N] += t21 a`. This is actually called an "open constraint"; we
differentiate it at the time we generate constraints because it follows
syntactically from the presence of an `_`, but it's semantically
equivalent to the presence constraint `t11 [M, N] += t21 a`. It's just
called opening because literally the only way `t11 [M, N] += t21 a` can
be true is if we set `t11 a`. Well, actually, we assume `a` is a tag
union, so we just make `t11` the open tag union `[M, N]a`. Since `a` is
unbound, this eventually becomes a wildcard and hence falls out `[M, N]*`.
Also, once we open a tag union with an open constraint, we never close
it again.
That's it. The rest falls out recursively. This gives us a really easy
way to encode these ordering constraints in the unification-based system
we have today with minimal additional intervention. We do have to patch
variables in-place sometimes, and the additive nature of these
constraints feels about out-of-place relative to unification, but it
seems to work well.
Resolves#1758
This was referenced in the `List` documentation and in the
[tutorial](./TUTORIAL.md), but wasn't actually implemented prior to this
commit!
Part of #2227
By emitting a runtime error rather than panicking when we can't layout
a record, we help programs like
```
main =
get = \{a} -> a
get {b: "hello world"}
```
execute as
```
Mismatch in compiler/unify/src/unify.rs Line 1071 Column 13
Trying to unify two flat types that are incompatible: EmptyRecord ~ { 'a' : Demanded(122), }<130>
🔨 Rebuilding host...
── TYPE MISMATCH ───────────────────────────────────────────────────────────────
The 1st argument to get is not what I expect:
8│ get {b: "hello world"}
^^^^^^^^^^^^^^^^^^
This argument is a record of type:
{ b : Str }
But get needs the 1st argument to be:
{ a : a }b
Tip: Seems like a record field typo. Maybe a should be b?
Tip: Can more type annotations be added? Type annotations always help
me give more specific messages, and I think they could help a lot in
this case
────────────────────────────────────────────────────────────────────────────────
'+fast-variable-shuffle' is not a recognized feature for this target (ignoring feature)
'+fast-variable-shuffle' is not a recognized feature for this target (ignoring feature)
Done!
Application crashed with message
Can't create record with improper layout
Shutting down
```
rather than the hanging
```
Mismatch in compiler/unify/src/unify.rs Line 1071 Column 13
Trying to unify two flat types that are incompatible: EmptyRecord ~ { 'a' : Demanded(122), }<130>
thread '<unnamed>' panicked at 'invalid layout from var: UnresolvedTypeVar(104)', compiler/mono/s
rc/layout.rs:1510:52
```
that was previously produced.
Part of #2227
+ Evidently I failed to finish fixing merge conflicts
+ Some of the types that the SingleQuote code mentioned didn't exist according to the build step. I looked around and switched them out for types it LOOKED like they were supposed to be, but someone should probably check this
+ Don't make commits like this; it's multiple unrelated changes thrown together. I'm still figuring out my way around here