Variables introduced in branch patterns should never be generalized in
the new weakening model. This implements that. The strategy is:
- when we have a let-binding that should be weakened, do not introduce
its bound variables in a new (higher) rank
- instead, introduce them at the current rank, and also solve the
let-binding at the current rank
- if any of those variables should then be generalized relative to the
current rank, they will be so when the current rank is popped and
generalized
Presently while generalizing type variables, we check variables
introduced at a scope for redundancy (whether they are not the root of
some unified set of variables). If a variable is redundant, its rank is
not adjusted. I believe the current logic to be the following:
- Each root of a unification tree will be introduced at some point,
exactly once. Its point of introduction will determine the rank of the
tree it's the root of
- If a variable is redundant, all of its redundant usages must be at the
same rank (assuming let generalization proceeds correctly),
so there is no need to adjust their rank as well
- As such, there is no need to adjust the rank of redundant variables,
as a performance optimization.
I believe this to be a hold-over from the original version of the solver
derived from the elm-compiler.
In our implementation however rank adjustment is very cheap (thanks to
SoA, ranks are likely in the cache lines already anyway because we just
adjusted variables at this point).
However, there is a larger problem here - ranks must be adjusted for
redundant variables as we begin to support weakened type variables.
The motivating case is
```
\x -> when x is
_x -> Green
```
we would like this code generalized as `* -> [Green]*`. `when`
expressions have each branch solved via let-bindings; in particular, for
the singleton branch we introduce `_x` of the appropriate type and solve
the body as `[Green]*`.
Today, `[Green]*` would be generalized in the context of the inner scope
that binds `_x`, which means it is generalized in the body `\x -> ...`
as a whole.
However, with weakening, we do not want this behavior! In particular, we
do not want to actually generalize `_x` in the context of the branch
body. Doing so means you could write things like
```
main = \{} -> when Red is
x ->
y : [Red]
y = x
z : [Red, Green]
z = x
{y, z}
```
which is exactly the kind of spurious generalization that the weakening
design is trying to avoid.
So, we want to introduce `[Green]*` at the rank of the body `\x -> ...`;
let's call this `rank_body`, and let's say `[Green]*` is introduced as
`branch_var`. Let's say the return type variable is `ret_var`.
Now we must be careful. If after unification `ret_var ~ branch_var` we have that
`branch_var` becomes the root, then despite `ret_var` (and `branch_var`) being at
`rank_body` (which is also the rank that will promoted to generalization),
the tree given by `branch_var` won't be generalized, because `ret_var` will be
seen as redundant! In fact it is, because `branch_var` was introdued
previously, but that doesn't matter - we want the variable to be
generalized at the level of the outer let-binding `main = \{} -> ...`.
This problem is not unique to when-branches; for example we can observe
the same symptom with
```
main = \{} ->
x = Green
x
```
where here we'd like `x` to not be generalized inside the body of
`main`, but have it be generalized relative to the body of `main` (that
is, main should have signature `{} -> [Green]*`, but you cannot use `x`
itself polymorphically inside the body of `main`).
As such, the easiest solution as far as I can see, in the presence of
weakening, is to allow rank-adjustment and generalization of redundant
variables if they are permitted to be generalized relative to a lower
scope.
This should preserve soundness; the main source of unsoundness in
rank-based let generalization is making sure something like
```
\x ->
y = \z -> x z
y
```
has type `(a -> b) -> (a -> b)` and not e.g. `(a -> b) -> (c -> d)` due
to `x` being instantiated at a higher rank in `y = ...` than it
actually is. Note that this change cannot affect this case at all, since
we are still doing the rank-adjustment pass at higher ranks, unifying
lowers ranked variables to their minimum relative rank, and introduction
only happens in the lower-ranked scopes.
We must be careful to ensure that if unifying nested lambda sets
results in disjoint lambdas, that the parent lambda sets are
ultimately treated disjointly as well.
Consider
```
v1: {} -[ foo ({} -[ bar Str ]-> {}) ]-> {}
~ v2: {} -[ foo ({} -[ bar U64 ]-> {}) ]-> {}
```
When considering unification of the nested sets
```
[ bar Str ]
~ [ bar U64 ]
```
we should not unify these sets, even disjointly, because that would
ultimately lead us to unifying
```
v1 ~ v2
=> {} -[ foo ({} -[ bar Str, bar U64 ]-> {}) ] -> {}
```
which is quite wrong - we do not have a lambda `foo` that captures
either `bar captures: Str` or `bar captures: U64`, we have two
different lambdas `foo` that capture different `bars`. The target
unification is
```
v1 ~ v2
=> {} -[ foo ({} -[ bar Str ]-> {}),
foo ({} -[ bar U64 ]-> {}) ] -> {}
```
Closes#4712
The current type inference scheme is such that we first introduce the
types for annotation functions, then check their bodies without
additional re-generalization. As part of generalization, we also perform
occurs checks to fix-up recursive tag unions.
However, type annotations can contain type inference variables that are
neither part of the generalization scheme, nor are re-generalized later
on, and in fact end up forming a closure of a recursive type. If we do
not catch and break such closures into recursive types, things go bad
soon after in later stages of the compiler.
To deal with this, re-introduce the values of recursive values after we
check their definitions, forcing an occurs check. This introduction is
benign because we already generalized appropriate type variables anyway.
Though, the introduction is somewhat unnecessary, and I have ideas on
how to make all of this simpler and more performant. That will come in
the future.
When constraining a recursive function like
```
f : _ -> {}
f : \_ -> f {}
```
our first step is to solve the value type of `f` relative to its
annotation. We have to be careful that the inference variable in the
signature of `f` is not generalized until after the body of `f` is
solved. Otherwise, we end up admitting polymorphic recursion.
If a specialization of an ability member has a lambda set that is not
reflected in the unspecialized lambda sets of the member's prototype
signature, then the specialization lambda set is deemed to be immaterial
to the specialization lambda set mapping, and we don't need to associate
it with a particular region from the prototype signature.
This can happen when an opaque contains functions that are some specific
than the generalized prototype signature; for example, when we are
defining a custom impl for an opaque with functions.
Addresses a bug found in 8c3158c3e0
With fixpoint-fixing, we don't want to re-unify type variables that were
just fixed, because doing so may change their shapes in ways that we
explicitly just set them up not to be changed (as fixpoint-fixing
clobbers type variable contents).
However, this restriction need only apply when we re-unify two type
variables that were both involved in the same fixpoint-fixing cycle. If
we have a type variable T that was involved in fixpoint-fixing, and we
unify it with U that wasn't, we know that the $U \notin \bar{T}$, where
$\bar{T}$ is the recursive closure of T. In these cases, we do want to
permit the usual in-band unification of $T \sim U$.
When we attempt to bind a type argument to an ability in an alias/opaque
instantiation, we create a fresh flex var to represent satisfaction of
the ability, and then unify the type argument with that flex var.
Previously, we did not register this fresh var in the appropriate rank
pool.
Usually this is not a problem; however, our generalization algorithm is
such that we skip adjusting the rank of redundant variables. Redundant
variables are those that are in the same unification tree, but are not
the root of the unification trees.
This means that if such a flex able var becomes the root of a
unification tree with the type argument, and the type argument is itself
generalized, we will have missed generalization of the argument.
The fix is simple - make sure to register the flex able var into the
appropriate rank pool.
Closes#4408
Prior to this commit, we emplace type variables into `Index<TypeTag>`
only for translated top-level types. However, we need to be careful to
do this emplacement for nested types as well! This patch ensures we do,
but immediately emplacing the destination variable of a type when we
allocate a variable for it.