When I initially converted from the "if-then-else" keyword syntax to the
colon-based one, I put a colon after "else" during development, then
later removed it again, but I was still unsure. Now, after having used
the syntax for some time, my feeling is that there should be a colon
after all. Nim got it right. So put it back.
Because it's easy to stay backwards compatible here, make the colon
optional. We can make it mandatory at some point in the future, but even
making the autoformatter put it there is probably a strong enough push.
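A minimal sketch of how an optional colon could be accepted, assuming a
flat slice of string tokens. The function name and the token
representation are made up for illustration; this is not the actual
parser code.

```rust
/// Consume the `else` keyword at `cursor`, plus a colon if one follows.
/// Returns the index of the first token of the else-body, or `None` if
/// there is no `else` at `cursor`. (Hypothetical helper, not RCL's API.)
pub fn parse_else_keyword(tokens: &[&str], cursor: usize) -> Option<usize> {
    if tokens.get(cursor) != Some(&"else") {
        return None;
    }
    let after_else = cursor + 1;
    if tokens.get(after_else) == Some(&":") {
        // New style: `else: body`.
        Some(after_else + 1)
    } else {
        // Old style stays valid: `else body`.
        Some(after_else)
    }
}
```

Both spellings parse to the same body position, so existing documents
keep working while the formatter can always print the colon.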
Now that full expressions are no longer allowed in some places, the
changelog mentions how to fix it, but we can actually put the help
straight into the code. And then it's helpful in all cases where you
try to use an expression and the keyword would be valid if you added
parentheses.
This change is similar to the one for conditionals. Like them, it stems
from the formatter printing a raw space before the collection, which is
problematic for non-code prefixes. Only here it did not surface as a non-
idempotency in the formatter because no non-code is allowed before the
collection. Still, I think if you want to write this:
    [for x in let y = [1, 2, 3]; y: y * 2]
Then instead you should write this:
    [for x in (let y = [1, 2, 3]; y): y * 2]
It's much clearer and it creates fewer problems with formatting.
The fuzzer discovered a non-idempotency in the formatter that occurs
when there is a non-code prefix for the condition. This has to do with
the space between "if" and the condition: there is no separator there,
so whatever follows goes on the same line, which is usually not the
case.
For evaluation everything works fine, but how do you format this? We can
try to repair it, but it's hard. A solution that sidesteps all this is
to restrict what kind of expressions we can have after an if. Just don't
allow statements and ifs there. We don't lose any expressivity: if you
want that, it still works, just put parens around it. With parens it is
also possible to format the expression properly, e.g.
    if (
      // Comments are fine, everything is indented here, etc.
      condition
    ):
      then_value
    else
      else_value
Oh, and unrelated, I think I am convinced that I want the colon after
"else" back. But let's do that in a follow-up. Or maybe it can be
optional but the formatter always puts it there?
The implementation of this forces propagating spans in more places,
which ended up being a drive-by fix for one place where spans were
computed incorrectly. This fix shows up in the golden tests.
Hmm, at this point it's not unambiguously more elegant. When parsing a
sequence of things that may have trailing non-code, it was nice to parse
the non-code and then look at the separator. Now we have to peek over it
instead. Alternatively, I could parse it, and have a way to pass it in
to parse_expr as "seed non-code", but that is also a bit clumsy. For now
the "peek past" will do.
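To illustrate the "peek past" approach, here is a small sketch over a
toy token type. The `Token` enum and `peek_past_non_code` are
hypothetical names for this example only; the real lexer and parser
look different.

```rust
/// Token kinds for a toy lexer; the names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Token {
    Blank,
    Comment,
    Comma,
    CloseBracket,
    Ident,
}

/// Return the first token at or after `cursor` that is not non-code
/// (blank lines and comments), without consuming anything.
pub fn peek_past_non_code(tokens: &[Token], cursor: usize) -> Option<Token> {
    tokens
        .iter()
        .skip(cursor)
        .copied()
        .find(|t| !matches!(t, Token::Blank | Token::Comment))
}
```

Because nothing is consumed, the non-code is still there for whichever
parse function ends up owning it as a prefix.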
Wow! My change to make a statement expr be the top-level one that
optionally includes a prefix was a great discovery! Everything becomes
much simpler now! No need to store those prefixes separately everywhere
any more! I should have done this much earlier. And all of the golden
tests still pass with this change, it's almost magical. Also a good
demonstration of how something that ends up looking simple may not be
simple to discover.
I suspect that a similar transformation is possible with what is
currently Prefixed<Seq>, I'll look into that next.
This changes the CST to keep a list of statements instead of making it
a degenerate tree that nests deeper for every statement. The primary
reason for doing this is to enable better pretty-printing by formatting
either the entire chain as wide or tall, and not breaking up a chain of
let bindings or assertions where some are on a line and some are not.
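A rough sketch of the "entire chain wide or tall" idea, using plain
strings instead of the real CST and document types. `format_chain` and
its parameters are assumptions for illustration, not the formatter's
actual interface.

```rust
/// Format a chain of statements followed by a body either entirely on
/// one line ("wide") or with every part on its own line ("tall").
/// There is deliberately no in-between layout.
pub fn format_chain(statements: &[&str], body: &str, max_width: usize) -> String {
    let mut parts: Vec<&str> = statements.to_vec();
    parts.push(body);
    let wide = parts.join(" ");
    if wide.chars().count() <= max_width {
        wide
    } else {
        parts.join("\n")
    }
}
```

With a flat list the width decision is made once for the whole chain;
with a nested tree every level would make its own decision.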
This is quite a deep change in one sense, but the code changes ended
up being smaller than I expected. It also ends up enabling non-code
prefixes in more places, so I think this is a good change in general.
I need to audit the parser and CST because I think there are now a few
places where a prefix is stored separately that would now be parsed into
an Expr::Statements node instead.
But what surprises me most is that after I got everything to compile, all
the golden tests still pass aside from a few reformattings, and the new
format looks universally better than the old one. Wow! I think I really
discovered the "right" way to implement this!
I want to implement Python/Black's magic trailing comma, so a first step
is to store whether it's there. This makes my elements/suffix tuple even
larger, so let's make that a proper struct. That is a bit of an invasive
change unfortunately.
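For illustration, such a struct might look roughly like the sketch
below. The field names, the `String` suffix type, and the `force_tall`
helper are assumptions; the actual CST types are not shown in this
message.

```rust
/// Hypothetical replacement for an `(elements, suffix)` tuple.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct List<T> {
    /// The elements of the collection, in source order.
    pub elements: Vec<T>,
    /// Non-code (comments, blank lines) after the last element.
    pub suffix: Vec<String>,
    /// Whether the source had a comma after the last element.
    pub trailing_comma: bool,
}

impl<T> List<T> {
    /// Black-style rule: a trailing comma in the source asks the
    /// formatter to keep the collection in tall (one per line) layout.
    pub fn force_tall(&self) -> bool {
        self.trailing_comma
    }
}
```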
We do this to enable formatting a chain of field accesses in its
entirety as wide or tall. In the formatter, individual field expressions
no longer create groups; there is one group for the whole chain.
For now we do keep the tree structure (that in practice degenerates to
a singly linked list) in the AST, because it would be an invasive change
to make that a list too, and I'm not sure that is worth it. (It would
help somewhat to avoid stack overflow, but it may complicate the
typechecker and evaluator.) So the abstractor translates the list back
into a tree.
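The list-to-tree translation is essentially a left fold. A minimal
sketch, with a made-up two-variant `Expr` rather than the real AST:

```rust
/// Toy AST with just enough variants for a field access chain.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Expr {
    Var(String),
    Field { inner: Box<Expr>, field: String },
}

/// Fold a flat chain `base.f1.f2...` back into the nested tree form,
/// where the last field access is the outermost node.
pub fn chain_to_tree(base: Expr, fields: &[&str]) -> Expr {
    fields.iter().fold(base, |inner, field| Expr::Field {
        inner: Box::new(inner),
        field: (*field).to_string(),
    })
}
```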
This finishes ("finishes") the refactor to move to the new expectation
system. Some things are still not implemented, but at least the
old-style checks are now completely replaced.
This also changes some of the spans that the errors get reported at, in
a way that I am now happy with. Errors in let bindings should be
blamed on the value span, not on the identifier.
I didn't want to go down the path of unifying all the elements of a list
type, because I fear it will be expensive when loading large json
documents with many big and deep dicts. Inferring those will lead to
huge types, and we can't even share the Rc instance if they are inferred
all the time.
BUT, let's not worry about performance right now. Setting that concern
aside, it is very tempting to just do the inference. And then a lattice
naturally falls out. Maybe in the end I end up removing some of this
again, but just having the lattice will probably lead to a better
design.
Then an open question: I have meet, which is enough for primitive types
and for ??variant types, (I never remember, is it covariant or
contravariant?), which List and Dict and Set are in RCL, because you can
only get stuff out, not put stuff in. Function return types also behave
like that. But then for function arguments I need join. And join of two
incompatible things is Void. Which makes sense in a statically typed
setting. If I do e.g.
    let fs = [
      // (Int) -> Int
      x => x + 1,
      // (Bool) -> Bool
      x => not x
    ]
Then in general an element of 'fs' is a function that I cannot call
because the input would need to be both Int and Bool. But at runtime, I
don't call a function "in general", I call a particular one, which can
be fine. How to type that? I think that if any of the arguments
collapsed to Void, we need to instead make it Dynamic, and check at
runtime. But that is the complete opposite of what the lattice does! It
feels inelegant and like a hack. So I'm not sure about that yet, but
let's see when we get to it, maybe I'm missing something obvious and
I'll have a better idea then.
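To make the problem concrete, here is a small sketch of meet and join
on a toy type enum. It follows the naming used above (join of
incompatible types is Void); that meet of incompatible types yields
Dynamic, and the shape of the `Type` enum itself, are assumptions for
this example and not the actual typechecker.

```rust
/// Toy type representation; variant names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Type {
    Void,
    Dynamic,
    Int,
    Bool,
    List(Box<Type>),
    Function { arg: Box<Type>, result: Box<Type> },
}

/// Combine the types of values you can only get out (elements, results).
pub fn meet(a: &Type, b: &Type) -> Type {
    match (a, b) {
        _ if a == b => a.clone(),
        (Type::Void, other) | (other, Type::Void) => other.clone(),
        (Type::List(x), Type::List(y)) => Type::List(Box::new(meet(x, y))),
        (
            Type::Function { arg: a1, result: r1 },
            Type::Function { arg: a2, result: r2 },
        ) => Type::Function {
            // Arguments go the other way: they need join.
            arg: Box::new(join(a1, a2)),
            result: Box::new(meet(r1, r2)),
        },
        _ => Type::Dynamic,
    }
}

/// Combine the types of values you put in (function arguments).
pub fn join(a: &Type, b: &Type) -> Type {
    match (a, b) {
        _ if a == b => a.clone(),
        (Type::Dynamic, other) | (other, Type::Dynamic) => other.clone(),
        (Type::List(x), Type::List(y)) => Type::List(Box::new(join(x, y))),
        (
            Type::Function { arg: a1, result: r1 },
            Type::Function { arg: a2, result: r2 },
        ) => Type::Function {
            arg: Box::new(meet(a1, a2)),
            result: Box::new(join(r1, r2)),
        },
        _ => Type::Void,
    }
}
```

In this sketch, the element type of 'fs' comes out as a function whose
argument is Void, which is exactly the uncallable function described
above.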
Saying that a Debug impl is not covered is counterproductive. It is not
expected to be covered when code is correct. So exclude these regions,
because a drop from 100% coverage to <100% stands out a lot more than a
drop from 97.8% to 97.2% when I do add one line that is not covered but
might be an important edge case.
There were some cases where non-code before a closing delimiter could be
eaten, because we consumed it before discovering the closing delimiter.
This happens in the real world, mostly when you have a collection and
you comment out the last element. To fix this, put the non-code in the
CST and emit it when formatting.
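A simplified sketch of the fix, using a line-based toy parser instead
of the real token-based one. All names here (`Element`, `ListCst`,
`parse_items`, `format_list`) are hypothetical. The point is only that
comments left over when the closing delimiter is reached go into a
`suffix` field instead of being dropped.

```rust
/// One element with the comments that precede it.
#[derive(Debug, PartialEq, Eq)]
pub struct Element {
    pub prefix: Vec<String>,
    pub value: String,
}

/// Toy CST node for a collection.
#[derive(Debug, PartialEq, Eq)]
pub struct ListCst {
    pub elements: Vec<Element>,
    /// Comments between the last element and the closing delimiter.
    pub suffix: Vec<String>,
}

/// Parse the lines between `[` and `]`; each line is either a `//`
/// comment or a single element with an optional trailing comma.
pub fn parse_items(lines: &[&str]) -> ListCst {
    let mut elements = Vec::new();
    let mut pending = Vec::new();
    for line in lines {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        if line.starts_with("//") {
            pending.push(line.to_string());
        } else {
            elements.push(Element {
                prefix: std::mem::take(&mut pending),
                value: line.trim_end_matches(',').to_string(),
            });
        }
    }
    // Whatever is still pending came before the closing delimiter.
    ListCst { elements, suffix: pending }
}

/// Print the collection in tall layout, including the suffix comments.
pub fn format_list(list: &ListCst) -> String {
    let mut out = String::from("[\n");
    for element in &list.elements {
        for comment in &element.prefix {
            out.push_str(&format!("  {}\n", comment));
        }
        out.push_str(&format!("  {},\n", element.value));
    }
    for comment in &list.suffix {
        out.push_str(&format!("  {}\n", comment));
    }
    out.push(']');
    out
}
```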
I had a prefixed type at the top-level let previously, but it leads to
a bad case that breaks idempotency in the formatter (see parent commit).
It is possible to remediate this while preserving prefixes, but that
would add complexity, and why would you even want to have comments
between the : and the type? I allowed the prefixes in type lists, in
generic instantiations and in function types, because there you might
want to document the types, e.g.
    let f: (
      // The number of widgets.
      Int,
      // The maximum widget serial number.
      Int,
    ) -> Int = (n, m) => ...
But between the : and the type ... then you should just put the comment
above the let. So we simplify the CST, and fix a bug!
After having used RCL in practice for a few months, I think I'm leaning
towards this. It resolves an awkward way of formatting the if-then-else
multi-line, and it's more consistent with 'if' inside a comprehension.
It also makes the syntax resemble Python more for the multi-line case.
What was holding me back previously is that I think the colons on a
single line look kind of awkward, e.g.
    let x = if cond: true-val else: false-val;
But I expect I will get used to it. What pushed me over the edge was
that Nim uses this syntax. Though Nim is maybe not the best
justification to cite here, from my very limited experience using it,
it looks like a kitchen sink of syntax, which is a valid point in the
design space (Perl and Raku have fans too after all) but the opposite
of what I want RCL to be.
But I do think this change brings more consistency and regularity. RCL
is more colon/delimiter based, and less keyword-based. No 'begin end'
but '[]', and no 'then' but ':' fits with that.
It makes things difficult to format. Instead, you should place the
comment before the lambda. Or put parens around the lambda. It's a bit
unfortunate, but let's go with it for now.
I'm not so happy with the way the hanging works, but to get rid of that,
I think I will have to remove the ability to add comments on the body.
Or maybe change the way prefixed_expr gets converted into a doc.
User-facing, I think it is a bit friendlier to call them "function" in
error messages and such. And if I do that, I should be consistent and
call them functions everywhere.
This is hairy, I need to keep the environment alive, and also keep the
AST alive. Now the AST bleeds into runtime values, and stuff like Env
now needs to be Ord for values to be Ord, etc.
This was a bit of an adventure, to get the grammar right. See also a
discussion of the ideas in ideas/lambdas.rcl added in this commit. For
now I will plow through with the syntax that I like best, let's see if
it works out, we can always change it later.
Previously, with expr_import at the same level as statement-like
expressions, it was not part of expr_op, and inside a sequence, only
expr_op is used, which meant that you couldn't put imports in a
sequence, at least not without parentheses. By pushing expr_import more
inwards to be part of expr_op, it is now allowed inside sequences.
Also add a test for this.
I don't think this is really an improvement, but I don't have a very
strong opinion on it and silencing Clippy is as ugly, so let's just do
it.
Also, where I previously had to silence Clippy (for the same lint), I
did end up addressing that, and the silences are no longer needed.
When the file could not be loaded, I would like to highlight the
argument of the call that is the path. But to be able to do that, we
need to thread the spans through everywhere.
Why reasonable?
* Humans can easily reason about what a given expression will evaluate
  to (unlike YAML).
* It's a retroactive change (or addition) to the name "RCL": Reasonable
  Configuration Language.
* Calling RCL sane is subjective anyway, and it implies that other
  options may be insane. Although if I change it to "reasonable",
  maybe I am calling others unreasonable.
I am still ambivalent about this. On the one hand I have a strong sense
that "key = value;" is a statement that needs a terminator. On the other
hand, it makes things more uniform to have only a single separator.
I think I just need to get used to the comma, and then I will not mind
so much. But let's see.