Commit graph

1681 commits

Author SHA1 Message Date
Ruud van Asseldonk
3e23f70b09 Record new tree-sitter-rcl v0.12.0 ref 2025-12-17 21:26:59 +01:00
Ruud van Asseldonk
c7267505c3 Bump version to 0.12.0 2025-12-17 21:01:13 +01:00
Ruud van Asseldonk
647affd25e Rephrase support section on website
It feels like GitHub's decline is in full motion, let's de-emphasize it
a bit and put other forges on an equal footing.
2025-12-17 20:40:16 +01:00
Ruud van Asseldonk
72b04ea691 Replace website tagline
One commenter on Lobste.rs pointed out that the "make X Y again" can
have political connotations. That was not the intent, and I don't want
to scare people away with that.

I brainstormed several options, in the end this one captures the same
spirit ("boring", it should not create problems or cause surprises, it
should not feel like solving puzzles, building things is exciting,
configuring them should be boring), and even makes the goal a bit
clearer by hinting at the remove-duplication part.
2025-12-17 20:33:05 +01:00
Ruud van Asseldonk
2a88d9aed9 Write changelog entry for Set.map/flat_map changes 2025-12-17 19:28:26 +01:00
Ruud van Asseldonk
258f0e2384 Rename Set.{map,flat_map}, add _dedup suffix
The map function in particular is a footgun that hit me twice in two
weeks, in a case similar to this one:

    {"apple", "orange", "banana"}.map(x => x.len()).sum()

This looks like it sums the lengths of the strings. Except it doesn't,
it sums the *unique* lengths of the strings, so we get 11 instead of 17.
That's surprising, and I don't want that kind of surprise in RCL.

At first I thought to remove Set.map and Set.flat_map entirely. You
would just have to convert to a list first and map there, and if you
wanted the set, convert back with List.to_set_dedup, which makes it
explicit. That's why I added these conversion functions. Then I
implemented deletion ... and as I was writing the PR description and
changelog entry for how to handle the break, it dawned on me, if we
have the _dedup anyway in to_set_dedup, we might as well use it here.
We don't have to delete these methods, we can just call them _dedup.

    {"apple", "orange", "banana"}.map_dedup(x => x.len()).sum()

Now the snippet is no longer so sneaky! It's clear that there is
deduplication going on! So I'm fine keeping them in this form. At least
in the Nix transitive closure test, the map_dedup behavior is what I
want, which validates demand for at least map_dedup. But even then,
a changelog entry for a breaking change that says "renamed to X" is a
much better experience than "have been deleted here are some much longer
alternatives".

Also, this change is relatively safe to make: it will break existing
callers, but they will get an error message and there is a clear path
to fixing this. It doesn't silently change the output of something: it's
being called on sets, so there can't even be fields named 'map' or
'flat_map' because sets don't have fields. So I feel comfortable making
a small breaking change here.
2025-12-17 19:28:26 +01:00
Ruud van Asseldonk
ca519a4ff8 Note that Set.to_list is identical to Set.sort
I may have forgotten about Set.sort when I added Set.to_list ... It
feels bit weird to have two identical methods. On the other hand, they
convey very different intent, and using .sort() for places where you
e.g. only want to convert to be able to use a non-deduplicating map, but
where you don't care about order, .to_list() feels like it matches that
intent better than .sort(). So I'll keep both for now, but we can at
least acknowledge it, in case people wonder.
2025-12-17 19:25:56 +01:00
Ruud van Asseldonk
48f276232f Merge methods for conversion between List and Set
I want to delete Set.map and Set.flat_map, because they are footguns.
To provide at least some alternative useful in method chaining (which is
useful in ad-hoc queries on the command line, where it is inconvenient
to move the cursor back to add an opening paren for a comperhension),
and because they are useful in general, add some conversion methods.

Naming things for the conversion from List to Set went through many
iterations (see 0fca8c5056 for details), and I'm still not sure
this is the right way to go about it. Maybe we should have `List.dedup:
(self: List[T]) -> List[T]` instead, and call the one that errors on
duplicates simply `List.to_set: (self: List[T]) -> Set[T]`, so you
can do `.dedup().to_set()` to explicitly drop duplicates, if that's
the intent? But `to_set` does not look at the call site like it would
validate anything, so it's no good for readability in *that* sense …

Anyway, nothing is set in stone, we can just rename things later! Just
like I can delete Set.map if it turns out to be a mistake!
2025-12-14 20:23:15 +01:00
Ruud van Asseldonk
47d4d64f91 Add remark about sortedness of Set.to_list
It is sorted now, it will probably remain like that for a while, but I'm
still playing with the idea of making dicts preserve insertion order,
and if I do that, then I would probably also make sets preserve order,
and the sorting property will go away.

If a curious property like this is not documented, people start
wondering, is it *always* sorted? And the answer is yes -- at least for
now. So probably better to document that. I have no illusions that a
note like this will prevent people from depending on it, but that's
fine, if you update you do that consciously and deliberately, I hope.
2025-12-14 20:03:12 +01:00
Ruud van Asseldonk
4be227797a Add goldens to cover new List and Set methods 2025-12-14 19:50:20 +01:00
Ruud van Asseldonk
0fca8c5056 Add List.{to_set_dedup,to_set_unique} methods
Naming these was a long bikeshed. Some considerations:

 * Symmetry with Set.to_list and String.to_uppercase, keep the to_.

 * Naming after an action is clearer than naming after a property,
   because for a property it's not clear whether that is a precondition
   rather than a postcondition. xs.dedup() makes it clear it removes
   duplicates, xs.unique(), does it assert it's unique? Does it test it?

 * In light of the above observation, List.sort is a good name, better
   than List.sorted.

 * The obvious one people will try is probably to_set. If it silently
   drops duplicates, that's an unsafe default. So let's make this the
   one that validates, and we can have to_set_dedup that drops
   duplicates.

 * Erroring on duplicates is always safe. We can even point the users to
   the version that drops duplicates.

 * If you read code and see xs.to_set() without any contrasting
   .to_set_dedup(), it's not so obvious that .to_set() does any
   validation, and if that is your intent, then you have to clarify
   that in a comment.

 * xs.to_set_assert_unique in that sense is clearer, but it's too
   verbose in practice. xs.to_set_nodup is a bit obscure, especially if
   you don't see the contrasting .to_set_dedup().

 * A method List.dedup that returns a list, and having to_set being the
   one that errors, also works, then you can do .dedup().to_set(), which
   is very explicit. It still does not clarify at the call site that
   to_set is validating though. It is also inefficient, but that's
   probably not a problem for RCL, and you can always use a
   comprehension if it matters.

So conclusion for now, to_set is the right prefix for symmetry and
discoverability. There is not going to be a "default" to set, you have
to explicitly choose whether you want to discard duplicates or not. The
to_set_dedup is clear for that. When reading code, to_set_unique is
maybe less obvious, but it gives more hint than to_set, and if you think
for a moment, set values are already unique, so why the _unique, and
then you find the contrast with _dedup. So far this seems like one of
the better directions.
2025-12-14 19:31:41 +01:00
Ruud van Asseldonk
8a45379cbf Add Set.to_list method
I plan to remove Set.map and Set.flat_map, and to make that slightly
more palatable, let's add Set.to_list, so you can at least replace all
the former calls with .to_list().map(...).
2025-12-14 19:31:41 +01:00
Ruud van Asseldonk
711e3191d5 Improve unpack and comprehension documentation 2025-12-14 19:31:20 +01:00
Ruud van Asseldonk
2750db3188 Allow comments after '=>' in functions
This is another papercut I ran into while working on Advent of Code
exercises. If I recall correctly, previously comments were not allowed
there because there was no good place in the AST to put them, but that
has since been fixed, with parse_expr handling prefixes, so simply
removing the part that skips over noncode does the trick now!
2025-12-13 16:00:04 +01:00
Ruud van Asseldonk
4cfab711e5 Fix formatting of collections that contain unpacks
The formatter used to add a space at the end. That was an oversight when
I added the unpack feature. Fortunately it's an easy fix.

I added the golden test first to confirm it would have caught the bug.
Also, it turns out there was a different fmt golden tests that already
encoded the bad space. I missed it in review.
2025-12-13 15:35:54 +01:00
Ruud van Asseldonk
74ea91ef5d Improve handling of unops inside binops
I did a few of the Advent of Code exercises in RCL, and one papercut was
having to use parentheses in `if x >= (-1)`. On top of that, the error
message was not intuitive ("Expected expression"), so if you omitted
them, it wouldn't be immediate clear what is wrong. The grammar was
like this with the intent to avoid confusion about operator precedence.
For example:

    // Is this 'not (a and b)' or '(not a) and b'?
    not a and b

There is only confusion though, when a binary operator follows a unary
one. We already had a great error message for this, but only for unary
operators at the top level of an expression, not when they occurred
after a binary operator, e.g. in this case:

    a and not b or c

It turns out we can allow both trailing unops inside binops (the `x >=
-1` case), while also improving the error message for an unop in the
middle, which is what this change implements.
2025-12-13 15:19:49 +01:00
Ruud van Asseldonk
08a7299367 Improve handling of binop inside unop
It turns out, we can:

 * Allow more cases, including the common case of e.g. 'x > -1', without
   creating ambiguities.
 * Still make ambiguities such as 'a and not b or c' an error.
 * Improve the error message for that case, over my previous one!

And this is a tiny change to the grammar and parser, yay! It feels like
one of those times where I discover the *right* way to do it. Discover,
in the platonic sense.
2025-12-13 15:16:40 +01:00
Ruud van Asseldonk
e84edabafa Add custom error message for unop inside binop
This situation can be confusing without parens, that's why it's not
allowed, even though it makes cases like

    if x < -10: ...

invalid and we have to write (-10). But at least we can tell users to
use parens.
2025-12-13 12:53:42 +01:00
Ruud van Asseldonk
23645ff9e4 Limit nesting depth more aggressively in fuzz mode
The fuzzer has discovered the concept of "exponential growth" and now
calls Set.transitive_closure with a function that at each iteration
creates a dict with *four* closures that each refer to the previous'
iteration's value. This feels like whack-a-mole, I can't make the depth
too small either. But let's see if maybe there is a sweet spot that is
acceptable.
2025-12-13 12:49:39 +01:00
Ruud van Asseldonk
20136b7f24 Prevent hangs from deeply nested values in recursion
This is a hang discovered by the fuzzer. In the initial form, it
discovered it from a closure (where the environment capture made it
grow), but the same applies to lists. Putting those deeply nested values
in BTreeSets triggers some pathological worst-case comparison where
every comparison has to descend all the way.

This applies to all functions that can recursively build values. The
fuzzer discovered it for transitive_closure but in fact we can trigger
similar behavior with fold (which has never been found by the fuzzer!).
So guard that one as well.

You can still manually do recursion, but there is little we can do about
that, RCL is not really meant for it so serves you right if you trigger
it, though of course at some point the fuzzer is going to learn about
this and then we will have to mitigate it ...
2025-12-12 23:00:32 +01:00
Ruud van Asseldonk
ba3a3e54f0 Merge addition of 'Set.transitive_closure' method
This adds Set.transitive_closure, which is useful for e.g. finding all
store paths in 'nix flake archive --json' output, but it also comes in
very handy in Advent of Code problems!
2025-12-09 22:43:09 +01:00
Ruud van Asseldonk
d9a1eaa77c Add transitive_closure builtin to dictionaraies 2025-12-09 22:30:22 +01:00
Ruud van Asseldonk
01ae86aaeb Document Set.transitive_closure 2025-12-09 22:29:34 +01:00
Ruud van Asseldonk
854ee51bad Add goldens for Set.transitive_closure 2025-12-09 21:45:49 +01:00
Ruud van Asseldonk
c6ff12d055 Add Set.transitive_closure method
This is useful in at least one case I ran into in practice when
inspecting Nix flake json output, where flakes contain other flakes and
I want their transitive closure.

It is also usefel for code golf for Advent of Code :)
2025-12-09 21:29:25 +01:00
Ruud van Asseldonk
67e718b999 Ban the use of LLMs in the issue tracker
I already had a line in place to discourage LLM-generated *code*, but
recently I see things popping up where accounts are posting what looks
like LLM-generated messages in pull requests and issue trackers.

I saw a comment pop up now, I seriously wonder, was this a human who
copy-pasted ChatGPT output, or do people give Claude Code access to 'gh'
with a valid token, and do they let agents autonomously post comments
upstream on their behalf?

Whatever it was, it's a waste of everybody's time, and I want nothing to
do with this in my repositories.
2025-12-03 20:01:17 +01:00
Ruud van Asseldonk
4d4d14e439 Accept and silence some Clippy advice
I previously only ran 'cargo clippy', without '--workspace'. Now that
I included that, some more advice popped up. Most of them are false
positives, but I accepted one.
2025-11-23 20:45:14 +01:00
Ruud van Asseldonk
5a27965dbc Use new unpack syntax on homepage example 2025-11-23 20:45:05 +01:00
Ruud van Asseldonk
31db3a01d3 Record new tree-sitter-rcl v0.11.0 ref 2025-11-23 20:27:21 +01:00
Ruud van Asseldonk
44d43a64dc Bump version to 0.11.0 2025-11-23 19:26:00 +01:00
Ruud van Asseldonk
287bde6f59 Update project status to reflect reality
It *is* pre-1.0 and it *does* occasionally have backwards-incompatible
changes, but they are well documented with clear upgrade paths, and not
happening at a breakneck pace. For example, with the introduction of
unpack, the |-operator is deprecated, but it still works. Or take the
assertion syntax: it changed, but the old syntax is still accepted, and
the autoformatter automatically upgrades documents. So I think that the
previous phrase was a bit too conservative. Other "mature" projects
cause me breakage headaches far more often than RCL. The readme should
reflect that.
2025-11-23 18:50:56 +01:00
Ruud van Asseldonk
a81e470281 Move development guid to docs/development.md
I want to make the readme more aimed at people interested in the tool
and project in general, not necessarily at developers hacking on RCL
itself. Let's move that to a separate manual section, since there is
already a development section in there.
2025-11-23 18:14:18 +01:00
Ruud van Asseldonk
69f1a990f0 Use unpack syntax to simplify uniqueness example 2025-11-23 12:17:07 +01:00
Ruud van Asseldonk
b255b17db9 Merge support for unpack
Problems
========

 * Combining collections through comprehensions is too verbose.
   I don’t want to write {for x in xs: x, for y in ys: y}.

 * Pipe (|) for set and dict union is very ad-hoc in the typechecker,
   and binary operators are awkward to format.

Solution
========

We can solve both problems by adding unpack: the ability to splice
a collection into another one. Like * and ** in Python, and ... in
Javascript.

 * It’s a short way to splice in collections (of which concat is a
   special case).

 * Dict and set union are now completely regular and easily formattable.

 * Arguably, the meaning of .. and ... is more self-evident than the
   meaning of |, so it makes code more readable to newcomers.

Design space and alternatives not taken
=======================================

I would prefer to have just .. for unpack, and it would unpack both
single elements (scalars) from lists and sets, as well as key-value
pairs from dicts. However, that means that you can no longer
syntactically tell whether {..xs} is a dict or a set, which previously
was possible (except for the empty {}, which we defined to be a dict,
like Python). There are several ways to deal with this:

Distinct unpack syntax

    Distinguish single-element and key-value unpack syntactically, e.g.
    .. for scalar and ... for key-value. This is what Python does too,
    with * vs. **.

Distinct set literal syntax

    Use a distinct syntax for sets. { ... } would be reserved for dicts,
    sets would use e.g. {| ... |}, set{ ... }, {{ }}, etc. It would
    simplify many things, right now it’s kind of messy that we don't
    know what a {}-literal is until we peek inside.

Delete sets entirely

    As with using distinct syntax, this removes many complications,
    including much of the near duplication between lists and sets. Who
    needs sets anyway? We can just use lists instead, and add a few
    convenience methods to make things unique. We can even distinguish
    between “drop duplicates” and “assert that it’s already unique”,
    which a set in RCL currently can’t do, constructing one can only
    drop duplicates. Though nothing prevents adding a method on list
    that asserts elements are unique and returns a set. Sets are a bit
    like unsigned integers, they add some value by making invalid states
    unrepresentable, they allow slightly more precise types, but we can
    survive without them, and in practice not much is lost.

    I did implement this option. It did simplify things a lot. It felt
    like a shame to delete so much code, including some rather neat
    parts, but that’s just sunk cost fallacy. But in the end, I think
    that there is value in having sets. Many things are naturally sets
    rather than lists. No other configuration language offers sets,
    which gives RCL a unique advantage in this regard. Or maybe there is
    a good reason none of the others have sets. Though Python does, and
    it also reuses the {}-syntax for dicts and sets.

    It also simplifies the type system tremendously, because now we
    no longer have two collection types that have an inner element type
    (list and set), and it sidesteps the mess around kind of needing a
    Collection[T] supertype, but then how to deal with Collection[T]
    vs. Union[List[T], Set[T]], how to document it and explain when it
    occurs in type errors, etc.

    So I tried it, but it felt too harsh, I want to try and make unpack
    work without deleting sets. This dead end is not part of the history
    included with this merge commit.

Give up on static typechecking

    Maybe it’s fine if we don’t know whether {..xs} is a dict or a
    set? That minimal example is pathological, but union is a common
    operation, and losing the type for {..xs, ..ys} would be a shame.

In the end, the option implemented is to have the distinct unpack
syntax.

This was a big feature to write, primarily due to complications for the
typechecker, and the many cases to handle. Most of the dead ends are not
part of the history merged in this merge commit. There is an entire
other branch about adding a Collection[T] type that is now obsolete as
well.
2025-11-22 23:11:51 +01:00
Ruud van Asseldonk
24a4d85240 Remove todo in scalar unpack error message
It is still kind of true, here is a reproducer:

    let xs = [1, 2]; { a = 1, ..xs }
    stdin:1:27
      ╷
    1 │ let xs = [1, 2]; { a = 1, ..xs }
      ╵                           ^~~~
    Error: Invalid unpack in dict.

    Help: '..' unpacks lists and sets, use '...' to unpack dicts.

I don't think the help is that bad though, the error does say "invalid
unpack" and the help explains that there are ".." and "...", the fact
that "xs" is a list and ".." is the right thing to unpack a list, well
yeah but what do you expect unpacking a list into a dict to do? So I
think it's fine, let's not complicate the already complex error
reporting logic any further.
2025-11-22 22:52:18 +01:00
Ruud van Asseldonk
1a21667e04 Give incorrect unpack error precedence match
Previously, writing [...xs] would give a type error when xs was not a
dict, but really we should complain about dict unpack in the list. The
new match order fixes that.
2025-11-22 22:51:55 +01:00
Ruud van Asseldonk
75990841f3 Turn todo into deliberately unimplemented feature
It can still be refined in the future, but I don't expect it will really
need to be done. Let's not pollute the "git grep TODO" output with
nice-to-haves.
2025-11-22 22:19:39 +01:00
Ruud van Asseldonk
2335323ba4 Resolve todo in reporting type diff 2025-11-22 22:16:31 +01:00
Ruud van Asseldonk
a189fd9bbf Document unpack in the Bison grammar 2025-11-22 22:06:50 +01:00
Ruud van Asseldonk
21ee28b9ac Move Bison usage from readme to development docs
I want to make the readme more geared towards RCL users rather than
people hacking on RCL, this is one small step in that direction.
2025-11-22 22:05:16 +01:00
Ruud van Asseldonk
36a970fbe5 Include unpack in fuzz dictionary 2025-11-22 22:02:29 +01:00
Ruud van Asseldonk
73d030a967 Make the changelog more compact again
This is not going to be *that* big of a release that it needs all those
sections, I hope.
2025-11-22 21:48:14 +01:00
Ruud van Asseldonk
3008b99b41 De-emphasize deprecated |-operator in docs and examples 2025-11-22 21:44:36 +01:00
Ruud van Asseldonk
1c099f84c6 Add goldens for unpack runtime type checks 2025-11-22 21:33:25 +01:00
Ruud van Asseldonk
68527db263 Add runtime type check for scalar unpack if needed
This completes the typechecking of unpacks!
2025-11-22 21:33:25 +01:00
Ruud van Asseldonk
2ac0d1de8b Insert runtime typecheck for dict unpack if needed
In cases where the collection in '...collection' can't be statically
verified to be a dict, we need to do a runtime check. We can change the
expected type for the collection though.

Unfortunately this makes the error messages in case of a mismatch on
the type a bit more obscure. I think that at this point, that's an
acceptable cost to keep the implementation simpler.

This commit only handles dicts, but also lays the groundwork for
checking scalar unpack. For those, since it can be a list or set,
we need to move the typecheck to runtime, because there is no
single expected type to feed into the check. Well, we could feed
in Union[List[T], Set[T]], but the union is then conjured up by the
typechecker, which could make errors more difficult to understand, so
let's just do a runtime check.
2025-11-22 21:33:25 +01:00
Ruud van Asseldonk
f54e9a6d51 Add more goldens for unpack inference
This brings line coverage of typecheck.rs back to 100%.
2025-11-22 21:33:24 +01:00
Ruud van Asseldonk
82f5dde775 Explain set unpack errors, add goldens 2025-11-22 21:33:24 +01:00
Ruud van Asseldonk
120a2749c6 Add goldens for type mismatch with unpack source 2025-11-22 21:33:24 +01:00
Ruud van Asseldonk
b3e8dad5cd Also highlight collection in unpack errors
After creating the goldens, pointing at only the .. or ... does seem
a bit arcane, I think it's better to highlight the entire unpack
expression.
2025-11-22 21:33:24 +01:00