I am not really happy with this, because the type is not something that
end users should use (although they can); it is mostly for internal use
by the typechecker. How to explain this nicely?
It would be nice to have that, but the current implementation is not
incorrect, so let's do the easy thing for now, we can always make it
better later.
At first I wasn't sure whether I needed to add Collection here, and how
to report an inner mismatch. But now it's clear: the TypeDiff should
match the expected supertype.
It already worked before, but we would only report the mismatch
afterwards, without blaming the inner element, and maybe even insert a
runtime check. Propagating the expectation inside results in better
errors and more efficient code.
I want to add an unpack operator, and to be able to type the type
expectation, I need a collection supertype for List and Set. Maybe
once I have this, I can actually type the union operator in a more
elegant way as well and I no longer need unpack. But unpack seems
nice anyway, and a Collection supertype seems nice as well.
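As a minimal sketch of what such a supertype relation could look like:
the Type enum, its variants, and is_subtype are hypothetical names for
illustration, not the typechecker's actual definitions, and covariant
element types are an assumption.

```rust
/// Hypothetical, simplified type representation for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Number,
    String,
    List(Box<Type>),
    Set(Box<Type>),
    /// Assumed common supertype of List and Set.
    Collection(Box<Type>),
}

/// Return whether `sub` may be used where `sup` is expected.
fn is_subtype(sub: &Type, sup: &Type) -> bool {
    match (sub, sup) {
        // Same constructor on both sides: compare the element types.
        (Type::List(a), Type::List(b)) => is_subtype(a, b),
        (Type::Set(a), Type::Set(b)) => is_subtype(a, b),
        (Type::Collection(a), Type::Collection(b)) => is_subtype(a, b),
        // A List or Set is accepted where a Collection is expected,
        // but not the other way around.
        (Type::List(a), Type::Collection(b))
        | (Type::Set(a), Type::Collection(b)) => is_subtype(a, b),
        // Everything else has to match exactly.
        _ => sub == sup,
    }
}
```

With a relation like this, an operator that accepts either kind of
collection only needs to state Collection as its expected type.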
For a long time, the names of builtins and types were duplicated in
many places, which is annoying to update every time, and it's easy to
forget one. This happened in one instance: the Vim plugin was missing
the enumerate builtin. This generates most of them from the Pygments
grammar. There is still some duplication, in particular the definitions
in the RCL source. I think that's fine for now. At some point it may
be nice to add an auxiliary binary that outputs all method names, to
automate that part. This also changes the smith fuzzer, in a way that
future changes to builtins should be less disruptive to the fuzz corpus.
The module name and path hack throws off Mypy, in different ways
depending on how you run it (outside Nix shell, inside Nix develop
shell, or as part of the flake check ...). Let's just forward-declare
it and sidestep problems.
Maybe I should make this script the source of truth and generate the
Pygments grammar from it. We could do that later, for now this works.
I should do it, and if I don't do it now then I probably will not do it
soon. On the other hand, no need to let the perfect be the enemy of the
good; calling the script manually is already much better than having to
update all the places manually. So let's merge it as it is.
This corrects one bug in the Vim plugin that went unnoticed: the
'enumerate' builtin was not highlighted previously. So that's a win
for generating things!
This moves one more place of duplication of builtins into
generate_keywords.py as a single source of truth, resolving
a TODO in the smith fuzzer.
This does once more shuffle all of these around in the fuzzer, which
makes the existing fuzz corpus mostly meaningless. Fortunately, this
should be the last time that this happens: with the new approach we
can modify the builtins with minimal changes to the meaning of the
fuzz corpus, which is something that I wanted for a long time.
I regularly add new methods, and it's becoming tedious to have to
remember to update all the places that reference these, so let's
generate them and automate the process. For now, I'm choosing the
Pygments grammar as the source of truth, and the first target to
generate is the fuzz dictionary.
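To illustrate the generation step, here is a minimal sketch in Rust that
renders a list of builtin names as a fuzz dictionary. The actual
generator is a script that reads the Pygments grammar; the function
name and the choice of one quoted entry per line (the dictionary format
that libFuzzer and AFL accept) are assumptions for illustration.

```rust
/// Render names as a fuzz dictionary with one quoted entry per line.
/// Hypothetical helper, not part of the actual generator script.
fn render_fuzz_dictionary(names: &[&str]) -> String {
    // Sort and deduplicate so the output is stable across runs.
    let mut sorted: Vec<&str> = names.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut out = String::new();
    for name in sorted {
        out.push('"');
        for ch in name.chars() {
            // Backslash and double quote are escaped inside an entry.
            if ch == '\\' || ch == '"' {
                out.push('\\');
            }
            out.push(ch);
        }
        out.push_str("\"\n");
    }
    out
}
```

Sorting the entries keeps the generated file diff-friendly: adding one
builtin to the source of truth adds one line to the dictionary.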
I'm leaving the Zed extension pointing to the older commit of the
Tree-sitter grammar, I'll update that after this version bump. It's
a bit awkward to do it this way around, but there are circular
dependencies that can't be avoided. Maybe with an attack on SHA1 it
can be done in theory, but let's not go there.
We don't need to fail with an overflow when d is very negative; the
result just rounds to zero. Identified after staring at the coverage
report for a bit and thinking deeply about it.
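A minimal sketch of the idea, not the actual implementation: the
function name, the i64 mantissa with i16 exponent representation, and
rounding half away from zero are all assumptions made for illustration.

```rust
/// Round `mantissa * 10^exponent` to `n_decimals` decimals. The result
/// is scaled by 10^n_decimals; None signals an overflow.
/// Hypothetical signature; half-away-from-zero rounding is assumed.
fn round_to_decimals(mantissa: i64, exponent: i16, n_decimals: u8) -> Option<i64> {
    // The power of ten to scale the mantissa by. Computed in i32 so
    // that the addition itself cannot overflow.
    let shift = i32::from(exponent) + i32::from(n_decimals);

    if shift >= 0 {
        // Scaling up can genuinely overflow, so use checked arithmetic.
        let factor = 10_i64.checked_pow(shift as u32)?;
        return mantissa.checked_mul(factor);
    }

    let k = shift.unsigned_abs();
    if k >= 20 {
        // |mantissa| < 10^19, which is less than half of 10^20, so the
        // value rounds to zero. There is no need to report an overflow.
        return Some(0);
    }

    // Divide by 10^k in i128, where 10^19 still fits comfortably.
    let pow = 10_i128.pow(k);
    let m = i128::from(mantissa);
    let mut q = m / pow;
    let r = m % pow;
    if r.abs() * 2 >= pow {
        // Round half away from zero.
        q += m.signum();
    }
    // The cast is lossless because |q| <= |mantissa|.
    Some(q as i64)
}
```

Here 12345 with exponent -3 rounded to 2 decimals gives 1235 (12.35),
and 1 with exponent -300 gives 0 instead of an overflow error.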
This was discovered by inspecting the coverage report. The branch where
the mantissa didn't need adjustment was not yet covered, and then I
realized that returning the number itself was wrong. It was right in
the past, before I incorporated the exponent, but then I decided that
rounding should get rid of any exponent, and that broke this branch.
This also adds coverage for that branch.
At first I also wanted to support rounding to a negative number of
decimals (so rounding to a positive power of 10), but scope creep,
complications ... I don't need it, and we can always add that later.
This bug was discovered by the fuzz_source fuzzer. A similar construct
is used for subtracting the number of decimals, but that one is not
affected: the number of decimals is a u8, which can't be negative, so
d1.decimals - d2.decimals is never greater than d1.decimals.
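The difference between the two cases can be sketched as follows. The
affected construct is not shown in this message, so the signed i16
exponent below is an assumed stand-in for it, and both function names
are hypothetical.

```rust
/// Difference of two signed exponents (assumed i16 for illustration).
/// Subtracting a negative value can exceed i16::MAX, so a plain `-`
/// could overflow; checked_sub reports that case as None.
fn exponent_diff(e1: i16, e2: i16) -> Option<i16> {
    e1.checked_sub(e2)
}

/// Difference of two decimal counts. Both are non-negative, so the
/// result is never greater than d1. It can still be negative, which is
/// why this sketch widens to a signed type before subtracting.
fn decimals_diff(d1: u8, d2: u8) -> i16 {
    i16::from(d1) - i16::from(d2)
}
```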
Now that I added this "Playground" link, it doesn't fit on one line any
more on mobile. Maybe it needs a hamburger menu but that also doesn't
make that much sense with so few items. For now, I am very sorry
Codeberg, but probably fewer people click that link than people who want
the GitHub repo. The Codeberg repo is still linked further down.
RCL is now truly -- for all practical purposes, yeah yeah pedants,
surrogate pairs and a file with 20 GiB of zeros are technically valid
JSON but let's talk about documents used in the real world -- a JSON
superset!
Also I think I should try to make the readme and index pages a bit more
attractive to people who discover this. I wrote them from my niche
perspective and I had a lot of background about what I was building, but
probably it needs to be explained more to new users.
Also improve a few other things, e.g. as a quick hack, add a
"Playground" link in the website header to make the feature more
discoverable. We can extract it into a separate page at a later stage.
These were helpful for initial exploration, but are confusing to have
lying around at this point. For example, they polluted the results when
I was 'git grep'ing for remaining references to Int.
The type system has long since been implemented; there is documentation
and a companion blog post series about its design and implementation.
The idea directory is still there in the Git history for future
historians, but it no longer has a place in the default branch.
This is the result of a long journey; the early commits in this chapter
have been rebased quite a few times. On top of that, I later changed
course on the direction that the early commits take (adding Float side
by side with Int). So the history of this is long and turbulent.
The diff of this merge commit though, if you ignore changes to the
golden tests, is not that bad. The actual changes to the Rust code are
pretty straightforward. The diff is large mostly due to the rename of
Int to Number which affects a *lot* of golden tests.
I chose to preserve the original history, rather than to break it up
into parts that make more sense in hindsight, partially because I think
there is value in preserving it, and partially because it has already
been open for a long time and I don't want to block it any longer.
For context of how the number type came to be, I have a companion blog
post in the pipeline that is also linked from the documentation.
It doesn't apply, the comment was wrong. If there is an overflow, and
the signs are both positive, then m1 really is the greater one. I think
I didn't realize before that they are both positive? Anyway, nothing to
do to address this then.
I considered whether there is a test to add, but the logic was just
wrong, and there is already a test case that covers this control flow
path, so I think we're good.
Again, the formatter doesn't care about type names, so these tests were
not failing before, but still, to make the tests a good example, and to
not cause confusion with outdated code that doesn't work, let's rename
things there too.
This is mostly a mechanical rename, but because the formatter cares
about length, in one case I had to change the example to make it still
exercise the same case.