* Unify parsing of string literals and scalar literals, to (e.g.) ensure escapes are handled uniformly. Notably, this makes unicode escapes valid in scalar literals.
* Add a variety of custom error messages about specific failure cases of parsing string/scalar literals. For example, if we're expecting a string (e.g. a package name in the header) and the user tried using single quotes, give a clear message about that.
* Fix formatting of unicode escapes (they previously used {}, now correctly use () to match roc strings)
* test_fmt moves out of fmt crate
* test_parse _mostly_ moves out of parse crate and into `test_snapshots.rs` (some simple tests remain)
* now there's only two fuzz targets, fuzz_expr and fuzz_module, that cover both parsing and formatting
* added a system to auto-add new snapshot entries for new test files
* took some commented-out tests in `test_parse` and converted them to snapshot tests
* moved test_fmt's verification of formatting consistency into test_snapshots
* fixed a huge derp on my part where the fmt fuzzer in #4758 was completely useless (broken by refactoring just prior to submitting the PR)
* fixed a formatting bug found by fuzzing (bound_variable.expr.roc) - that I missed earlier due to ^^^ that derp
* no longer have roc_test_utils as a dependency in fmt - which was causing problems for the wasm build
This commit adds fuzzing for the (expr) formatter, with the same invariants that we use for fmt tests:
* We start with text, which we parse
* We format the AST, which must succeed
* We parse back the AST and make sure it's identical igoring whitespace+comments
* We format the new AST and assert it's equal to the first formatted version ("idempotency")
Interestingly, while a lot of bugs this found were in the formatter, it also found some parsing bugs.
It then fixes a bunch of bugs that fell out:
* Some small oversights in RemoveSpaces
* Make sure `_a` doesn't parse as an inferred type (`_`) followed by an identifier (parsing bug!)
* Call `extract_spaces` on a parsed expr before matching on it, lest it be Expr::SpaceBefore - when parsing aliases
* A few cases where the formatter generated invalid/different code
* Numerous formatting bugs that caused the formatting to not be idempotent
The last point there is worth talking further about. There were several cases where the old code was trying to enforce strong
opinions about how to insert newlines in function types and defs. In both of those cases, it looked like the goals of
(1) idempotency, (2) giving the user some say in the output, and (3) these strong opinions - were often in conflict.
For these cases, I erred on the side of following the user's existing choices about where to put newlines.
We can go back and re-add this strong opinionation later - but this seemed the right approach for now.
* The header + expr fuzzers can both be run again (header fuzzer had regressed).
* I ran the expr fuzzer for ~60 seconds with no additional panics uncovered
* "tab_crash" hit supposedly unreachable code in blankspace.rs - and I went to the liberty of dramatically simplifying all that code, rather than just trying to fix the bug
* Other failures were straight-forward error cases that should have been handled (and passed up the chain) instead of panicking
Also change some tests with newly relaxed indentation requirements, and remove an irrelevant test (since unindented close parens are now perfectly valid, the test is no longer useful).
As previously discovered with #4464, it's easy to accidentally mis-use the State value returned on the Err path.
There were mixed assumptions about what that State represents: (1) the State where the error occurred, or (2) the State at the beginning of the thing we were just parsing.
I fixed this up to always mean (2) - at which point we don't actually need to return the State at all - so it's impossible for further discrepency to creep in.
I also took the liberty to refactor a few more methods to be purely combinator-based, rather than calling `parse` directly.
This removes the need to explicitly pass thru min_indent when using the parser combinators.
My ultimate goal here is to evolve the current parser closer toward a purely combinator-based parser,
at which point we can more easily transition smoothly to a formal(ish) grammar, or expand the meanings of combinators
to include things like:
* Incremental (re)parsing
* Unified parsing and formatting code
* Better error recovery
* Using the main parser directly for syntax highlighting