## Summary
This PR is a collaboration with @AlexWaygood from our pairing session
last Friday.
The main goal here is removing `ruff_linter::message::OldDiagnostic` in
favor of
using `ruff_db::diagnostic::Diagnostic` directly. This involved a few
major steps:
- Transferring the fields
- Transferring the methods and trait implementations, where possible
- Converting some constructor methods to free functions
- Moving the `SecondaryCode` struct
- Updating the method names
I'm hoping that some of the methods, especially those in the
`expect_ruff_*`
family, won't be necessary long-term, but I avoided trying to replace
them
entirely for now to keep the already-large diff a bit smaller.
### Related refactors
Alex and I noticed a few refactoring opportunities while looking at the
code,
specifically the very similar implementations for
`create_parse_diagnostic`,
`create_unsupported_syntax_diagnostic`, and
`create_semantic_syntax_diagnostic`.
We combined these into a single generic function, which I then copied
into
`ruff_linter::message` with some small changes and a TODO to combine
them in the
future.
I also deleted the `DisplayParseErrorType` and `TruncateAtNewline` types
for
reporting parse errors. These were added in #4124, I believe to work
around the
error messages from LALRPOP. Removing these didn't affect any tests, so
I think
they were unnecessary now that we fully control the error messages from
the
parser.
On a more minor note, I factored out some calls to the
`OldDiagnostic::filename`
(now `Diagnostic::expect_ruff_filename`) function to avoid repeatedly
allocating
`String`s in some places.
### Snapshot changes
The `show_statistics_syntax_errors` integration test changed because the
`OldDiagnostic::name` method used `syntax-error` instead of
`invalid-syntax`
like in ty. I think this (`--statistics`) is one of the only places we
actually
use this name for syntax errors, so I hope this is okay. An alternative
is to
use `syntax-error` in ty too.
The other snapshot changes are from removing this code, as discussed on
[Discord](1388252408):
34052a1185/crates/ruff_linter/src/message/mod.rs (L128-L135)
I think both of these are technically breaking changes, but they only
affect
syntax errors and are very narrow in scope, while also pretty
substantially
simplifying the refactor, so I hope they're okay to include in a patch
release.
## Test plan
Existing tests, with the adjustments mentioned above
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Extracts the vendored typeshed stubs lazily and caches them on the local
filesystem to support go-to in the LSP.
Resolves https://github.com/astral-sh/ty/issues/77.
## Summary
This PR makes the necessary changes to the server that it can request
configurations from the client using the `configuration` request.
This PR doesn't make use of the request yet. It only sets up the
foundation (mainly the coordination between client and server)
so that future PRs could pull specific settings.
I plan to use this for pulling the Python environment from the Python
extension.
Deno does something very similar to this.
## Test Plan
Tested that diagnostics are still shown.
## Summary
Setting `TY_MEMORY_REPORT=full` will generate and print a memory usage
report to the CLI after a `ty check` run:
```
=======SALSA STRUCTS=======
`Definition` metadata=7.24MB fields=17.38MB count=181062
`Expression` metadata=4.45MB fields=5.94MB count=92804
`member_lookup_with_policy_::interned_arguments` metadata=1.97MB fields=2.25MB count=35176
...
=======SALSA QUERIES=======
`File -> ty_python_semantic::semantic_index::SemanticIndex`
metadata=11.46MB fields=88.86MB count=1638
`Definition -> ty_python_semantic::types::infer::TypeInference`
metadata=24.52MB fields=86.68MB count=146018
`File -> ruff_db::parsed::ParsedModule`
metadata=0.12MB fields=69.06MB count=1642
...
=======SALSA SUMMARY=======
TOTAL MEMORY USAGE: 577.61MB
struct metadata = 29.00MB
struct fields = 35.68MB
memo metadata = 103.87MB
memo fields = 409.06MB
```
Eventually, we should integrate these numbers into CI in some form. The
one limitation currently is that heap allocations in salsa structs (e.g.
interned values) are not tracked, but memoized values should have full
coverage. We may also want a peak memory usage counter (that accounts
for non-salsa memory), but that is relatively simple to profile manually
(e.g. `time -v ty check`) and would require a compile-time option to
avoid runtime overhead.
## Summary
Garbage collect ASTs once we are done checking a given file. Queries
with a cross-file dependency on the AST will reparse the file on demand.
This reduces ty's peak memory usage by ~20-30%.
The primary change of this PR is adding a `node_index` field to every
AST node, that is assigned by the parser. `ParsedModule` can use this to
create a flat index of AST nodes any time the file is parsed (or
reparsed). This allows `AstNodeRef` to simply index into the current
instance of the `ParsedModule`, instead of storing a pointer directly.
The indices are somewhat hackily (using an atomic integer) assigned by
the `parsed_module` query instead of by the parser directly. Assigning
the indices in source-order in the (recursive) parser turns out to be
difficult, and collecting the nodes during semantic indexing is
impossible as `SemanticIndex` does not hold onto a specific
`ParsedModuleRef`, which the pointers in the flat AST are tied to. This
means that we have to do an extra AST traversal to assign and collect
the nodes into a flat index, but the small performance impact (~3% on
cold runs) seems worth it for the memory savings.
Part of https://github.com/astral-sh/ty/issues/214.
## Summary
https://github.com/astral-sh/ty/issues/214 will require a couple
invasive changes that I would like to get merged even before garbage
collection is fully implemented (to avoid rebasing):
- `ParsedModule` can no longer be dereferenced directly. Instead you
need to load a `ParsedModuleRef` to access the AST, which requires a
reference to the salsa database (as it may require re-parsing the AST if
it was collected).
- `AstNodeRef` can only be dereferenced with the `node` method, which
takes a reference to the `ParsedModuleRef`. This allows us to encode the
fact that ASTs do not live as long as the database and may be collected
as soon a given instance of a `ParsedModuleRef` is dropped. There are a
number of places where we currently merge the `'db` and `'ast`
lifetimes, so this requires giving some types/functions two separate
lifetime parameters.
## Summary
This PR unifies the ruff `Message` enum variants for syntax errors and
rule violations into a single `Message` struct consisting of a shared
`db::Diagnostic` and some additional, optional fields used for some rule
violations.
This version of `Message` is nearly a drop-in replacement for
`ruff_diagnostics::Diagnostic`, which is the next step I have in mind
for the refactor.
I think this is also a useful checkpoint because we could possibly add
some of these optional fields to the new `Diagnostic` type. I think
we've previously discussed wanting support for `Fix`es, but the other
fields seem less relevant, so we may just need to preserve the `Message`
wrapper for a bit longer.
## Test plan
Existing tests
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
This does a deeper removal of the `lint:` prefix by removing the
`DiagnosticId::as_str` method and replacing it with `as_concise_str`. We
remove the associated error type and simplify the `Display` impl for
`DiagnosticId` as well.
This turned out to catch a `lint:` that was still in the diagnostic
output: the part that says why a lint is enabled.
We just set the ID on the `Message` and it just does what we want in
this case. I think I didn't do this originally because I was trying to
preserve the existing rendering? I'm not sure. I might have just missed
this method.
In a subsequent commit, we're going to start using `annotate-snippets`'s
functionality for diagnostic IDs in the rendering. As part of doing
that, I wanted to remove this special casing of an empty message. I did
that independently to see what, if anything, would change. (The changes
look fine to me. They'll be tweaked again in the next commit along with
a bunch of others.)
## Summary
This PR is a first step toward integration of the new `Diagnostic` type
into ruff. There are two main changes:
- A new `UnifiedFile` enum wrapping `File` for red-knot and a
`SourceFile` for ruff
- ruff's `Message::SyntaxError` variant is now a `Diagnostic` instead of
a `SyntaxErrorMessage`
The second of these changes was mostly just a proof of concept for the
first, and it went pretty smoothly. Converting `DiagnosticMessage`s will
be most of the work in replacing `Message` entirely.
## Test Plan
Existing tests, which show no changes.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Micha Reiser <micha@reiser.io>
This is done in what appears to be the same way as Ruff: we get the CWD,
strip the prefix from the path if possible, and use that. If stripping
the prefix fails, then we print the full path as-is.
Fixes#17233
Summary
--
This PR extends semantic syntax error detection to red-knot. The main
changes here are:
1. Adding `SemanticSyntaxChecker` and `Vec<SemanticSyntaxError>` fields
to the `SemanticIndexBuilder`
2. Calling `SemanticSyntaxChecker::visit_stmt` and `visit_expr` in the
`SemanticIndexBuilder`'s `visit_stmt` and `visit_expr` methods
3. Implementing `SemanticSyntaxContext` for `SemanticIndexBuilder`
4. Adding new mdtests to test the context implementation and show
diagnostics
(3) is definitely the trickiest and required (I think) a minor addition
to the `SemanticIndexBuilder`. I tried to look around for existing code
performing the necessary checks, but I definitely could have missed
something or misused the existing code even when I found it.
There's still one TODO around `global` statement handling. I don't think
there's an existing way to look this up, but I'm happy to work on that
here or in a separate PR. This currently only affects detection of one
error (`LoadBeforeGlobalDeclaration` or
[PLE0118](https://docs.astral.sh/ruff/rules/load-before-global-declaration/)
in ruff), so it's not too big of a problem even if we leave the TODO.
Test Plan
--
New mdtests, as well as new errors for existing mdtests
---------
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
This PR extends version-related syntax error detection to red-knot. The
main changes here are:
1. Passing `ParseOptions` specifying a `PythonVersion` to parser calls
2. Adding a `python_version` method to the `Db` trait to make this
possible
3. Converting `UnsupportedSyntaxError`s to `Diagnostic`s
4. Updating existing mdtests to avoid unrelated syntax errors
My initial draft of (1) and (2) in #16090 instead tried passing a
`PythonVersion` down to every parser call, but @MichaReiser suggested
the `Db` approach instead
[here](https://github.com/astral-sh/ruff/pull/16090#discussion_r1969198407),
and I think it turned out much nicer.
All of the new `python_version` methods look like this:
```rust
fn python_version(&self) -> ruff_python_ast::PythonVersion {
Program::get(self).python_version(self)
}
```
with the exception of the `TestDb` in `ruff_db`, which hard-codes
`PythonVersion::latest()`.
## Test Plan
Existing mdtests, plus a new mdtest to see at least one of the new
diagnostics.
This commit shuffles the reporting API around a little bit such that a
range is required, up front, when reporting a lint diagnostic. This in
turn enables us to make suppression checking eager.
In order to avoid callers needing to provide the range twice, we create
a primary annotation *without* a message inside the `Diagnostic`
encapsulated by the guard. We do this instead of requiring the message
up front because we're concerned about API complexity and the effort
involved in creating the message.
In order to provide a means of attaching a message to the primary
annotation, we expose a convenience API on `LintDiagnosticGuard` for
setting the message. This isn't generally possible for a `Diagnostic`,
but since a `LintDiagnosticGuard` knows how the `Diagnostic` was
constructed, we can offer this API correctly.
It strikes me that it might be easy to forget to attach a primary
annotation message, btu I think this the "least" bad failure mode. And
in particular, it should be somewhat obvious that it's missing once one
adds a snapshot test for how the diagnostic renders.
Otherwise, this API gives us the ability to eagerly check whether a
diagnostic should be reported with nearly minimal information. It also
shouldn't have any footguns since it guarantees that the primary
annotation is tied to the file in the typing context. And it keeps
things pretty simple: callers only need to provide what is actually
strictly necessary to make a diagnostic.
This will enable us to provide an API on `LintDiagnosticGuard` for
setting the primary annotation message. It will require an `unwrap()`,
but due to how `LintDiagnosticGuard` will build a `Diagnostic`, this
`unwrap()` will be guaranteed to succeed. (And it won't bubble out to
every user of `LintDiagnosticGuard`.)