ruff/crates/red_knot_python_semantic
Micha Reiser 0fc4e8f795
Introduce InferContext (#14956)
## Summary

I'm currently on the fence about landing the #14760 PR because it's
unclear how we'd support tracking used and unused suppression comments
in a performant way:
* Salsa adds an "untracked" dependency to every query reading
accumulated values. This has the effect that the query re-runs on every
revision. For example, a possible future query
`unused_suppression_comments(db, file)` would re-run on every
incremental change and for every file. I don't expect the operation
itself to be expensive, but it all adds up in a project with 100k+ files
* Salsa collects the accumulated values by traversing the entire query
dependency graph. It can skip over sub-graphs if it is known that they
contain no accumulated values. This makes accumulators a great tool for
when they are rare; diagnostics are a good example. Unfortunately,
suppressions are more common, and they often appear in many different
files, making the "skip over subgraphs" optimization less effective.

Because of that, I want to wait to adopt salsa accumulators for type
check diagnostics (we could start using them for other diagnostics)
until we have very specific reasons that justify regressing incremental
check performance.

This PR does a "small" refactor that brings us closer to what I have in
#14760 but without using accumulators. To emit a diagnostic, a method
needs:

* Access to the db
* Access to the currently checked file

This PR introduces a new `InferContext` that holds on to the db, the
current file, and the reported diagnostics. It replaces the
`TypeCheckDiagnosticsBuilder`. We pass the `InferContext` instead of the
`db` to methods that *might* emit diagnostics. This simplifies some of
the `Outcome` methods, which can now be called with a context instead of
a `db` and the diagnostics builder. Having the `db` and the file on a
single type like this would also be useful when using accumulators.

This PR doesn't solve the issue that the `Outcome` types feel somewhat
complicated nor that it can be annoying when you need to report a
`Diagnostic,` but you don't have access to an `InferContext` (or the
file). However, I also believe that accumulators won't solve these
problems because:

* Even with accumulators, it's necessary to have a reference to the file
that's being checked. The struggle would be to get a reference to that
file rather than getting a reference to `InferContext`.
* Users of the `HasTy` trait (e.g., a linter) don't want to bother
getting the `File` when calling `Type::return_ty` because they aren't
interested in the created diagnostics. They just want to know what
calling the current expression would return (and if it even is a
callable). This is what the different methods of `Outcome` enable today.
I can ask for the return type without needing extra data that's only
relevant for emitting a diagnostic.

A shortcoming of this approach is that it is now a bit confusing when to
pass `db` and when an `InferContext`. An option is that we'd make the
`file` on `InferContext` optional (it won't collect any diagnostics if
`None`) and change all methods on `Type` to take `InferContext` as the
first argument instead of a `db`. I'm interested in your opinion on
this.

Accumulators are definitely harder to use incorrectly because they
remove the need to merge the diagnostics explicitly and there's no risk
that we accidentally merge the diagnostics twice, resulting in
duplicated diagnostics. I still value performance more over making our
life slightly easier.
2024-12-18 12:22:33 +00:00
..
resources Prioritize attribute in from/import statement (#15041) 2024-12-17 16:58:23 -05:00
src Introduce InferContext (#14956) 2024-12-18 12:22:33 +00:00
tests mdtest: include test name in printed rerun command (#14684) 2024-11-30 11:01:06 +00:00
build.rs Modernize build scripts (#13837) 2024-10-20 22:35:35 +01:00
Cargo.toml Introduce InferContext (#14956) 2024-12-18 12:22:33 +00:00