Basically, this splits the implementation into two pieces:
the first piece does the traversal and finds *all* symbols
across the workspace. The second piece does filtering based
on a user-provided query string. Only the first piece is
cached by Salsa.
This brings warm "workspace symbols" requests down from
500-600ms to 100-200ms.
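A minimal sketch of that split, using only the standard library. `Workspace`, its fields, and the `OnceCell` standing in for the Salsa-cached query are all hypothetical; the real code uses Salsa and ty's own symbol types.

```rust
use std::cell::{Cell, OnceCell};

/// Hypothetical stand-in for a workspace; not ty's real data model.
struct Workspace {
    /// Symbol names per file.
    files: Vec<Vec<String>>,
    /// Plays the role of the cached "all symbols" query.
    all_symbols: OnceCell<Vec<String>>,
    /// Counts how often the expensive traversal actually runs.
    traversals: Cell<usize>,
}

impl Workspace {
    fn new(files: Vec<Vec<String>>) -> Self {
        Self {
            files,
            all_symbols: OnceCell::new(),
            traversals: Cell::new(0),
        }
    }

    /// Piece one: the traversal. Computed once, then reused.
    fn all_symbols(&self) -> &[String] {
        self.all_symbols.get_or_init(|| {
            self.traversals.set(self.traversals.get() + 1);
            self.files.iter().flatten().cloned().collect()
        })
    }

    /// Piece two: per-query filtering. Never cached, since the query
    /// string changes on almost every request.
    fn workspace_symbols(&self, query: &str) -> Vec<&str> {
        let query = query.to_lowercase();
        self.all_symbols()
            .iter()
            .map(String::as_str)
            .filter(|name| name.to_lowercase().contains(&query))
            .collect()
    }
}
```

Caching only the first piece keeps the cache key independent of the query, so every warm request reuses the same traversal result.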
While this doesn't typically matter, when ty returns a very
large list of symbols, this can have an impact. Specifically,
when searching `async` in home-assistant, this brings times
closer to 500ms, versus closer to 600ms before this change.
It looks like an overall ~50ms improvement (so around 10%),
but variance is all over the place and I didn't do any
statistical tests.
But this does make intuitive sense. Previously, we were
allocating intermediate strings, doing UTF-8 decoding and
consulting Unicode casing tables. Now we're just doing what
is likely a single DFA scan. In effect, we front load all
of the Unicode junk into regex compilation.
There is a small amount of subtlety to this matching routine,
and it could be implemented in a faster way. So let's write some
tests for what we have to ensure we don't break anything when
we optimize it.
## Summary
Looks like an oversight at some point led to two identical globals;
the one in `ty_project` just calls `ty_python_semantic::register_lints`.
## Summary
Removes the `module_ptr` field from `AstNodeRef` in release mode, and
changes `NodeIndex` to a `NonZeroU32` to reduce the size of
`Option<AstNodeRef<_>>` fields.
I believe CI runs in debug mode, so this won't show up in the memory
report, but this reduces memory by ~2% in release mode.
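For reference, a small sketch of why a `NonZeroU32` helps here. The types below are hypothetical stand-ins, not the real `NodeIndex`; the point is only that `Option` can use the forbidden zero value as its `None` representation.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// Hypothetical index type. Assumption: a real index never needs the raw value 0.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NodeIndex(NonZeroU32);

impl NodeIndex {
    fn new(raw: u32) -> Option<Self> {
        NonZeroU32::new(raw).map(Self)
    }

    fn get(self) -> u32 {
        self.0.get()
    }
}

/// The same idea with a plain `u32`, for comparison.
struct PlainIndex(u32);

/// Sizes of `Option<NodeIndex>` and `Option<PlainIndex>`, in bytes.
fn option_sizes() -> (usize, usize) {
    (
        size_of::<Option<NodeIndex>>(),
        size_of::<Option<PlainIndex>>(),
    )
}
```

With the non-zero index, `None` fits in the otherwise unused zero bit pattern, so wrapping the index in `Option` adds no extra discriminant.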
## Summary
Previously we held off from doing this because we weren't sure that it
was worth the added complexity cost. But our code has changed in the
months since we made that initial decision, and I think the structure of
the code is such that it no longer really leads to much added complexity
to add precise inference when unpacking a string literal or a bytes
literal.
The improved inference we gain from this has real benefits to users (see
the mypy_primer report), and this PR doesn't appear to have a
performance impact.
## Test plan
mdtests
## Summary
We use the `System` abstraction in ty to abstract away the host/system
on which ty runs.
This has a few benefits:
* Tests can run in full isolation using a memory system (that uses an
in-memory file system)
* The LSP has a custom implementation where `read_to_string` returns the
content as seen by the editor (e.g. unsaved changes) instead of always
returning the content as it is stored on disk
* We don't require any file system polyfills for wasm in the browser
However, it does require extra care that we don't accidentally use
`std::fs` or `std::env` (etc.) methods in ty's code base (which is very
easy).
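To make the benefits above concrete, here is a heavily simplified sketch of a `System`-style trait with an OS-backed and an in-memory implementation. The trait shape, `OsSystem`, `MemorySystem`, and `count_lines` are assumptions for illustration; ty's real `System` trait has a much larger surface.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical, simplified version of a `System`-style trait.
trait System {
    fn read_to_string(&self, path: &Path) -> io::Result<String>;
}

/// Backed by the real file system. This is the one place that is
/// expected to call `std::fs` directly, hence the explicit `allow`.
struct OsSystem;

impl System for OsSystem {
    #[allow(clippy::disallowed_methods)]
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        std::fs::read_to_string(path)
    }
}

/// In-memory system for fully isolated tests.
#[derive(Default)]
struct MemorySystem {
    files: HashMap<PathBuf, String>,
}

impl MemorySystem {
    fn write(&mut self, path: &str, content: &str) {
        self.files.insert(PathBuf::from(path), content.to_string());
    }
}

impl System for MemorySystem {
    fn read_to_string(&self, path: &Path) -> io::Result<String> {
        self.files
            .get(path)
            .cloned()
            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "no such file"))
    }
}

/// Code written against `&dyn System` works with either implementation.
fn count_lines(system: &dyn System, path: &Path) -> io::Result<usize> {
    Ok(system.read_to_string(path)?.lines().count())
}
```

Any direct `std::fs` call outside the OS-backed implementation bypasses the abstraction, which is the kind of mistake the lint is meant to catch.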
This PR sets up Clippy and disallows the most common methods, instead
pointing users towards the corresponding `System` methods.
The setup is a bit awkward because clippy doesn't support inheriting
configurations. That means a crate can only override the entire
workspace configuration or not at all.
The approach taken in this PR is:
* Configure the disallowed methods at the workspace level
* Allow `disallowed_methods` at the workspace level
* Enable the lint at the crate level using the warn attribute (in code)
The obvious downside is that it won't work if we ever want to disallow
other methods, but we can figure that out once we reach that point.
What about false positives? Just add an `allow` and move on with your
life :) This isn't something that we have to enforce strictly; the goal
is to catch accidental misuse.
## Test Plan
Clippy found a place where we incorrectly used `std::fs::read_to_string`.
Adds a method to `TStringValue` to detect whether the t-string is empty
_as an iterable_. Note the subtlety here that, unlike f-strings, an
empty t-string is still truthy (i.e. `bool(t"")==True`).
Closes #19951
## Summary
Rename `TypeAliasType::Bare` to `TypeAliasType::ManualPEP695`, and
`BareTypeAliasType` to `ManualPEP695TypeAliasType`.
Why?
Both existing variants of `TypeAliasType` are specific to features added
in PEP 695 (which introduced both the `type` statement and
`types.TypeAliasType`), so it doesn't make sense to name one with the
name `PEP695` and not the other.
A "bare" type alias, in my mind, is a legacy type alias like `IntOrStr =
int | str`, which is "bare" in that there is nothing at all
distinguishing it as a type alias. I will want to use the "bare" name
for this variant, in a future PR.
The renamed variant here describes a type alias created with `IntOrStr =
types.TypeAliasType("IntOrStr", int | str)`, which is not "bare", it's
just "manually" instantiated instead of using the `type` statement
syntax sugar. (This is useful when using the `typing_extensions`
backport of `TypeAliasType` on older Python versions.)
## Test Plan
Pure rename, existing tests pass.
## Summary
This PR fixes https://github.com/astral-sh/ty/issues/1071
The core issue is that `CallableType` is a Salsa interned struct but
`Signature` (which `CallableType` stores) ignores the `Definition` in
its `Eq` and `Hash` implementation.
This PR tries the simplest fix by removing the custom `Eq` and `Hash`
implementation. The main downside of this fix is that it can increase
memory usage because `CallableType`s that are equal except for their
`Definition` are now interned separately.
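The failure mode can be reproduced with a toy interner. Everything below is a hypothetical sketch (the real types are Salsa interned structs, and `definition` is not a `u32`): when `Eq` and `Hash` skip a field, interning a second value silently hands back the first one, including its stale `definition`.

```rust
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical signature whose `Eq`/`Hash` skip `definition`.
#[derive(Clone, Debug)]
struct LossySignature {
    params: Vec<&'static str>,
    definition: Option<u32>,
}

impl PartialEq for LossySignature {
    fn eq(&self, other: &Self) -> bool {
        self.params == other.params
    }
}

impl Eq for LossySignature {}

impl Hash for LossySignature {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.params.hash(state);
    }
}

/// Same fields, but with derived impls that include `definition`.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ExactSignature {
    params: Vec<&'static str>,
    definition: Option<u32>,
}

/// Toy interner: equal values share one id and one stored value.
struct Interner<T> {
    ids: HashMap<T, usize>,
    values: Vec<T>,
}

impl<T: Clone + Eq + Hash> Interner<T> {
    fn new() -> Self {
        Self { ids: HashMap::new(), values: Vec::new() }
    }

    fn intern(&mut self, value: T) -> usize {
        if let Some(&id) = self.ids.get(&value) {
            return id;
        }
        let id = self.values.len();
        self.values.push(value.clone());
        self.ids.insert(value, id);
        id
    }

    fn lookup(&self, id: usize) -> &T {
        &self.values[id]
    }
}
```

With `LossySignature`, two signatures that differ only in `definition` collapse to one entry; with `ExactSignature` they are stored separately, which is the memory trade-off described above.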
The alternative is to remove `Definition` from `CallableType` and
instead call `bindings` directly on the callee (`call_expression.func`).
However, this would require addressing the TODO here:
39ee71c2a5/crates/ty_python_semantic/src/types.rs (L4582-L4586)
This is probably worth addressing anyway, but is the more involved
fix. That's why I opted for removing the custom `Eq` implementation.
We already "ignore" the definition during normalization, thanks to
Alex's work in https://github.com/astral-sh/ruff/pull/19615
## Test Plan
https://github.com/user-attachments/assets/248d1cb1-12fd-4441-adab-b7e0866d23eb
While implementing similar logic for initializers I noticed that this
code appeared to be walking the ancestors in the wrong direction, and so
if you have nested function calls it would always grab the outermost one
instead of the closest-ancestor.
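A toy illustration of the direction bug, with a hypothetical `Node` type and an ancestor list ordered from the cursor outward (the real code walks AST nodes, not a slice):

```rust
/// Hypothetical, minimal node kinds for the sketch.
#[derive(Debug, PartialEq)]
enum Node {
    Call(&'static str),
    Argument,
}

fn call_name(node: &Node) -> Option<&'static str> {
    match node {
        Node::Call(name) => Some(*name),
        Node::Argument => None,
    }
}

/// `ancestors` is ordered innermost first. Walking it front to back
/// finds the closest enclosing call.
fn closest_call(ancestors: &[Node]) -> Option<&'static str> {
    ancestors.iter().find_map(call_name)
}

/// Walking the same list in reverse finds the outermost call instead,
/// which is the behavior described as wrong above.
fn outermost_call(ancestors: &[Node]) -> Option<&'static str> {
    ancestors.iter().rev().find_map(call_name)
}
```

For `outer(inner(<CURSOR>))`, the ancestors are `[Argument, Call("inner"), Argument, Call("outer")]`, so only the front-to-back walk picks `inner`.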
The four copies of the test are because there's something really evil in
our caching that can't seem to be demonstrated in our cursor testing
framework, which I'm filing a followup for.
Summary
--
This is a preparatory PR in support of #19919. It moves our `Diff`
rendering code from `ruff_linter` to `ruff_db`, where we have direct
access to the `DiagnosticStylesheet` used by our other diagnostic
rendering code. As shown by the tests, this shouldn't cause any visible
changes. The colors aren't exactly the same, as I note in a TODO
comment, but I don't think there's any existing way to see those, even
in tests.
The `Diff` implementation is mostly unchanged. I just switched from a
Ruff-specific `SourceFile` to a `DiagnosticSource` (removing an
`expect_ruff_source_file` call) and updated the `LineStyle` struct and
other styling calls to use `fmt_styled` and our existing stylesheet.
In support of these changes, I added three styles to our stylesheet:
`insertion` and `deletion` for the corresponding diff operations, and
`underline`, which apparently we _can_ use, as I hoped on Discord. This
isn't supported in all terminals, though. It worked in ghostty but not
in st for me.
I moved the `calculate_print_width` function from the now-deleted
`diff.rs` to a method on `OneIndexed`, where it was available everywhere
we needed it. I'm not sure if that's desirable, or if my other changes
to the function are either (using `ilog10` instead of a loop). This does
make it `const` and slightly simplifies things in my opinion, but I'm
happy to revert it if preferred.
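For comparison, a sketch of the two approaches on a plain `usize`. The function names are hypothetical and this is not the actual `OneIndexed` method; it only shows that the `ilog10` form is loop-free and usable in `const` contexts.

```rust
/// Loop-based digit count, in the style of the old helper.
fn print_width_loop(mut value: usize) -> usize {
    let mut width = 1;
    while value >= 10 {
        value /= 10;
        width += 1;
    }
    width
}

/// `ilog10`-based digit count. `ilog10` panics on 0; a one-indexed value
/// is never 0, but this sketch guards against it anyway.
const fn print_width(value: usize) -> usize {
    if value == 0 {
        1
    } else {
        value.ilog10() as usize + 1
    }
}

/// Being `const` means the width can be computed at compile time.
const WIDTH_OF_1234: usize = print_width(1234);
```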
I also inlined a version of `show_nonprinting` from the
`ShowNonprinting` trait in `ruff_linter`:
f4be05a83b/crates/ruff_linter/src/text_helpers.rs (L3-L5)
This trait is now only used in `source_kind.rs`, so I'm not sure it's
worth having the trait or the macro-generated implementation (which is
only called once). This is obviously closely related to our unprintable
character handling in diagnostic rendering, but the usage seems
different enough not to try to combine them.
f4be05a83b/crates/ruff_db/src/diagnostic/render.rs (L990-L998)
We could also move the trait to another crate where we can use it in
`ruff_db` instead of inlining here, of course.
Finally, this PR makes `TextEmitter` a very thin wrapper around a
`DisplayDiagnosticsConfig`. It's still used in a few places, though,
unlike the other emitters we've replaced, so I figured it was worth
keeping around. It's a pretty nice API for setting all of the options on
the config and then passing that along to a `DisplayDiagnostics`.
Test Plan
--
Existing snapshot tests with diffs
"Why would you do this? This looks like you just replaced `bool` with an
overly complex trait"
Yes that's correct!
This should be a no-op refactoring. It replaces all of the logic in our
assignability, subtyping, equivalence, and disjointness methods to work
over an arbitrary `Constraints` trait instead of only working on `bool`.
The methods that `Constraints` provides look very much like what we get
from `bool`. But soon we will add a new impl of this trait, and some new
methods, that let us express "fuzzy" constraints that aren't always true
or false. (In particular, a constraint will express the upper and lower
bounds of the allowed specializations of a typevar.)
Even once we have that, most of the operations that we perform on
constraint sets will be the usual boolean operations, just on sets.
(`false` becomes empty/never; `true` becomes universe/always; `or`
becomes union; `and` becomes intersection; `not` becomes negation.) So
it's helpful to have this separate PR to refactor how we invoke those
operations without introducing the new functionality yet.
Note that we also have translations of `Option::is_some_and` and
`is_none_or`, and of `Iterator::any` and `all`, and that the `and`,
`or`, `when_any`, and `when_all` methods are meant to short-circuit,
just like the corresponding boolean operations. For constraint sets,
that depends on being able to implement the `is_always` and `is_never`
trait methods.
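A rough sketch of what such a trait could look like, with `bool` as the only impl. The method names beyond `and`, `or`, `when_any`, `when_all`, `is_always`, and `is_never` are assumptions, as are all signatures; the real trait in the PR may differ.

```rust
/// Hypothetical shape of a `Constraints`-style trait.
trait Constraints: Sized {
    fn always() -> Self;
    fn never() -> Self;
    fn is_always(&self) -> bool;
    fn is_never(&self) -> bool;
    fn union(self, other: Self) -> Self;
    fn intersect(self, other: Self) -> Self;
    fn negate(self) -> Self;

    /// Short-circuiting `and`: skips `other` if `self` is already never.
    fn and(self, other: impl FnOnce() -> Self) -> Self {
        if self.is_never() { self } else { self.intersect(other()) }
    }

    /// Short-circuiting `or`: skips `other` if `self` is already always.
    fn or(self, other: impl FnOnce() -> Self) -> Self {
        if self.is_always() { self } else { self.union(other()) }
    }
}

impl Constraints for bool {
    fn always() -> Self { true }
    fn never() -> Self { false }
    fn is_always(&self) -> bool { *self }
    fn is_never(&self) -> bool { !*self }
    fn union(self, other: Self) -> Self { self || other }
    fn intersect(self, other: Self) -> Self { self && other }
    fn negate(self) -> Self { !self }
}

/// Translation of `Iterator::any`: union of all results, stopping early.
fn when_any<C: Constraints, T>(
    items: impl IntoIterator<Item = T>,
    mut f: impl FnMut(T) -> C,
) -> C {
    let mut result = C::never();
    for item in items {
        result = result.union(f(item));
        if result.is_always() {
            break;
        }
    }
    result
}

/// Translation of `Iterator::all`: intersection of all results, stopping early.
fn when_all<C: Constraints, T>(
    items: impl IntoIterator<Item = T>,
    mut f: impl FnMut(T) -> C,
) -> C {
    let mut result = C::always();
    for item in items {
        result = result.intersect(f(item));
        if result.is_never() {
            break;
        }
    }
    result
}
```

The short-circuiting only needs `is_always` and `is_never`, which is why a future constraint-set impl has to be able to answer those two questions.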
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
Part of: https://github.com/astral-sh/ty/issues/868
This PR adds a heuristic to avoid argument type expansion if it's going
to eventually lead to no matching overload.
This is done by checking whether the non-expandable argument types are
assignable to the corresponding annotated parameter type. If one of them
is not assignable to all of the remaining overloads, then argument type
expansion isn't going to help.
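A toy version of the check, under strong simplifying assumptions: a hypothetical `Ty` enum, positional-only parameters, only unions counted as expandable, and a deliberately naive assignability relation. It is not ty's implementation, just the shape of the heuristic.

```rust
/// Hypothetical, tiny type model for the sketch.
#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Int,
    Str,
    Object,
    Union(Vec<Ty>),
}

impl Ty {
    /// Assumption: only unions can be expanded.
    fn is_expandable(&self) -> bool {
        matches!(self, Ty::Union(_))
    }

    /// Naive assignability, just enough for the example.
    fn is_assignable_to(&self, target: &Ty) -> bool {
        match (self, target) {
            (_, Ty::Object) => true,
            (Ty::Union(members), _) => members.iter().all(|m| m.is_assignable_to(target)),
            (_, Ty::Union(members)) => members.iter().any(|m| self.is_assignable_to(m)),
            _ => self == target,
        }
    }
}

/// Each overload is modeled as its list of annotated parameter types.
/// Returns `true` if some non-expandable argument fails against every
/// remaining overload, in which case expansion cannot produce a match.
fn expansion_cannot_help(arguments: &[Ty], overloads: &[Vec<Ty>]) -> bool {
    arguments
        .iter()
        .enumerate()
        .filter(|(_, argument)| !argument.is_expandable())
        .any(|(index, argument)| {
            overloads.iter().all(|parameters| {
                parameters
                    .get(index)
                    .map_or(true, |parameter| !argument.is_assignable_to(parameter))
            })
        })
}
```

The check is linear in the number of arguments and overloads, whereas expansion itself can multiply out across every expandable argument.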
## Test Plan
Add mdtest that would otherwise take a long time because of the number
of arguments that it would need to expand (30).
This is a fairly simple but effective way to add docstrings to like 95%
of completions from initial experimentation.
Fixes https://github.com/astral-sh/ty/issues/1036
Although ironically this approach *does not* work specifically for
`print` and I haven't looked into why.
## Summary
Resolves #19561
Fixes the [unnecessary-future-import
(UP010)](https://docs.astral.sh/ruff/rules/unnecessary-future-import/)
rule to correctly identify when imported `__future__` modules are actually
used in the code, preventing false positives.
I assume there is no way to check usage in `analyze::statements`,
because we don't have any usage bindings for imports. To determine
unused imports, we have to fully scan the file to create bindings and
then check usage, similar to [unused-import
(F401)](https://docs.astral.sh/ruff/rules/unused-import/#unused-import-f401).
So, `Rule::UnnecessaryFutureImport` was moved from the
`analyze::statements` to the `analyze::deferred_scopes` stage. This
caused the need to change the logic of future import handling to a
bindings-based approach.
Also, the diagnostic report was changed.
Before
```
|
1 | from __future__ import nested_scopes, generators
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ UP010
```
After
```
|
1 | from __future__ import nested_scopes, generators
| ^^^^^^^^^^^^^ UP010
```
I believe this is the correct way, because `generators` may be used, but
`nested_scopes` is not.
### Special case
I also found one special case:
```python
from __future__ import nested_scopes
nested_scopes = 1
```
Here we can treat `nested_scopes` as an unused import because the
variable `nested_scopes` shadows it and we can safely remove the future
import (my fix does it).
But
[F401](https://docs.astral.sh/ruff/rules/unused-import/#unused-import-f401)
is not triggered in this case
([sandbox](https://play.ruff.rs/296d9c7e-0f02-4659-b0c0-78cc21f3de76)):
```python
from foo import print_function
print_function = 1
```
In my mind, `print_function` here is an unused import and should be
deleted (my IDE highlights it). What do you think?
## Test Plan
Added test cases and snapshots:
- Split test file into separate _0 and _1 files for appropriate checks.
- Added test cases to verify fixes when future imports are used.
---------
Co-authored-by: Igor Drokin <drokinii1017@gmail.com>
This commit corrects the type checker's behavior when handling
`dataclass_transform` decorators that don't explicitly specify
`field_specifiers`. According to [PEP 681 (Data Class
Transforms)](https://peps.python.org/pep-0681/#dataclass-transform-parameters),
when `field_specifiers` is not provided, it defaults to an empty tuple,
meaning no field specifiers are supported and
`dataclasses.field`/`dataclasses.Field` calls should be ignored.
Fixes https://github.com/astral-sh/ty/issues/980
This basically splits `list_modules` into a higher level "aggregation"
routine and a lower level "get modules for one search path" routine.
This permits Salsa to cache the lower level components, e.g., many
search paths refer to directories that rarely change. This saves us
interaction with the system.
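As a rough model of this split (plain `HashMap` caching instead of Salsa, and hypothetical names throughout), the lower-level routine is memoized per search path while the aggregation routine just combines the results:

```rust
use std::collections::HashMap;

/// Hypothetical lister; `directories` stands in for what the system
/// would report for each search path.
struct ModuleLister {
    directories: HashMap<String, Vec<String>>,
    cache: HashMap<String, Vec<String>>,
    /// Counts simulated interactions with the system.
    system_reads: usize,
}

impl ModuleLister {
    /// Lower level: modules for one search path, cached per path.
    fn modules_in(&mut self, search_path: &str) -> Vec<String> {
        if let Some(cached) = self.cache.get(search_path) {
            return cached.clone();
        }
        self.system_reads += 1;
        let mut modules = self
            .directories
            .get(search_path)
            .cloned()
            .unwrap_or_default();
        modules.sort();
        self.cache.insert(search_path.to_string(), modules.clone());
        modules
    }

    /// Higher level: aggregate across search paths; earlier paths win.
    fn list_modules(&mut self, search_paths: &[&str]) -> Vec<String> {
        let mut all = Vec::new();
        for path in search_paths {
            for module in self.modules_in(path) {
                if !all.contains(&module) {
                    all.push(module);
                }
            }
        }
        all
    }

    /// Only the search path that changed needs to be recomputed.
    fn invalidate(&mut self, search_path: &str) {
        self.cache.remove(search_path);
    }
}
```

In this model, a change under one search path costs one extra system read rather than a full re-listing, which is the saving the per-path cache is after.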
This did require a fair bit of surgery in terms of being careful about
adding file roots. Namely, now that we rely even more on file roots
existing for correct handling of cache invalidation, there were several
spots in our code that needed to be updated to add roots (that we
weren't previously doing). This feels Not Great, and it would be better
if we had some kind of abstraction that handled this for us. But it
isn't clear to me at this time what that looks like.
This ensures there is some level of consistency between the APIs.
This did require exposing a couple more things on `Module` for good
error messages. This also motivated a switch to an interned struct
instead of a tracked struct. This ensures that `list_modules` and
`resolve_modules` reuse the same `Module` values when the inputs are the
same.
Ref https://github.com/astral-sh/ruff/pull/19883#discussion_r2272520194
This makes `import <CURSOR>` and `from <CURSOR>` completions work.
This also makes `import os.<CURSOR>` and `from os.<CURSOR>`
completions work. In this case, we are careful to only offer
submodule completions.
These tests were added as a regression check that a panic
didn't occur. So we were asserting a bit more than necessary.
In particular, these will soon return completions for modules,
which creates large snapshots that we don't need.
So modify these to just check there is sensible output that
doesn't panic.
The actual implementation wasn't too bad. It's not long
but pretty fiddly. I copied over the tests from the existing
module resolver and adapted them to work with this API. Then
I added a number of my own tests as well.
Previously, if the module was just `foo-stubs`, we'd skip over
stripping the `-stubs` suffix which would lead to us returning
`None`.
This function is now a little convoluted and could be simpler
if we did an intermediate allocation. But I kept the iterative
approach and added a special case to handle `foo-stubs`.
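A simplified sketch of the intended behavior, not the actual function: it builds the name with an allocation (which the real code avoids), and both helper names are hypothetical. The identifier check is a rough approximation.

```rust
/// Hypothetical helper: turns path components relative to a search path
/// into a dotted module name, stripping `-stubs` from the first component.
fn module_name_from_components(components: &[&str]) -> Option<String> {
    let (&first, rest) = components.split_first()?;
    // Only the top-level package may carry the `-stubs` suffix. This
    // also covers the case where it is the only component.
    let first = first.strip_suffix("-stubs").unwrap_or(first);

    let mut name = String::new();
    for part in std::iter::once(first).chain(rest.iter().copied()) {
        if !is_identifier(part) {
            return None;
        }
        if !name.is_empty() {
            name.push('.');
        }
        name.push_str(part);
    }
    Some(name)
}

/// Rough identifier check; an approximation of Python's rules.
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c.is_alphabetic() || c == '_')
        && chars.all(|c| c.is_alphanumeric() || c == '_')
}
```

The single-component case is the one that previously returned `None`.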
These tests capture existing behavior.
I added these when I stumbled upon what I thought was an
oddity: we prioritize `foo.pyi` over `foo.py`, but
prioritize `foo/__init__.py` over `foo.pyi`.
(I plan to investigate this more closely in follow-up
work. Particularly, to look at other type checkers. It
seems like we may want to change this to always prioritize
stubs.)
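The ordering described above can be written down as a tiny lookup. This sketch encodes only the three candidates mentioned here and makes no claim about other candidates the real resolver considers; `resolve_in_search_path` is a hypothetical name.

```rust
/// Returns the file that wins for module `name`, given the relative
/// paths that exist in a single search path. Candidates are listed in
/// the priority order captured by the tests.
fn resolve_in_search_path(name: &str, existing: &[&str]) -> Option<String> {
    let candidates = vec![
        format!("{name}/__init__.py"), // package beats the stub file
        format!("{name}.pyi"),         // stub file beats the source file
        format!("{name}.py"),
    ];
    candidates
        .into_iter()
        .find(|candidate| existing.contains(&candidate.as_str()))
}
```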