## Summary
This applies the trick that we use for `builtins.open` to similar
functions that have the same problem. The reason is that the problem
would otherwise become even more pronounced once we add understanding of
the implicit type of `self` parameters, because then something like
`(base_path / "test.bin").open("rb")` also leads to a wrong return type
and can result in false positives.
## Test Plan
New Markdown tests
The names of the submodules returned should be *complete*. This
is the contract of `Module::name`. However, we were previously
only returning the basename of the submodule.
This re-works the `all_symbols` based added previously to work across
all modules available, and not just what is directly in the workspace.
Note that we always pass an empty string as a query, which makes the
results always empty. We'll fix this in a subsequent commit.
This is to facilitate recursive traversal of all modules in an
environment. This way, we can keep asking for submodules.
This also simplifies how this is used in completions, and probably makes
it faster. Namely, since we return the `Module` itself, callers don't
need to invoke the full module resolver just to get the module type.
Note that this doesn't include namespace packages. (Which were
previously not supported in `Module::all_submodules`.) Given how they
can be spread out across multiple search paths, they will likely require
special consideration here.
This is similar to a change made in the "list top-level modules"
implementation that had been masked by poor Salsa failure modes.
Basically, if we can't find a root here, it *must* be a bug. And if we
just silently skip over it, we risk voiding Salsa's purity contract,
leading to more difficult to debug panics.
This did cause one test to fail, but only because the test wasn't
properly setting up roots.
## Summary
We use the `System` abstraction in ty to abstract away the host/system
on which ty runs.
This has a few benefits:
* Tests can run in full isolation using a memory system (that uses an
in-memory file system)
* The LSP has a custom implementation where `read_to_string` returns the
content as seen by the editor (e.g. unsaved changes) instead of always
returning the content as it is stored on disk
* We don't require any file system polyfills for wasm in the browser
However, it does require extra care that we don't accidentally use
`std::fs` or `std::env` (etc.) methods in ty's code base (which is very
easy).
This PR sets up Clippy and disallows the most common methods, instead
pointing users towards the corresponding `System` methods.
The setup is a bit awkward because clippy doesn't support inheriting
configurations. That means, a crate can only override the entire
workspace configuration or not at all.
The approach taken in this PR is:
* Configure the disallowed methods at the workspace level
* Allow `disallowed_methods` at the workspace level
* Enable the lint at the crate level using the warn attribute (in code)
The obvious downside is that it won't work if we ever want to disallow
other methods, but we can figure that out once we reach that point.
What about false positives: Just add an `allow` and move on with your
life :) This isn't something that we have to enforce strictly; the goal
is to catch accidental misuse.
## Test Plan
Clippy found a place where we incorrectly used `std::fs::read_to_string`
This basically splits `list_modules` into a higher level "aggregation"
routine and a lower level "get modules for one search path" routine.
This permits Salsa to cache the lower level components, e.g., many
search paths refer to directories that rarely change. This saves us
interaction with the system.
This did require a fair bit of surgery in terms of being careful about
adding file roots. Namely, now that we rely even more on file roots
existing for correct handling of cache invalidation, there were several
spots in our code that needed to be updated to add roots (that we
weren't previously doing). This feels Not Great, and it would be better
if we had some kind of abstraction that handled this for us. But it
isn't clear to me at this time what that looks like.
This ensures there is some level of consistency between the APIs.
This did require exposing a couple more things on `Module` for good
error messages. This also motivated a switch to an interned struct
instead of a tracked struct. This ensures that `list_modules` and
`resolve_modules` reuse the same `Module` values when the inputs are the
same.
Ref https://github.com/astral-sh/ruff/pull/19883#discussion_r2272520194
The actual implementation wasn't too bad. It's not long
but pretty fiddly. I copied over the tests from the existing
module resolver and adapted them to work with this API. Then
I added a number of my own tests as well.
Previously, if the module was just `foo-stubs`, we'd skip over
stripping the `-stubs` suffix which would lead to us returning
`None`.
This function is now a little convoluted and could be simpler
if we did an intermediate allocation. But I kept the iterative
approach and added a special case to handle `foo-stubs`.
These tests capture existing behavior.
I added these when I stumbled upon what I thought was an
oddity: we prioritize `foo.pyi` over `foo.py`, but
prioritize `foo/__init__.py` over `foo.pyi`.
(I plan to investigate this more closely in follow-up
work. Particularly, to look at other type checkers. It
seems like we may want to change this to always prioritize
stubs.)
In implementing partial stubs I had observed that this continue in the
namespace package code seemed erroneous since the same continue for
partial stubs didn't work. Unfortunately I wasn't confident enough to
push on that hunch. Fortunately I remembered that hunch to make this an
easy fix.
The issue with the continue is that it bails out of the current
search-path without testing any .py files. This breaks when for example
`google` and `google-stubs`/`types-google` are both in the same
site-packages dir -- failing to find a module in `types-google` has us
completely skip over `google`!
Fixes https://github.com/astral-sh/ty/issues/520
by using essentially the same logic for system site-packages, on the
assumption that system site-packages are always a subdir of the stdlib
we were looking for.
The diagram is written in the Dot language, which can
be converted to SVG (or any other image) by GraphViz.
I thought it was a good idea to write this down in
preparation for adding routines that list modules.
Code reuse is likely to be difficult and I wanted to
be sure I understood how it worked.
I mostly just did this because the long string literals were annoying
me. And these can make rustfmt give up on formatting.
I also re-flowed some long comment lines while I was here.
I'm not sure if this used to be used elsewhere, but it no longer is.
And it looks like an internal-only helper function, so just un-export
it.
And note that `ModuleNameIngredient` is also un-exported, so this
function isn't really usable outside of its defining module anyway.
This makes caching of submodules independent of whether `Module`
is itself a Salsa ingredient. In fact, this makes the work done in
the prior commit superfluous. But we're possibly keeping it as an
ingredient for now since it's a bit of a tedious change and we might
need it in the near future.
Ref https://github.com/astral-sh/ruff/pull/19495#pullrequestreview-3045736715
This implements mapping of definitions in stubs to definitions in the
"real" implementation using the approach described in
https://github.com/astral-sh/ty/issues/788#issuecomment-3097000287
I've tested this with goto-definition in vscode with code that uses
`colorama` and `types-colorama`.
Notably this implementation does not add support for stub-mapping stdlib
modules, which can be done as an essentially orthogonal followup in the
implementation of `resolve_real_module`.
Part of https://github.com/astral-sh/ty/issues/788
* [x] basic handling
* [x] parse and discover `@warnings.deprecated` attributes
* [x] associate them with function definitions
* [x] associate them with class definitions
* [x] add a new "deprecated" diagnostic
* [x] ensure diagnostic is styled appropriately for LSPs
(DiagnosticTag::Deprecated)
* [x] functions
* [x] fire on calls
* [x] fire on arbitrary references
* [x] classes
* [x] fire on initializers
* [x] fire on arbitrary references
* [x] methods
* [x] fire on calls
* [x] fire on arbitrary references
* [ ] overloads
* [ ] fire on calls
* [ ] fire on arbitrary references(??? maybe not ???)
* [ ] only fire if the actual selected overload is deprecated
* [ ] dunder desugarring (warn on deprecated `__add__` if `+` is
invoked)
* [ ] alias supression? (don't warn on uses of variables that deprecated
items were assigned to)
* [ ] import logic
* [x] fire on imports of deprecated items
* [ ] suppress subsequent diagnostics if the import diagnostic fired (is
this handled by alias supression?)
* [x] fire on all qualified references (`module.mydeprecated`)
* [x] fire on all references that depend on a `*` import
Fixes https://github.com/astral-sh/ty/issues/153
This change makes it so we aren't doing a directory traversal every time
we ask for completions from a module. Specifically, submodules that
aren't attributes of their parent module can only be discovered by
looking at the directory tree. But we want to avoid doing a directory
scan unless we think there are changes.
To make this work, this change does a little bit of surgery to
`FileRoot`. Previously, a `FileRoot` was only used for library search
paths. Its revision was bumped whenever a file in that tree was added,
deleted or even modified (to support the discovery of `pth` files and
changes to its contents). This generally seems fine since these are
presumably dependency paths that shouldn't change frequently.
In this change, we add a `FileRoot` for the project. But having the
`FileRoot`'s revision bumped for every change in the project makes
caching based on that `FileRoot` rather ineffective. That is, cache
invalidation will occur too aggressively. To the point that there is
little point in adding caching in the first place. To mitigate this, a
`FileRoot`'s revision is only bumped on a change to a child file's
contents when the `FileRoot` is a `LibrarySearchPath`. Otherwise, we
only bump the revision when a file is created or added.
The effect is that, at least in VS Code, when a new module is added or
removed, this change is picked up and the cache is properly invalidated.
Other LSP clients with worse support for file watching (which seems to
be the case for the CoC vim plugin that I use) don't work as well. Here,
the cache is less likely to be invalidated which might cause completions
to have stale results. Unless there's an obvious way to fix or improve
this, I propose punting on improvements here for now.
## Summary
Add a new `Type::EnumLiteral(…)` variant and infer this type for member
accesses on enums.
**Example**: No more `@Todo` types here:
```py
from enum import Enum
class Answer(Enum):
YES = 1
NO = 2
def is_yes(self) -> bool:
return self == Answer.YES
reveal_type(Answer.YES) # revealed: Literal[Answer.YES]
reveal_type(Answer.YES == Answer.NO) # revealed: Literal[False]
reveal_type(Answer.YES.is_yes()) # revealed: bool
```
## Test Plan
* Many new Markdown tests for the new type variant
* Added enum literal types to property tests, ran property tests
## Ecosystem analysis
Summary:
Lots of false positives removed. All of the new diagnostics are
either new true positives (the majority) or known problems. Click for
detailed analysis</summary>
Details:
```diff
AutoSplit (https://github.com/Toufool/AutoSplit)
+ error[call-non-callable] src/capture_method/__init__.py:137:9: Method `__getitem__` of type `bound method CaptureMethodDict.__getitem__(key: Never, /) -> type[CaptureMethodBase]` is not callable on object of type `CaptureMethodDict`
+ error[call-non-callable] src/capture_method/__init__.py:147:9: Method `__getitem__` of type `bound method CaptureMethodDict.__getitem__(key: Never, /) -> type[CaptureMethodBase]` is not callable on object of type `CaptureMethodDict`
+ error[call-non-callable] src/capture_method/__init__.py:148:1: Method `__getitem__` of type `bound method CaptureMethodDict.__getitem__(key: Never, /) -> type[CaptureMethodBase]` is not callable on object of type `CaptureMethodDict`
```
New true positives. That `__getitem__` method is apparently annotated
with `Never` to prevent developers from using it.
```diff
dd-trace-py (https://github.com/DataDog/dd-trace-py)
+ error[invalid-assignment] ddtrace/vendor/psutil/_common.py:29:5: Object of type `None` is not assignable to `Literal[AddressFamily.AF_INET6]`
+ error[invalid-assignment] ddtrace/vendor/psutil/_common.py:33:5: Object of type `None` is not assignable to `Literal[AddressFamily.AF_UNIX]`
```
Arguably true positives:
e0a772c28b/ddtrace/vendor/psutil/_common.py (L29)
```diff
ignite (https://github.com/pytorch/ignite)
+ error[invalid-argument-type] tests/ignite/engine/test_custom_events.py:190:34: Argument to bound method `__call__` is incorrect: Expected `((...) -> Unknown) | None`, found `Literal["123"]`
+ error[invalid-argument-type] tests/ignite/engine/test_custom_events.py:220:37: Argument to function `default_event_filter` is incorrect: Expected `Engine`, found `None`
+ error[invalid-argument-type] tests/ignite/engine/test_custom_events.py:220:43: Argument to function `default_event_filter` is incorrect: Expected `int`, found `None`
+ error[call-non-callable] tests/ignite/engine/test_custom_events.py:561:9: Object of type `CustomEvents` is not callable
+ error[invalid-argument-type] tests/ignite/metrics/test_frequency.py:50:38: Argument to bound method `attach` is incorrect: Expected `Events`, found `CallableEventWithFilter`
```
All true positives. Some of them are inside `pytest.raises(TypeError,
…)` blocks 🙃
```diff
meson (https://github.com/mesonbuild/meson)
+ error[invalid-argument-type] unittests/internaltests.py:243:51: Argument to bound method `__init__` is incorrect: Expected `bool`, found `Literal[MachineChoice.HOST]`
+ error[invalid-argument-type] unittests/internaltests.py:271:51: Argument to bound method `__init__` is incorrect: Expected `bool`, found `Literal[MachineChoice.HOST]`
```
New true positives. Enum literals can not be assigned to `bool`, even if
their value types are `0` and `1`.
```diff
poetry (https://github.com/python-poetry/poetry)
+ error[invalid-assignment] src/poetry/console/exceptions.py:101:5: Object of type `Literal[""]` is not assignable to `InitVar[str]`
```
New false positive, missing support for `InitVar`.
```diff
prefect (https://github.com/PrefectHQ/prefect)
+ error[invalid-argument-type] src/integrations/prefect-dask/tests/test_task_runners.py:193:17: Argument is incorrect: Expected `StateType`, found `Literal[StateType.COMPLETED]`
```
This is confusing. There are two definitions
([one](74d8cd93ee/src/prefect/client/schemas/objects.py (L89-L100)),
[two](https://github.com/PrefectHQ/prefect/blob/main/src/prefect/server/schemas/states.py#L40))
of the `StateType` enum. Here, we're trying to assign one to the other.
I don't think that should be allowed, so this is a true positive (?).
```diff
python-htmlgen (https://github.com/srittau/python-htmlgen)
+ error[invalid-assignment] test_htmlgen/form.py:51:9: Object of type `str` is not assignable to attribute `autocomplete` of type `Autocomplete | None`
+ error[invalid-assignment] test_htmlgen/video.py:38:9: Object of type `str` is not assignable to attribute `preload` of type `Preload | None`
```
True positives. [The stubs are
wrong](01e3b911ac/htmlgen/form.pyi (L8-L10)).
These should not contain type annotations, but rather just `OFF = ...`.
```diff
rotki (https://github.com/rotki/rotki)
+ error[invalid-argument-type] rotkehlchen/tests/unit/test_serialization.py:62:30: Argument to bound method `deserialize` is incorrect: Expected `str`, found `Literal[15]`
```
New true positive.
```diff
vision (https://github.com/pytorch/vision)
+ error[unresolved-attribute] test/test_extended_models.py:302:17: Type `type[WeightsEnum]` has no attribute `DEFAULT`
+ error[unresolved-attribute] test/test_extended_models.py:302:58: Type `type[WeightsEnum]` has no attribute `DEFAULT`
```
Also new true positives. No `DEFAULT` member exists on `WeightsEnum`.
While we did previously support submodule completions via our
`all_members` API, that only works when submodules are attributes of
their parent module. For example, `os.path`. But that didn't work when
the submodule was not an attribute of its parent. For example,
`http.client`. To make the latter work, we read the directory of the
parent module to discover its submodules.
## Summary
Setting `TY_MEMORY_REPORT=full` will generate and print a memory usage
report to the CLI after a `ty check` run:
```
=======SALSA STRUCTS=======
`Definition` metadata=7.24MB fields=17.38MB count=181062
`Expression` metadata=4.45MB fields=5.94MB count=92804
`member_lookup_with_policy_::interned_arguments` metadata=1.97MB fields=2.25MB count=35176
...
=======SALSA QUERIES=======
`File -> ty_python_semantic::semantic_index::SemanticIndex`
metadata=11.46MB fields=88.86MB count=1638
`Definition -> ty_python_semantic::types::infer::TypeInference`
metadata=24.52MB fields=86.68MB count=146018
`File -> ruff_db::parsed::ParsedModule`
metadata=0.12MB fields=69.06MB count=1642
...
=======SALSA SUMMARY=======
TOTAL MEMORY USAGE: 577.61MB
struct metadata = 29.00MB
struct fields = 35.68MB
memo metadata = 103.87MB
memo fields = 409.06MB
```
Eventually, we should integrate these numbers into CI in some form. The
one limitation currently is that heap allocations in salsa structs (e.g.
interned values) are not tracked, but memoized values should have full
coverage. We may also want a peak memory usage counter (that accounts
for non-salsa memory), but that is relatively simple to profile manually
(e.g. `time -v ty check`) and would require a compile-time option to
avoid runtime overhead.