ruff/crates/ty_python_semantic/src/lib.rs
Dhruv Manilawala 78054824c0
[ty] Add support for __all__ (#17856)
## Summary

This PR adds support for the `__all__` module variable.

Reference spec:
https://typing.python.org/en/latest/spec/distributing.html#library-interface-public-and-private-symbols

This PR adds a new `dunder_all_names` query that returns a set of
`Name`s defined in the `__all__` variable of the given `File`. The query
works by implementing the `StatementVisitor` and collects all the names
by recognizing the supported idioms as mentioned in the spec. Any idiom
that's not recognized are ignored.

The current implementation is minimum to what's required for us to
remove all the false positives that this is causing. Refer to the
"Follow-ups" section below to see what we can do next. I'll a open
separate issue to keep track of them.

Closes: astral-sh/ty#106 
Closes: astral-sh/ty#199

### Follow-ups

* Diagnostics:
* Add warning diagnostics for unrecognized `__all__` idioms, `__all__`
containing non-string element
* Add an error diagnostic for elements that are present in `__all__` but
not defined in the module. This could lead to runtime error
* Maybe we should return `<type>` instead of `Unknown | <type>` for
`module.__all__`. For example:
https://playknot.ruff.rs/2a6fe5d7-4e16-45b1-8ec3-d79f2d4ca894
* Mark a symbol that's mentioned in `__all__` as used otherwise it could
raise (possibly in the future) "unused-name" diagnostic

Supporting diagnostics will require that we update the return type of
the query to be something other than `Option<FxHashSet<Name>>`,
something that behaves like a result and provides a way to check whether
a name exists in `__all__`, loop over elements in `__all__`, loop over
the invalid elements, etc.

## Ecosystem analysis

The following are the maximum amount of diagnostics **removed** in the
ecosystem:

* "Type <module '...'> has no attribute ..."
    * `collections.abc` - 14
    * `numpy` - 35534
    * `numpy.ma` - 296
    * `numpy.char` - 37
    * `numpy.testing` - 175
    * `hashlib` - 311
    * `scipy.fft` - 2
    * `scipy.stats` - 38
* "Module '...' has no member ..."
    * `collections.abc` - 85
    * `numpy` - 508
    * `numpy.testing` - 741
    * `hashlib` - 36
    * `scipy.stats` - 68
    * `scipy.interpolate` - 7
    * `scipy.signal` - 5

The following modules have dynamic `__all__` definition, so `ty` assumes
that `__all__` doesn't exists in that module:
* `scipy.stats`
(95a5d6ea8b/scipy/stats/__init__.py (L665))
* `scipy.interpolate`
(95a5d6ea8b/scipy/interpolate/__init__.py (L221))
* `scipy.signal` (indirectly via
95a5d6ea8b/scipy/signal/_signal_api.py (L30))
* `numpy.testing`
(de784cd6ee/numpy/testing/__init__.py (L16-L18))

~There's this one category of **false positives** that have been added:~
Fixed the false positives by also ignoring `__all__` from a module that
uses unrecognized idioms.

<details><summary>Details about the false postivie:</summary>
<p>

The `scipy.stats` module has dynamic `__all__` and it imports a bunch of
symbols via star imports. Some of those modules have a mix of valid and
invalid `__all__` idioms. For example, in
95a5d6ea8b/scipy/stats/distributions.py (L18-L24),
2 out of 4 `__all__` idioms are invalid but currently `ty` recognizes
two of them and says that the module has a `__all__` with 5 values. This
leads to around **2055** newly added false positives of the form:
```
Type <module 'scipy.stats'> has no attribute ...
```

I think the fix here is to completely ignore `__all__`, not only if
there are invalid elements in it, but also if there are unrecognized
idioms used in the module.

</p>
</details> 

## Test Plan

Add a bunch of test cases using the new `ty_extensions.dunder_all_names`
function to extract a module's `__all__` names.

Update various test cases to remove false positives around `*` imports
and re-export convention.

Add new test cases for named import behavior as `*` imports covers all
of it already (thanks Alex!).
2025-05-07 21:42:42 +05:30

53 lines
1.6 KiB
Rust

use std::hash::BuildHasherDefault;
use rustc_hash::FxHasher;
use crate::lint::{LintRegistry, LintRegistryBuilder};
use crate::suppression::{INVALID_IGNORE_COMMENT, UNKNOWN_RULE, UNUSED_IGNORE_COMMENT};
pub use db::Db;
pub use module_name::ModuleName;
pub use module_resolver::{resolve_module, system_module_search_paths, KnownModule, Module};
pub use program::{Program, ProgramSettings, PythonPath, SearchPathSettings};
pub use python_platform::PythonPlatform;
pub use semantic_model::{HasType, SemanticModel};
pub use site_packages::SysPrefixPathOrigin;
pub mod ast_node_ref;
mod db;
mod dunder_all;
pub mod lint;
pub(crate) mod list;
mod module_name;
mod module_resolver;
mod node_key;
mod program;
mod python_platform;
pub mod semantic_index;
mod semantic_model;
pub(crate) mod site_packages;
mod suppression;
pub(crate) mod symbol;
pub mod types;
mod unpack;
mod util;
type FxOrderSet<V> = ordermap::set::OrderSet<V, BuildHasherDefault<FxHasher>>;
/// Returns the default registry with all known semantic lints.
pub fn default_lint_registry() -> &'static LintRegistry {
static REGISTRY: std::sync::LazyLock<LintRegistry> = std::sync::LazyLock::new(|| {
let mut registry = LintRegistryBuilder::default();
register_lints(&mut registry);
registry.build()
});
&REGISTRY
}
/// Register all known semantic lints.
pub fn register_lints(registry: &mut LintRegistryBuilder) {
types::register_lints(registry);
registry.register_lint(&UNUSED_IGNORE_COMMENT);
registry.register_lint(&UNKNOWN_RULE);
registry.register_lint(&INVALID_IGNORE_COMMENT);
}