mirror of
https://github.com/astral-sh/ruff.git
synced 2025-09-28 12:55:05 +00:00

Implements definition-level type inference, with basic control flow (only if statements and if expressions so far) in Salsa.

There are a couple of key ideas here:

1) We can do type inference queries at any of three region granularities: an entire scope, a single definition, or a single expression. These are represented by the `InferenceRegion` enum, and the entry points are the salsa queries `infer_scope_types`, `infer_definition_types`, and `infer_expression_types`. Generally per-scope will be used for scopes that we are directly checking and per-definition will be used any time we are looking up symbol types from another module/scope. Per-expression should be uncommon: used only for the RHS of an unpacking or multi-target assignment (to avoid re-inferring the RHS once per symbol defined in the assignment) and for test nodes in type narrowing (e.g. the `test` of an `If` node). All three queries return a `TypeInference` with a map of types for all definitions and expressions within their region. If you do e.g. scope-level inference, when it hits a definition, or an independently-inferable expression, it should use the relevant query (which may already be cached) to get all types within the smaller region. This avoids double-inferring smaller regions, even though larger regions encompass smaller ones.

2) Instead of building a control-flow graph and lazily traversing it to find definitions which reach a use of a name (which is O(n^2) in the worst case), semantic indexing builds a use-def map, where every use of a name knows which definitions can reach that use. We also no longer track all definitions of a symbol in the symbol itself; instead the use-def map also records which defs remain visible at the end of the scope, and considers these the publicly-visible definitions of the symbol (see below).
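
To make the use-def map idea in item 2 concrete, here is a minimal sketch. It is not the code from this PR: `UseDefMapBuilder`, `UseDefMap`, `DefId`, `UseId`, and `FlowState` are hypothetical names chosen for illustration, symbols are keyed by plain strings, only an `if` without `else` is modelled, and unbound-ness is not tracked.

```rust
use std::collections::{BTreeMap, BTreeSet};

// Hypothetical IDs; the real index types are not shown in this commit message.
type DefId = u32;
type UseId = u32;

/// Definitions of each symbol that are visible at one point in control flow.
type FlowState = BTreeMap<String, BTreeSet<DefId>>;

#[derive(Default)]
struct UseDefMapBuilder {
    current: FlowState,
    definitions_by_use: BTreeMap<UseId, BTreeSet<DefId>>,
    next_def: DefId,
    next_use: UseId,
}

struct UseDefMap {
    /// For every use of a name: the definitions that can reach it.
    definitions_by_use: BTreeMap<UseId, BTreeSet<DefId>>,
    /// Definitions still visible at the end of the scope.
    public_definitions: FlowState,
}

impl UseDefMapBuilder {
    /// A new definition replaces all earlier visible definitions of the name.
    fn record_definition(&mut self, name: &str) -> DefId {
        let id = self.next_def;
        self.next_def += 1;
        self.current.insert(name.to_string(), BTreeSet::from([id]));
        id
    }

    /// A use remembers the definitions visible at this point.
    fn record_use(&mut self, name: &str) -> UseId {
        let id = self.next_use;
        self.next_use += 1;
        let visible = self.current.get(name).cloned().unwrap_or_default();
        self.definitions_by_use.insert(id, visible);
        id
    }

    /// Capture the state before a branch.
    fn snapshot(&self) -> FlowState {
        self.current.clone()
    }

    /// Join another control-flow path into the current one (union per symbol).
    fn merge(&mut self, other: FlowState) {
        for (name, defs) in other {
            self.current.entry(name).or_default().extend(defs);
        }
    }

    fn finish(self) -> UseDefMap {
        UseDefMap {
            definitions_by_use: self.definitions_by_use,
            public_definitions: self.current,
        }
    }
}

impl UseDefMap {
    fn use_definitions(&self, use_id: UseId) -> Vec<DefId> {
        self.definitions_by_use
            .get(&use_id)
            .map(|defs| defs.iter().copied().collect())
            .unwrap_or_default()
    }

    fn public_definitions_of(&self, name: &str) -> Vec<DefId> {
        self.public_definitions
            .get(name)
            .map(|defs| defs.iter().copied().collect())
            .unwrap_or_default()
    }
}

/// Builds the map for this Python scope:
///     x = 1        # definition 0
///     if flag:
///         x = 2    # definition 1
///     print(x)     # use 0
///     x = 3        # definition 2
fn demo() -> UseDefMap {
    let mut builder = UseDefMapBuilder::default();
    builder.record_definition("x");
    let before_if = builder.snapshot();
    builder.record_definition("x"); // body of the `if`
    builder.merge(before_if); // the path that skips the `if` body
    builder.record_use("x");
    builder.record_definition("x");
    builder.finish()
}
```

In this sketch the use of `x` after the `if` sees definitions 0 and 1, while only definition 2 survives to the end of the scope. Each use is resolved once during the single indexing pass, so nothing has to walk a control-flow graph at lookup time.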

Major items left as TODOs in this PR, to be done in follow-up PRs:

1) Free/global references aren't supported yet (only lookup based on definitions in current scope), which means the override-check example doesn't currently work. This is the first thing I'll fix as follow-up to this PR.

2) Control flow outside of if statements and expressions.

3) Type narrowing.

There are also some smaller relevant changes here:

1) Eliminate `Option` in the return type of member lookups; instead always return `Type::Unbound` for a name we can't find. Also use `Type::Unbound` for modules we can't resolve (not 100% sure about this one yet.)

2) Eliminate the use of the terms "public" and "root" to refer to module-global scope or symbols. Instead consistently use the term "module-global". It's longer, but it's the clearest, and the most consistent with typical Python terminology. In particular I don't like "public" for this use because it has other implications around author intent (is an underscore-prefixed module-global symbol "public"?). And "root" is just not commonly used for this in Python.

3) Eliminate the `PublicSymbol` Salsa ingredient. Many non-module-global symbols can also be seen from other scopes (e.g. by a free var in a nested scope, or by class attribute access), and thus need to have a "public type" (that is, the type not as seen from a particular use in the control flow of the same scope, but the type as seen from some other scope.) So all symbols need to have a "public type" (here I want to keep the use of the term "public", unless someone has a better term to suggest -- since it's "public type of a symbol" and not "public symbol" the confusion with e.g. initial underscores is less of an issue.) At least initially, I would like to try not having special handling for module-global symbols vs other symbols.

4) Switch to using "definitions that reach end of scope" rather than "all definitions" in determining the public type of a symbol. I'm convinced that in general this is the right way to go. We may want to refine this further in future for some free-variable cases, but it can be changed purely by making changes to the building of the use-def map (the `public_definitions` index in it), without affecting any other code. One consequence of combining this with no control-flow support (just last-definition-wins) is that some inference tests now give more wrong-looking results; I left TODO comments on these tests to fix them when control flow is added.

And some potential areas for consideration in the future:

1) Should `symbol_ty` be a Salsa query? This would require making all symbols a Salsa ingredient, and tracking even more dependencies. But it would save some repeated reconstruction of unions, for symbols with multiple public definitions. For now I'm not making it a query, but open to changing this in future with actual perf evidence that it's better.

183 lines
3.9 KiB
Rust
use crate::slice::IndexSlice;
use crate::Idx;
use std::borrow::{Borrow, BorrowMut};
use std::fmt::{Debug, Formatter};
use std::marker::PhantomData;
use std::ops::{Deref, DerefMut, RangeBounds};

/// An owned sequence of `T` indexed by `I`
#[derive(Clone, PartialEq, Eq, Hash)]
#[repr(transparent)]
pub struct IndexVec<I, T> {
    pub raw: Vec<T>,
    index: PhantomData<I>,
}

impl<I: Idx, T> IndexVec<I, T> {
    #[inline]
    pub fn new() -> Self {
        Self {
            raw: Vec::new(),
            index: PhantomData,
        }
    }

    #[inline]
    pub fn with_capacity(capacity: usize) -> Self {
        Self {
            raw: Vec::with_capacity(capacity),
            index: PhantomData,
        }
    }

    #[inline]
    pub fn from_raw(raw: Vec<T>) -> Self {
        Self {
            raw,
            index: PhantomData,
        }
    }

    #[inline]
    pub fn drain<R: RangeBounds<usize>>(&mut self, range: R) -> impl Iterator<Item = T> + '_ {
        self.raw.drain(range)
    }

    #[inline]
    pub fn truncate(&mut self, a: usize) {
        self.raw.truncate(a);
    }

    #[inline]
    pub fn as_slice(&self) -> &IndexSlice<I, T> {
        IndexSlice::from_raw(&self.raw)
    }

    #[inline]
    pub fn as_mut_slice(&mut self) -> &mut IndexSlice<I, T> {
        IndexSlice::from_raw_mut(&mut self.raw)
    }

    #[inline]
    pub fn push(&mut self, data: T) -> I {
        let index = self.next_index();
        self.raw.push(data);
        index
    }

    #[inline]
    pub fn next_index(&self) -> I {
        I::new(self.raw.len())
    }

    #[inline]
    pub fn shrink_to_fit(&mut self) {
        self.raw.shrink_to_fit();
    }

    #[inline]
    pub fn resize(&mut self, new_len: usize, value: T)
    where
        T: Clone,
    {
        self.raw.resize(new_len, value);
    }
}

impl<I, T> Debug for IndexVec<I, T>
where
    T: Debug,
{
    fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
        std::fmt::Debug::fmt(&self.raw, f)
    }
}

impl<I: Idx, T> Deref for IndexVec<I, T> {
    type Target = IndexSlice<I, T>;

    fn deref(&self) -> &Self::Target {
        self.as_slice()
    }
}

impl<I: Idx, T> DerefMut for IndexVec<I, T> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        self.as_mut_slice()
    }
}

impl<I: Idx, T> Borrow<IndexSlice<I, T>> for IndexVec<I, T> {
    fn borrow(&self) -> &IndexSlice<I, T> {
        self
    }
}

impl<I: Idx, T> BorrowMut<IndexSlice<I, T>> for IndexVec<I, T> {
    fn borrow_mut(&mut self) -> &mut IndexSlice<I, T> {
        self
    }
}

impl<I, T> Extend<T> for IndexVec<I, T> {
    #[inline]
    fn extend<Iter: IntoIterator<Item = T>>(&mut self, iter: Iter) {
        self.raw.extend(iter);
    }
}

impl<I: Idx, T> FromIterator<T> for IndexVec<I, T> {
    #[inline]
    fn from_iter<Iter: IntoIterator<Item = T>>(iter: Iter) -> Self {
        Self::from_raw(Vec::from_iter(iter))
    }
}

impl<I: Idx, T> IntoIterator for IndexVec<I, T> {
    type IntoIter = std::vec::IntoIter<T>;
    type Item = T;

    #[inline]
    fn into_iter(self) -> std::vec::IntoIter<T> {
        self.raw.into_iter()
    }
}

impl<'a, I: Idx, T> IntoIterator for &'a IndexVec<I, T> {
    type IntoIter = std::slice::Iter<'a, T>;
    type Item = &'a T;

    #[inline]
    fn into_iter(self) -> std::slice::Iter<'a, T> {
        self.iter()
    }
}

impl<'a, I: Idx, T> IntoIterator for &'a mut IndexVec<I, T> {
    type IntoIter = std::slice::IterMut<'a, T>;
    type Item = &'a mut T;

    #[inline]
    fn into_iter(self) -> std::slice::IterMut<'a, T> {
        self.iter_mut()
    }
}

impl<I: Idx, T> Default for IndexVec<I, T> {
    #[inline]
    fn default() -> Self {
        IndexVec::new()
    }
}

impl<I: Idx, T, const N: usize> From<[T; N]> for IndexVec<I, T> {
    #[inline]
    fn from(array: [T; N]) -> Self {
        IndexVec::from_raw(array.into())
    }
}

// Whether `IndexVec` is `Send` depends only on the data,
// not the phantom data.
#[allow(unsafe_code)]
unsafe impl<I: Idx, T> Send for IndexVec<I, T> where T: Send {}
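
The typed-index pattern in this file can be tried in isolation with the sketch below. It is a standalone approximation rather than the crate's API: the `Idx` trait is reduced to an assumed `new`/`index` pair (the file above only shows `I::new(usize)` being called), `ScopeId` is a hypothetical index type, and indexing is implemented directly on the vector here, whereas the real type reaches its slice operations through `Deref` to `IndexSlice`.

```rust
use std::marker::PhantomData;
use std::ops::Index;

/// Assumed, simplified shape of the `Idx` trait.
trait Idx: Copy {
    fn new(value: usize) -> Self;
    fn index(self) -> usize;
}

/// Hypothetical index type for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ScopeId(usize);

impl Idx for ScopeId {
    fn new(value: usize) -> Self {
        ScopeId(value)
    }

    fn index(self) -> usize {
        self.0
    }
}

/// Cut-down stand-in for `IndexVec`: a `Vec<T>` that hands out typed indices.
struct IndexVec<I, T> {
    raw: Vec<T>,
    index: PhantomData<I>,
}

impl<I: Idx, T> IndexVec<I, T> {
    fn new() -> Self {
        Self {
            raw: Vec::new(),
            index: PhantomData,
        }
    }

    /// The index the next pushed element will receive.
    fn next_index(&self) -> I {
        I::new(self.raw.len())
    }

    /// Appends `data` and returns the typed index it was stored at.
    fn push(&mut self, data: T) -> I {
        let index = self.next_index();
        self.raw.push(data);
        index
    }
}

// Only an `I` can index this vector, so indices of different
// vectors cannot be mixed up without a compile error.
impl<I: Idx, T> Index<I> for IndexVec<I, T> {
    type Output = T;

    fn index(&self, id: I) -> &T {
        &self.raw[id.index()]
    }
}

fn demo() -> (ScopeId, ScopeId, String) {
    let mut scopes: IndexVec<ScopeId, String> = IndexVec::new();
    let module = scopes.push("module".to_string());
    let function = scopes.push("function".to_string());
    (module, function, scopes[function].clone())
}
```

`PhantomData<I>` carries the index type at no runtime cost, which is why the wrapper in the file above can be `#[repr(transparent)]` over the underlying `Vec<T>`.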