## Summary
This PR builds on the changes in #16220 to pass a target Python version
to the parser. It also adds the `Parser::unsupported_syntax_errors` field, which
collects version-related syntax errors while parsing. These syntax
errors are then turned into `Message`s in ruff (in preview mode).
This PR only detects one syntax error (`match` statement before Python
3.10), but it has been pretty quick to extend to several other simple
errors (see #16308 for example).
## Test Plan
The current tests are CLI tests in the linter crate, but these could be
supplemented with inline parser tests after #16357.
I also tested the display of these syntax errors in VS Code:


---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
In https://github.com/astral-sh/ruff/pull/16306#discussion_r1966290700,
@carljm pointed out that #16306 introduced a terminology problem, with
too many things called a "constraint". This is a follow-up PR that
renames `Constraint` to `Predicate` to hopefully clear things up a bit.
So now we have that:
- a _predicate_ is a Python expression that might influence type
inference
- a _narrowing constraint_ is a list of predicates that constraint the
type of a binding that is visible at a use
- a _visibility constraint_ is a ternary formula of predicates that
define whether a binding is visible or a statement is reachable
This is a pure renaming, with no behavioral changes.
## Summary
Model dunder-calls correctly (and in one single place), by implementing
this behavior (using `__getitem__` as an example).
```py
def getitem_desugared(obj: object, key: object) -> object:
getitem_callable = find_in_mro(type(obj), "__getitem__")
if hasattr(getitem_callable, "__get__"):
getitem_callable = getitem_callable.__get__(obj, type(obj))
return getitem_callable(key)
```
See the new `calls/dunder.md` test suite for more information. The new
behavior also needs much fewer lines of code (the diff is positive due
to new tests).
## Test Plan
New tests; fix TODOs in existing tests.
This PR adds an implementation of [association
lists](https://en.wikipedia.org/wiki/Association_list), and uses them to
replace the previous `BitSet`/`SmallVec` representation for narrowing
constraints.
An association list is a linked list of key/value pairs. We additionally
guarantee that the elements of an association list are sorted (by their
keys), and that they do not contain any entries with duplicate keys.
Association lists have fallen out of favor in recent decades, since you
often need operations that are inefficient on them. In particular,
looking up a random element by index is O(n), just like a linked list;
and looking up an element by key is also O(n), since you must do a
linear scan of the list to find the matching element. Luckily we don't
need either of those operations for narrowing constraints!
The typical implementation also suffers from poor cache locality and
high memory allocation overhead, since individual list cells are
typically allocated separately from the heap. We solve that last problem
by storing the cells of an association list in an `IndexVec` arena.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
I am working on a project that uses ruff linters' docs to generate a
fine-tuning dataset for LLMs.
To achieve this, I first ran the command `ruff rule --all
--output-format json` to retrieve all the rules. Then, I parsed the
explanation field to get these 3 consistent sections:
- `Why is this bad?`
- `What it does`
- `Example`
However, during the initial processing, I noticed that the markdown
headings are not that consistent. For instance:
- In most cases, `Use instead` appears as a normal paragraph within the
`Example` section, but in the file
`crates/ruff_linter/src/rules/flake8_bandit/rules/django_extra.rs` it is
a level-2 heading
- The heading "What it does**?**" is used in some places, while others
consistently use "What it does"
- There are 831 `Example` headings and 65 `Examples`. But all of them
only have one example case
This PR normalized these across all rules.
## Test Plan
CI are passed.
## Summary
Add a diagnostic if a pure instance variable is accessed on a class object. For example
```py
class C:
instance_only: str
def __init__(self):
self.instance_only = "a"
# error: Attribute `instance_only` can only be accessed on instances, not on the class object `Literal[C]` itself.
C.instance_only
```
---------
Co-authored-by: David Peter <mail@david-peter.de>
## Summary
This PR is another step in preparing to detect syntax errors in the
parser. It introduces the new `per-file-target-version` top-level
configuration option, which holds a mapping of compiled glob patterns to
Python versions. I intend to use the
`LinterSettings::resolve_target_version` method here to pass to the
parser:
f50849aeef/crates/ruff_linter/src/linter.rs (L491-L493)
## Test Plan
I added two new CLI tests to show that the `per-file-target-version` is
respected in both the formatter and the linter.
## Summary
Add support for `@classmethod`s.
```py
class C:
@classmethod
def f(cls, x: int) -> str:
return "a"
reveal_type(C.f(1)) # revealed: str
```
## Test Plan
New Markdown tests
## Summary
* Existing example did not include RawSQL() call like it should
* Also clarify the example a bit to make it clearer that the code is not
secure
## Test Plan
N/A, only documentation updated
## Summary
I spotted a minor mistake in my descriptor protocol implementation where
`C.descriptor` would pass the meta type (`type`) of the type of `C`
(`Literal[C]`) as the owner argument to `__get__`, instead of passing
`Literal[C]` directly.
## Test Plan
New test.
Two related changes. For context:
1. We were maintaining two separate arenas of `Constraint`s in each
use-def map. One was used for narrowing constraints, and the other for
visibility constraints. The visibility constraint arena was interned,
ensuring that we always used the same ID for any particular
`Constraint`. The narrowing constraint arena was not interned.
2. The TDD code relies on _all_ TDD nodes being interned and reduced.
This is an important requirement for TDDs to be a canonical form, which
allows us to use a single int comparison to test for "always true/false"
and to compare two TDDs for equivalence. But we also need to support an
individual `Constraint` having multiple values in a TDD evaluation (e.g.
to handle a `while` condition having different values the first time
it's evaluated vs later times). Previously, we handled that by
introducing a "copy" number, which was only there as a disambiguator, to
allow an interned, deduplicated constraint ID to appear in the TDD
formula multiple times.
A better way to handle (2) is to not intern the constraints in the
visibility constraint arena! The caller now gets to decide: if they add
a `Constraint` to the arena more than once, they get distinct
`ScopedConstraintId`s — which the TDD code will treat as distinct
variables, allowing them to take on different values in the ternary
function.
With that in place, we can then consolidate on a single (non-interned)
arena, which is shared for both narrowing and visibility constraints.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
## Summary
Resolves 3/4 requests in #16217:
- ✅ Remove not special methods: `__cmp__`, `__div__`, `__nonzero__`, and
`__unicode__`.
- ✅ Add special methods: `__next__`, `__buffer__`, `__class_getitem__`,
`__mro_entries__`, `__release_buffer__`, and `__subclasshook__`.
- ✅ Support positional-only arguments.
- ❌ Add support for module functions `__dir__` and `__getattr__`. As
mentioned in the issue the check is scoped for methods rather than
module functions. I am hesitant to expand the scope of this check
without a discussion.
## Test Plan
- Manually confirmed each example file from the issue functioned as
expected.
- Ran cargo nextest to ensure `unexpected_special_method_signature` test
still passed.
Fixes#16217.
## Summary
This is just a small refactor to move workspace related structs and impl
out from `server.rs` where `Server` is defined and into a new
`workspace.rs`.
## Summary
This PR achieves the following:
* Add support for checking method calls, and inferring return types from
method calls. For example:
```py
reveal_type("abcde".find("abc")) # revealed: int
reveal_type("foo".encode(encoding="utf-8")) # revealed: bytes
"abcde".find(123) # error: [invalid-argument-type]
class C:
def f(self) -> int:
pass
reveal_type(C.f) # revealed: <function `f`>
reveal_type(C().f) # revealed: <bound method: `f` of `C`>
C.f() # error: [missing-argument]
reveal_type(C().f()) # revealed: int
```
* Implement the descriptor protocol, i.e. properly call the `__get__`
method when a descriptor object is accessed through a class object or an
instance of a class. For example:
```py
from typing import Literal
class Ten:
def __get__(self, instance: object, owner: type | None = None) ->
Literal[10]:
return 10
class C:
ten: Ten = Ten()
reveal_type(C.ten) # revealed: Literal[10]
reveal_type(C().ten) # revealed: Literal[10]
```
* Add support for member lookup on intersection types.
* Support type inference for `inspect.getattr_static(obj, attr)` calls.
This was mostly used as a debugging tool during development, but seems
more generally useful. It can be used to bypass the descriptor protocol.
For the example above:
```py
from inspect import getattr_static
reveal_type(getattr_static(C, "ten")) # revealed: Ten
```
* Add a new `Type::Callable(…)` variant with the following sub-variants:
* `Type::Callable(CallableType::BoundMethod(…))` — represents bound
method objects, e.g. `C().f` above
* `Type::Callable(CallableType::MethodWrapperDunderGet(…))` — represents
`f.__get__` where `f` is a function
* `Type::Callable(WrapperDescriptorDunderGet)` — represents
`FunctionType.__get__`
* Add new known classes:
* `types.MethodType`
* `types.MethodWrapperType`
* `types.WrapperDescriptorType`
* `builtins.range`
## Performance analysis
On this branch, we do more work. We need to do more call checking, since
we now check all method calls. We also need to do ~twice as many member
lookups, because we need to check if a `__get__` attribute exists on
accessed members.
A brief analysis on `tomllib` shows that we now call `Type::call` 1780
times, compared to 612 calls before.
## Limitations
* Data descriptors are not yet supported, i.e. we do not infer correct
types for descriptor attribute accesses in `Store` context and do not
check writes to descriptor attributes. I felt like this was something
that could be split out as a follow-up without risking a major
architectural change.
* We currently distinguish between `Type::member` (with descriptor
protocol) and `Type::static_member` (without descriptor protocol). The
former corresponds to `obj.attr`, the latter corresponds to
`getattr_static(obj, "attr")`. However, to model some details correctly,
we would also need to distinguish between a static member lookup *with*
and *without* instance variables. The lookup without instance variables
corresponds to `find_name_in_mro`
[here](https://docs.python.org/3/howto/descriptor.html#invocation-from-an-instance).
We currently approximate both using `member_static`, which leads to two
open TODOs. Changing this would be a larger refactoring of
`Type::own_instance_member`, so I chose to leave it out of this PR.
## Test Plan
* New `call/methods.md` test suite for method calls
* New tests in `descriptor_protocol.md`
* New `call/getattr_static.md` test suite for `inspect.getattr_static`
* Various updated tests
## Summary
This avoids looking up `__bool__` on class `bool` for every
`Type::Instance(bool).bool()` call. 1% performance win on cold cache, 4%
win on incremental performance.
This updates the `SymbolBindings` and `SymbolDeclarations` types to use
a single smallvec of live bindings/declarations, instead of splitting
that out into separate containers for each field.
I'm seeing an 11-13% `cargo bench` performance improvement with this
locally (for both cold and incremental). I'm interested to see if
Codspeed agrees!
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
A minor cleanup that breaks up a `HashMap` of an enum into separate
`HashMap`s for each variant. (These separate fields were already how
this cache was being described in the big comment at the top of the
file!)
This is a small tweak to avoid adding the callable `Type` on the error
value itself. Namely, it's always available regardless of the error, and
it's easy to pass it down explicitly to the diagnostic generating code.
It's likely that the other `CallBindingError` variants will also want
the callable `Type` to improve diagnostics too. This way, we don't have
to duplicate the `Type` on each variant. It's just available to all of
them.
Ref https://github.com/astral-sh/ruff/pull/16239#discussion_r1962352646
## Summary
Follow up on the discussion
[here](https://github.com/astral-sh/ruff/pull/16121#discussion_r1962973298).
Replace builtin classes with custom placeholder names, which should
hopefully make the tests a bit easier to understand.
I carefully renamed things one after the other, to make sure that there
is no functional change in the tests.
## Summary
This PR should help in
https://github.com/astral-sh/ruff-vscode/issues/676.
There are two issues that this is trying to fix all related to the way
shutdown should happen as per the protocol:
1. After the server handled the [shutdown
request](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#shutdown)
and while waiting for the exit notification:
> If a server receives requests after a shutdown request those requests
should error with `InvalidRequest`.
But, we raised an error and exited. This PR fixes it by entering a loop
which responds to any request during this period with `InvalidRequest`
2. If the server received an [exit
notification](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#exit)
but the shutdown request was never received, the server handled that by
logging and exiting with success but as per the spec:
> The server should exit with success code 0 if the shutdown request has
been received before; otherwise with error code 1.
So, this PR fixes that as well by raising an error in this case.
## Test Plan
I'm not sure how to go about testing this without using a mock server.
## Summary
This is part of the preparation for detecting syntax errors in the
parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested
in [this
comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509),
I started working on a `ParseOptions` struct that could be stored in the
parser. For this initial refactor, I only made it hold the existing
`Mode` option, but for syntax errors, we will also need it to have a
`PythonVersion`. For that use case, I'm picturing something like a
`ParseOptions::with_python_version` method, so you can extend the
current calls to something like
```rust
ParseOptions::from(mode).with_python_version(settings.target_version)
```
But I thought it was worth adding `ParseOptions` alone without changing
any other behavior first.
Most of the diff is just updating call sites taking `Mode` to take
`ParseOptions::from(Mode)` or those taking `PySourceType`s to take
`ParseOptions::from(PySourceType)`. The interesting changes are in the
new `parser/options.rs` file and smaller parts of `parser/mod.rs` and
`ruff_python_parser/src/lib.rs`.
## Test Plan
Existing tests, this should not change any behavior.
We now resolve references in "eager" scopes correctly — using the
bindings and declarations that are visible at the point where the eager
scope is created, not the "public" type of the symbol (typically the
bindings visible at the end of the scope).
---------
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>