Commit graph

6592 commits

Author SHA1 Message Date
Brent Westbrook
78806361fd
Start detecting version-related syntax errors in the parser (#16090)
## Summary

This PR builds on the changes in #16220 to pass a target Python version
to the parser. It also adds the `Parser::unsupported_syntax_errors` field, which
collects version-related syntax errors while parsing. These syntax
errors are then turned into `Message`s in ruff (in preview mode).

This PR only detects one syntax error (`match` statement before Python
3.10), but it has been pretty quick to extend to several other simple
errors (see #16308 for example).
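
As a rough analogy only (this is CPython's standard library, not Ruff's parser), `ast.parse` accepts a `feature_version` argument that performs a similar best-effort check. The helper name `parses_for_version` and the sample source are made up for illustration, and the snippet assumes it runs on Python 3.10 or newer.

```python
import ast

# A tiny program that uses structural pattern matching (new in Python 3.10).
MATCH_SOURCE = "match command:\n    case 'go':\n        pass\n"


def parses_for_version(source: str, version: tuple) -> bool:
    """Return True if `source` parses under the grammar of `version` (major, minor)."""
    try:
        ast.parse(source, feature_version=version)
    except SyntaxError:
        return False
    return True


print(parses_for_version(MATCH_SOURCE, (3, 9)))   # match is rejected for a 3.9 target
print(parses_for_version(MATCH_SOURCE, (3, 10)))  # and accepted for a 3.10 target
```

The point of the comparison is that the same source is valid or invalid depending only on the target version handed to the parser.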

## Test Plan

The current tests are CLI tests in the linter crate, but these could be
supplemented with inline parser tests after #16357.

I also tested the display of these syntax errors in VS Code:


![image](https://github.com/user-attachments/assets/062b4441-740e-46c3-887c-a954049ef26e)

![image](https://github.com/user-attachments/assets/101f55b8-146c-4d59-b6b0-922f19bcd0fa)

---------

Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2025-02-25 23:03:48 -05:00
Douglas Creager
b39a4ad01d
[red-knot] Rename constraint to predicate (#16382)
In https://github.com/astral-sh/ruff/pull/16306#discussion_r1966290700,
@carljm pointed out that #16306 introduced a terminology problem, with
too many things called a "constraint". This is a follow-up PR that
renames `Constraint` to `Predicate` to hopefully clear things up a bit.
So now we have that:

- a _predicate_ is a Python expression that might influence type
inference
- a _narrowing constraint_ is a list of predicates that constrain the
type of a binding that is visible at a use
- a _visibility constraint_ is a ternary formula of predicates that
defines whether a binding is visible or a statement is reachable

This is a pure renaming, with no behavioral changes.
2025-02-25 14:52:40 -05:00
David Peter
86b01d2d3c
[red-knot] Correct modeling of dunder calls (#16368)
## Summary

Model dunder-calls correctly (and in one single place), by implementing
this behavior (using `__getitem__` as an example).

```py
def getitem_desugared(obj: object, key: object) -> object:
    getitem_callable = find_in_mro(type(obj), "__getitem__")
    if hasattr(getitem_callable, "__get__"):
        getitem_callable = getitem_callable.__get__(obj, type(obj))

    return getitem_callable(key)
```

See the new `calls/dunder.md` test suite for more information. The new
behavior also needs far fewer lines of code (the diff is positive due
to new tests).
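
The desugaring can be made runnable in plain Python. In this sketch `find_in_mro` is a hypothetical helper written for illustration (it is not red-knot's implementation); it returns the first class-level definition along the MRO and deliberately ignores the instance's own `__dict__`.

```python
def find_in_mro(cls: type, name: str):
    """Hypothetical helper: first class-level definition of `name` along the MRO."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    raise AttributeError(name)


def getitem_desugared(obj, key):
    # Look the dunder up on the type, never on the instance.
    getitem_callable = find_in_mro(type(obj), "__getitem__")
    # Bind it through the descriptor protocol if it is a descriptor.
    if hasattr(getitem_callable, "__get__"):
        getitem_callable = getitem_callable.__get__(obj, type(obj))
    return getitem_callable(key)


class Weird:
    def __getitem__(self, key):
        return f"class-level:{key}"


w = Weird()
# An instance attribute with the same name does not affect `w[...]`.
w.__getitem__ = lambda key: "instance-level"

print(getitem_desugared(w, 1), w[1])
print(getitem_desugared([10, 20, 30], 1))
```

The `Weird` instance shows why the lookup must start at `type(obj)`: subscripting skips the instance attribute entirely.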

## Test Plan

New tests; fix TODOs in existing tests.
2025-02-25 20:38:15 +01:00
David Peter
f88328eedd
[red-knot] Handle possibly-unbound instance members (#16363)
## Summary

Adds support for possibly-unbound/undeclared instance members.

## Test Plan

New MD tests.
2025-02-25 20:00:38 +01:00
Douglas Creager
fa76f6cbb2
[red-knot] Use arena-allocated association lists for narrowing constraints (#16306)
This PR adds an implementation of [association
lists](https://en.wikipedia.org/wiki/Association_list), and uses them to
replace the previous `BitSet`/`SmallVec` representation for narrowing
constraints.

An association list is a linked list of key/value pairs. We additionally
guarantee that the elements of an association list are sorted (by their
keys), and that they do not contain any entries with duplicate keys.

Association lists have fallen out of favor in recent decades, since you
often need operations that are inefficient on them. In particular,
looking up a random element by index is O(n), just like a linked list;
and looking up an element by key is also O(n), since you must do a
linear scan of the list to find the matching element. Luckily we don't
need either of those operations for narrowing constraints!

The typical implementation also suffers from poor cache locality and
high memory allocation overhead, since individual list cells are
typically allocated separately from the heap. We solve that last problem
by storing the cells of an association list in an `IndexVec` arena.
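
A minimal Python sketch of the idea, assuming integer keys and string values. The `ListArena` class and its method names are hypothetical stand-ins; the PR itself uses a Rust `IndexVec`, and its actual API is not shown here.

```python
from typing import Iterator, List, Optional, Tuple

# One cell: (key, value, index of the next cell or None at the end).
Cell = Tuple[int, str, Optional[int]]


class ListArena:
    """All cells live in one contiguous list; a list is just its head index."""

    def __init__(self) -> None:
        self.cells: List[Cell] = []

    def _push(self, key: int, value: str, rest: Optional[int]) -> int:
        self.cells.append((key, value, rest))
        return len(self.cells) - 1

    def insert(self, head: Optional[int], key: int, value: str) -> int:
        """Return a new list with key -> value added; keys stay sorted and unique."""
        if head is None:
            return self._push(key, value, None)
        k, v, rest = self.cells[head]
        if key < k:
            return self._push(key, value, head)  # old list is shared as the tail
        if key == k:
            return self._push(key, value, rest)  # replace the existing entry
        return self._push(k, v, self.insert(rest, key, value))

    def items(self, head: Optional[int]) -> Iterator[Tuple[int, str]]:
        while head is not None:
            key, value, head = self.cells[head]
            yield key, value


arena = ListArena()
a = arena.insert(None, 5, "five")
b = arena.insert(a, 2, "two")
c = arena.insert(b, 9, "nine")
print(list(arena.items(c)))  # [(2, 'two'), (5, 'five'), (9, 'nine')]
print(list(arena.items(b)))  # [(2, 'two'), (5, 'five')]
```

Because cells are never mutated, older lists such as `b` remain valid after later inserts, and every cell sits in the same backing store rather than in its own heap allocation.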

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2025-02-25 10:58:56 -05:00
Alex Waygood
5c007db7e2
[red-knot] Rewrite Type::try_iterate() to improve type inference and diagnostic messages (#16321) 2025-02-25 14:02:03 +00:00
Muspi Merol
a1a536b2c5
Normalize inconsistent markdown headings in docstrings (#16364)
I am working on a project that uses ruff linters' docs to generate a
fine-tuning dataset for LLMs.

To achieve this, I first ran the command `ruff rule --all
--output-format json` to retrieve all the rules. Then, I parsed the
explanation field to get these 3 consistent sections:

- `Why is this bad?`
- `What it does`
- `Example`

However, during the initial processing, I noticed that the markdown
headings are not that consistent. For instance:

- In most cases, `Use instead` appears as a normal paragraph within the
`Example` section, but in the file
`crates/ruff_linter/src/rules/flake8_bandit/rules/django_extra.rs` it is
a level-2 heading
- The heading "What it does**?**" is used in some places, while others
consistently use "What it does"
- There are 831 `Example` headings and 65 `Examples`. But all of them
only have one example case

This PR normalized these across all rules.
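
For readers consuming the docs programmatically, the same normalization can be sketched on the consumer side. This is only an illustration: the replacement table covers just the two heading variants listed above, and `normalize_headings` is a hypothetical helper, not part of Ruff.

```python
import re

# Assumed canonical spellings, based on the inconsistencies listed above.
REPLACEMENTS = {
    "What it does?": "What it does",
    "Examples": "Example",
}

# A markdown ATX heading: one or more '#', whitespace, then the title.
HEADING = re.compile(r"^(#+)\s+(.*?)\s*$")


def normalize_headings(markdown: str) -> str:
    """Rewrite known heading variants to their canonical form."""
    out = []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            hashes, title = match.groups()
            line = f"{hashes} {REPLACEMENTS.get(title, title)}"
        out.append(line)
    return "\n".join(out)


doc = "## What it does?\nChecks something.\n\n## Examples\nsome code"
print(normalize_headings(doc))
```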

## Test Plan

CI passes.
2025-02-25 15:42:55 +05:30
David Peter
aac79e453a
[red-knot] Better diagnostics for method calls (#16362)
## Summary

Add better error messages and additional spans for method calls. Can be
reviewed commit-by-commit.

before:

```
error: lint:invalid-argument-type
 --> /home/shark/playground/test.py:6:10
  |
5 | c = C()
6 | c.square("hello")  # error: [invalid-argument-type]
  |          ^^^^^^^ Object of type `Literal["hello"]` cannot be assigned to parameter 2 (`x`); expected type `int`
7 |
8 | # import inspect
  |
```

after:

```
error: lint:invalid-argument-type
 --> /home/shark/playground/test.py:6:10
  |
5 | c = C()
6 | c.square("hello")  # error: [invalid-argument-type]
  |          ^^^^^^^ Object of type `Literal["hello"]` cannot be assigned to parameter 2 (`x`) of bound method `square`; expected type `int`
7 |
8 | # import inspect
  |
 ::: /home/shark/playground/test.py:2:22
  |
1 | class C:
2 |     def square(self, x: int) -> int:
  |                      ------ info: parameter declared in function definition here
3 |         return x * x
  |
```

## Test Plan

New snapshot test
2025-02-25 09:58:08 +01:00
Micha Reiser
fd7b3c83ad
[red-knot] Add argfile and windows glob path support (#16353) 2025-02-25 08:43:13 +01:00
Micha Reiser
d895ee0014
[red-knot] Handle pipe-errors gracefully (#16354) 2025-02-25 08:42:52 +01:00
Micha Reiser
4732c58829
Rename venv-path to python (#16347) 2025-02-24 19:41:06 +01:00
Alex Waygood
45bae29a4b
[red-knot] Fixup some formatting in infer.rs (#16348) 2025-02-24 14:44:49 +00:00
Alex Waygood
7059f4249b
[red-knot] Restrict visibility of more things in class.rs (#16346) 2025-02-24 14:30:56 +00:00
Mike Perlov
68991d09a8
[red-knot] Add diagnostic for class-object access to pure instance variables (#16036)
## Summary

Add a diagnostic if a pure instance variable is accessed on a class object. For example:

```py
class C:
    instance_only: str

    def __init__(self):
        self.instance_only = "a"

# error: Attribute `instance_only` can only be accessed on instances, not on the class object `Literal[C]` itself.
C.instance_only
```
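
The diagnostic mirrors what happens at runtime: a bare annotation in the class body creates no class attribute, so the access fails. A small check, with `class_access_fails` as a hypothetical helper name:

```python
class C:
    instance_only: str  # annotation only; nothing is stored on the class

    def __init__(self):
        self.instance_only = "a"


def class_access_fails() -> bool:
    """Return True if reading the attribute on the class raises AttributeError."""
    try:
        C.instance_only
    except AttributeError:
        return True
    return False


print(C().instance_only)     # works on an instance
print(class_access_fails())  # fails on the class object
```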


---------

Co-authored-by: David Peter <mail@david-peter.de>
2025-02-24 15:17:16 +01:00
Brent Westbrook
e7a6c19e3a
Add per-file-target-version option (#16257)
## Summary

This PR is another step in preparing to detect syntax errors in the
parser. It introduces the new `per-file-target-version` top-level
configuration option, which holds a mapping of compiled glob patterns to
Python versions. I intend to use the
`LinterSettings::resolve_target_version` method here to pass to the
parser:


f50849aeef/crates/ruff_linter/src/linter.rs (L491-L493)
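
Conceptually, resolving a target version from such a mapping could look like the sketch below. This is not Ruff's implementation: the function body, the first-match-wins rule, and the use of `fnmatch` (whose `*` also matches path separators) are all assumptions made for illustration.

```python
from fnmatch import fnmatch
from typing import Dict, Tuple

Version = Tuple[int, int]


def resolve_target_version(
    path: str, per_file: Dict[str, Version], default: Version
) -> Version:
    """Return the version of the first matching pattern, else the default."""
    for pattern, version in per_file.items():
        if fnmatch(path, pattern):
            return version
    return default


# Hypothetical configuration: glob pattern -> Python version.
per_file = {"scripts/*.py": (3, 8), "tests/*": (3, 12)}

print(resolve_target_version("scripts/tool.py", per_file, (3, 11)))
print(resolve_target_version("src/app.py", per_file, (3, 11)))
```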

## Test Plan

I added two new CLI tests to show that the `per-file-target-version` is
respected in both the formatter and the linter.
2025-02-24 08:47:13 -05:00
Vasco Schiavo
42a5f5ef6a
[PLW1507] Mark fix unsafe (#16343) 2025-02-24 13:42:44 +01:00
Alex Waygood
5bac4f6bd4
[red-knot] Add a test to ensure that KnownClass::try_from_file_and_name() is kept up to date (#16326) 2025-02-24 12:14:20 +00:00
Micha Reiser
320a3c68ae
Extract class and instance types (#16337) 2025-02-24 11:36:20 +00:00
David Peter
141ba253da
[red-knot] Add support for @classmethods (#16305)
## Summary

Add support for `@classmethod`s.

```py
class C:
    @classmethod
    def f(cls, x: int) -> str:
        return "a"

reveal_type(C.f(1))  # revealed: str
```

## Test Plan

New Markdown tests
2025-02-24 09:55:34 +01:00
Micha Reiser
5eaf225fc3
Update Salsa part 1 (#16340) 2025-02-24 09:35:21 +01:00
Vasco Schiavo
b312b53c2e
[flake8-pyi] Mark PYI030 fix unsafe when comments are deleted (#16322) 2025-02-23 21:22:14 +00:00
InSync
c814745643
[flake8-self] Ignore attribute accesses on instance-like variables (SLF001) (#16149) 2025-02-23 10:00:49 +00:00
Ari Pollak
aa88f2dbe5
Fix example for S611 (#16316)
## Summary

* Existing example did not include RawSQL() call like it should
* Also clarify the example a bit to make it clearer that the code is not
secure
## Test Plan

N/A, only documentation updated
2025-02-22 14:15:29 -05:00
Alex Waygood
64effa4aea
[red-knot] Add a regression test for recent improvement to TypeInferenceBuilder::infer_name_load() (#16310) 2025-02-21 22:28:42 +00:00
Alex Waygood
224a36f5f3
Teach red-knot that type(x) is the same as x.__class__ (#16301) 2025-02-21 21:05:48 +00:00
Alex Waygood
5347abc766
[red-knot] Generalise special-casing for KnownClasses in Type::bool (#16300) 2025-02-21 20:46:36 +00:00
Micha Reiser
5fab97f1ef
[red-knot] Diagnostics for incorrect bool usages (#16238) 2025-02-21 19:26:05 +01:00
David Peter
3aa7ba31b1
[red-knot] Fix descriptor __get__ call on class objects (#16304)
## Summary

I spotted a minor mistake in my descriptor protocol implementation where
`C.descriptor` would pass the meta type (`type`) of the type of `C`
(`Literal[C]`) as the owner argument to `__get__`, instead of passing
`Literal[C]` directly.
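
The runtime behavior being modeled can be observed with a descriptor that simply returns its arguments. `RecordOwner` is a made-up class for this illustration.

```python
class RecordOwner:
    """Descriptor that reports which (instance, owner) pair it was called with."""

    def __get__(self, instance, owner=None):
        return (instance, owner)


class C:
    descriptor = RecordOwner()


# Access through the class: instance is None and owner is C itself,
# not type(C) (which would be `type`).
print(C.descriptor)

# Access through an instance: owner is still C.
c = C()
print(c.descriptor[0] is c, c.descriptor[1] is C)
```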

## Test Plan

New test.
2025-02-21 15:35:41 +01:00
Douglas Creager
4dae09ecff
[red-knot] Better handling of visibility constraint copies (#16276)
Two related changes.  For context:

1. We were maintaining two separate arenas of `Constraint`s in each
use-def map. One was used for narrowing constraints, and the other for
visibility constraints. The visibility constraint arena was interned,
ensuring that we always used the same ID for any particular
`Constraint`. The narrowing constraint arena was not interned.

2. The TDD code relies on _all_ TDD nodes being interned and reduced.
This is an important requirement for TDDs to be a canonical form, which
allows us to use a single int comparison to test for "always true/false"
and to compare two TDDs for equivalence. But we also need to support an
individual `Constraint` having multiple values in a TDD evaluation (e.g.
to handle a `while` condition having different values the first time
it's evaluated vs later times). Previously, we handled that by
introducing a "copy" number, which was only there as a disambiguator, to
allow an interned, deduplicated constraint ID to appear in the TDD
formula multiple times.

A better way to handle (2) is to not intern the constraints in the
visibility constraint arena! The caller now gets to decide: if they add
a `Constraint` to the arena more than once, they get distinct
`ScopedConstraintId`s — which the TDD code will treat as distinct
variables, allowing them to take on different values in the ternary
function.

With that in place, we can then consolidate on a single (non-interned)
arena, which is shared for both narrowing and visibility constraints.
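
The difference between the two arena styles can be sketched in a few lines. Strings stand in for `Constraint` values here, and both classes are hypothetical simplifications rather than the actual Rust types.

```python
class InternedArena:
    """Adding an equal item twice returns the same ID."""

    def __init__(self):
        self.items = []
        self.ids = {}

    def add(self, item):
        if item not in self.ids:
            self.ids[item] = len(self.items)
            self.items.append(item)
        return self.ids[item]


class PlainArena:
    """Every add returns a fresh ID, even for an equal item."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)
        return len(self.items) - 1


interned, plain = InternedArena(), PlainArena()
print(interned.add("x < 10"), interned.add("x < 10"))  # 0 0
print(plain.add("x < 10"), plain.add("x < 10"))        # 0 1
```

With the plain arena, a caller that wants the same `while` condition to count as two independent variables just adds it twice, so no separate "copy" number is needed.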

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2025-02-21 09:16:25 -05:00
Darius Carrier
b9b094869a
[pylint] Fix false positives, add missing methods, and support positional-only parameters (PLE0302) (#16263)
## Summary

Resolves 3/4 requests in #16217:

-  Remove methods that are not special: `__cmp__`, `__div__`, `__nonzero__`, and
`__unicode__`.
-  Add special methods: `__next__`, `__buffer__`, `__class_getitem__`,
`__mro_entries__`, `__release_buffer__`, and `__subclasshook__`.
-  Support positional-only arguments.
-  Add support for module functions `__dir__` and `__getattr__`. As
mentioned in the issue the check is scoped for methods rather than
module functions. I am hesitant to expand the scope of this check
without a discussion.

## Test Plan

- Manually confirmed each example file from the issue functioned as
expected.
- Ran cargo nextest to ensure `unexpected_special_method_signature` test
still passed.

Fixes #16217.
2025-02-21 08:38:51 -05:00
Alex Waygood
b3c5932fda
[red-knot] Restrict visibility of the module_type_symbols function (#16290) 2025-02-21 10:55:22 +00:00
Alex Waygood
fe3ae587ea
[red-knot] Fix subtle detail in where the types.ModuleType attribute lookup should happen in TypeInferenceBuilder::infer_name_load() (#16284) 2025-02-21 10:48:52 +00:00
Dhruv Manilawala
c2b9fa84f7
Refactor workspace logic into workspace.rs (#16295)
## Summary

This is just a small refactor to move workspace related structs and impl
out from `server.rs` where `Server` is defined and into a new
`workspace.rs`.
2025-02-21 08:37:29 +00:00
Victorien
793264db13
[ruff] Add more Pydantic models variants to the list of default copy semantics (RUF012) (#16291) 2025-02-21 08:28:13 +01:00
David Peter
d2e034adcd
[red-knot] Method calls and the descriptor protocol (#16121)
## Summary

This PR achieves the following:

* Add support for checking method calls, and inferring return types from
method calls. For example:
  ```py
  reveal_type("abcde".find("abc"))  # revealed: int
  reveal_type("foo".encode(encoding="utf-8"))  # revealed: bytes
  
  "abcde".find(123)  # error: [invalid-argument-type]
  
  class C:
      def f(self) -> int:
          pass
  
  reveal_type(C.f)  # revealed: <function `f`>
  reveal_type(C().f)  # revealed: <bound method: `f` of `C`>
  
  C.f()  # error: [missing-argument]
  reveal_type(C().f())  # revealed: int
  ```
* Implement the descriptor protocol, i.e. properly call the `__get__`
method when a descriptor object is accessed through a class object or an
instance of a class. For example:
  ```py
  from typing import Literal
  
  class Ten:
      def __get__(self, instance: object, owner: type | None = None) -> Literal[10]:
          return 10
  
  class C:
      ten: Ten = Ten()
  
  reveal_type(C.ten)  # revealed: Literal[10]
  reveal_type(C().ten)  # revealed: Literal[10]
  ```
* Add support for member lookup on intersection types.
* Support type inference for `inspect.getattr_static(obj, attr)` calls.
This was mostly used as a debugging tool during development, but seems
more generally useful. It can be used to bypass the descriptor protocol.
For the example above:
  ```py
  from inspect import getattr_static
  
  reveal_type(getattr_static(C, "ten"))  # revealed: Ten
  ```
* Add a new `Type::Callable(…)` variant with the following sub-variants:
  * `Type::Callable(CallableType::BoundMethod(…))`: represents bound
    method objects, e.g. `C().f` above
  * `Type::Callable(CallableType::MethodWrapperDunderGet(…))`: represents
    `f.__get__` where `f` is a function
  * `Type::Callable(WrapperDescriptorDunderGet)`: represents
    `FunctionType.__get__`
* Add new known classes:
  * `types.MethodType`
  * `types.MethodWrapperType`
  * `types.WrapperDescriptorType`
  * `builtins.range`
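
The runtime objects behind these variants and known classes can be inspected directly in CPython. The snippet below only uses the standard `types` module; the class `C` is the same toy class as in the examples above, with a body added so it can be called.

```python
import types


class C:
    def f(self) -> int:
        return 1


c = C()

print(isinstance(C.f, types.FunctionType))  # plain function on the class
print(isinstance(c.f, types.MethodType))    # bound method on the instance

# `f.__get__` where `f` is a function is a method-wrapper object.
print(isinstance(C.f.__get__, types.MethodWrapperType))

# `FunctionType.__get__` itself is a wrapper descriptor (slot wrapper).
print(isinstance(types.FunctionType.__get__, types.WrapperDescriptorType))

# Binding by hand gives the same result as attribute access on the instance.
bound = C.f.__get__(c, C)
print(bound() == c.f())
```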

## Performance analysis

On this branch, we do more work. We need to do more call checking, since
we now check all method calls. We also need to do ~twice as many member
lookups, because we need to check if a `__get__` attribute exists on
accessed members.

A brief analysis on `tomllib` shows that we now call `Type::call` 1780
times, compared to 612 calls before.

## Limitations

* Data descriptors are not yet supported, i.e. we do not infer correct
types for descriptor attribute accesses in `Store` context and do not
check writes to descriptor attributes. I felt like this was something
that could be split out as a follow-up without risking a major
architectural change.
* We currently distinguish between `Type::member` (with descriptor
protocol) and `Type::static_member` (without descriptor protocol). The
former corresponds to `obj.attr`, the latter corresponds to
`getattr_static(obj, "attr")`. However, to model some details correctly,
we would also need to distinguish between a static member lookup *with*
and *without* instance variables. The lookup without instance variables
corresponds to `find_name_in_mro`
[here](https://docs.python.org/3/howto/descriptor.html#invocation-from-an-instance).
We currently approximate both using `member_static`, which leads to two
open TODOs. Changing this would be a larger refactoring of
`Type::own_instance_member`, so I chose to leave it out of this PR.
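
At runtime, the difference between the two kinds of lookup looks like this. The snippet uses only `inspect.getattr_static` from the standard library; `Ten` and `C` follow the descriptor example above, without type annotations.

```python
from inspect import getattr_static


class Ten:
    def __get__(self, instance, owner=None):
        return 10


class C:
    ten = Ten()


# Normal attribute access invokes the descriptor protocol.
print(C.ten, C().ten)

# getattr_static returns the descriptor object itself, without calling __get__.
print(type(getattr_static(C, "ten")).__name__)
print(type(getattr_static(C(), "ten")).__name__)
```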

## Test Plan

* New `call/methods.md` test suite for method calls
* New tests in `descriptor_protocol.md`
* New `call/getattr_static.md` test suite for `inspect.getattr_static`
* Various updated tests
2025-02-20 23:22:26 +01:00
David Peter
f62e5406f2
[red-knot] Short-circuit bool calls on bool (#16292)
## Summary

This avoids looking up `__bool__` on class `bool` for every
`Type::Instance(bool).bool()` call. 1% performance win on cold cache, 4%
win on incremental performance.
2025-02-20 23:06:11 +01:00
Douglas Creager
1be4394155
[red-knot] Consolidate SymbolBindings/SymbolDeclarations state (#16286)
This updates the `SymbolBindings` and `SymbolDeclarations` types to use
a single smallvec of live bindings/declarations, instead of splitting
that out into separate containers for each field.

I'm seeing an 11-13% `cargo bench` performance improvement with this
locally (for both cold and incremental). I'm interested to see if
Codspeed agrees!

---------

Co-authored-by: Carl Meyer <carl@astral.sh>
2025-02-20 16:20:23 -05:00
Micha Reiser
470f852f04
[red-knot] Prevent cross-module query dependencies in own_instance_member (#16268) 2025-02-20 18:46:45 +01:00
Douglas Creager
529950fba1
[red-knot] Separate definitions_by_definition into separate fields (#16277)
A minor cleanup that breaks up a `HashMap` of an enum into separate
`HashMap`s for each variant. (These separate fields were already how
this cache was being described in the big comment at the top of the
file!)
2025-02-20 09:47:01 -05:00
Andrew Gallant
205222ca6b
red_knot_python_semantic: avoid adding callable_ty to CallBindingError
This is a small tweak to avoid adding the callable `Type` on the error
value itself. Namely, it's always available regardless of the error, and
it's easy to pass it down explicitly to the diagnostic generating code.

It's likely that the other `CallBindingError` variants will also want
the callable `Type` to improve diagnostics too. This way, we don't have
to duplicate the `Type` on each variant. It's just available to all of
them.

Ref https://github.com/astral-sh/ruff/pull/16239#discussion_r1962352646
2025-02-20 08:18:59 -05:00
Brent Westbrook
54fccb3ee2
Bump version to 0.9.7 (#16271) 2025-02-20 08:12:11 -05:00
David Peter
8198668fc3
[red-knot] MDTest: Use custom class names instead of builtins (#16269)
## Summary

Follow up on the discussion
[here](https://github.com/astral-sh/ruff/pull/16121#discussion_r1962973298).
Replace builtin classes with custom placeholder names, which should
hopefully make the tests a bit easier to understand.

I carefully renamed things one after the other, to make sure that there
is no functional change in the tests.
2025-02-20 12:25:55 +00:00
Dhruv Manilawala
fc6b03c8da
Handle requests received after shutdown message (#16262)
## Summary

This PR should help in
https://github.com/astral-sh/ruff-vscode/issues/676.

There are two issues that this PR tries to fix, both related to the way
shutdown should happen as per the protocol:

1. After the server handled the [shutdown
request](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#shutdown)
and while waiting for the exit notification:

> If a server receives requests after a shutdown request those requests
should error with `InvalidRequest`.

But we raised an error and exited. This PR fixes it by entering a loop
which responds to any request during this period with `InvalidRequest`.

2. If the server received an [exit
notification](https://microsoft.github.io/language-server-protocol/specifications/lsp/3.17/specification/#exit)
but the shutdown request was never received, the server handled that by
logging and exiting with success, but as per the spec:

> The server should exit with success code 0 if the shutdown request has
been received before; otherwise with error code 1.

So, this PR fixes that as well by raising an error in this case.
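
The two rules can be sketched as a tiny message loop. This is an illustrative model and not the server's actual code: messages are plain dictionaries, `serve` is a hypothetical function, and returning exit code 1 when the stream ends without an `exit` notification is an assumption.

```python
INVALID_REQUEST = -32600  # JSON-RPC / LSP error code for InvalidRequest


def serve(messages):
    """Process messages in order; return (responses, exit_code)."""
    shutdown_received = False
    responses = []
    for msg in messages:
        method = msg["method"]
        if method == "exit":
            # Exit code 0 only if a shutdown request came first.
            return responses, 0 if shutdown_received else 1
        if shutdown_received:
            # Requests carry an id; notifications do not and are ignored.
            if "id" in msg:
                responses.append(
                    {"id": msg["id"], "error": {"code": INVALID_REQUEST}}
                )
            continue
        if method == "shutdown":
            shutdown_received = True
            responses.append({"id": msg["id"], "result": None})
        # Normal request handling is omitted from this sketch.
    return responses, 1  # assumption: stream ended without `exit`


clean = [
    {"id": 1, "method": "shutdown"},
    {"id": 2, "method": "textDocument/hover"},
    {"method": "exit"},
]
print(serve(clean))
print(serve([{"method": "exit"}]))
```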

## Test Plan

I'm not sure how to go about testing this without using a mock server.
2025-02-20 11:10:42 +00:00
Micha Reiser
fb09d63e55
[red-knot] Prefix Type::call and dunder_call with try (#16261) 2025-02-20 09:05:04 +00:00
Alex Waygood
16d0625dfb
Improve internal docs for various string-node APIs (#16256) 2025-02-19 16:13:45 +00:00
Alex Waygood
25920fe489
Rename ExprStringLiteral::as_unconcatenated_string() to ExprStringLiteral::as_single_part_string() (#16253) 2025-02-19 16:06:57 +00:00
Brent Westbrook
97d0659ce3
Pass ParserOptions to the parser (#16220)
## Summary

This is part of the preparation for detecting syntax errors in the
parser from https://github.com/astral-sh/ruff/pull/16090/. As suggested
in [this
comment](https://github.com/astral-sh/ruff/pull/16090/#discussion_r1953084509),
I started working on a `ParseOptions` struct that could be stored in the
parser. For this initial refactor, I only made it hold the existing
`Mode` option, but for syntax errors, we will also need it to have a
`PythonVersion`. For that use case, I'm picturing something like a
`ParseOptions::with_python_version` method, so you can extend the
current calls to something like

```rust
ParseOptions::from(mode).with_python_version(settings.target_version)
```

But I thought it was worth adding `ParseOptions` alone without changing
any other behavior first.

Most of the diff is just updating call sites taking `Mode` to take
`ParseOptions::from(Mode)` or those taking `PySourceType`s to take
`ParseOptions::from(PySourceType)`. The interesting changes are in the
new `parser/options.rs` file and smaller parts of `parser/mod.rs` and
`ruff_python_parser/src/lib.rs`.

## Test Plan

Existing tests, this should not change any behavior.
2025-02-19 10:50:50 -05:00
Douglas Creager
cfc6941d5c
[red-knot] Resolve references in eager nested scopes eagerly (#16079)
We now resolve references in "eager" scopes correctly — using the
bindings and declarations that are visible at the point where the eager
scope is created, not the "public" type of the symbol (typically the
bindings visible at the end of the scope).
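
The runtime behavior that motivates this distinction is easy to demonstrate. Class bodies and comprehensions run at the point where they appear, while function bodies run only when called. The names below are made up for the example.

```python
x = 1


class Eager:
    seen = x  # class body runs immediately, so it sees x == 1


def lazy():
    return x  # body runs at call time, so it sees the latest x


snapshot = [x for _ in range(1)]  # comprehension is also evaluated right here

x = 2

print(Eager.seen)  # 1
print(snapshot)    # [1]
print(lazy())      # 2
```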

---------

Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2025-02-19 10:22:30 -05:00
Alex Waygood
f50849aeef
Add text_len() methods to more *Prefix enums in ruff_python_ast (#16254) 2025-02-19 14:47:07 +00:00
Micha Reiser
55ea09401a
[red-knot] Allow any Ranged argument for report_lint and report_diagnostic (#16252) 2025-02-19 14:34:56 +01:00