## Summary
This changeset adds support for precise type-inference and
boundness-handling of definitions inside control-flow branches with
statically-known conditions, i.e. test-expressions whose truthiness we
can unambiguously infer as *always false* or *always true*.
This branch also includes:
- `sys.platform` support
- statically-known branches handling for Boolean expressions and while
loops
- new `target-version` requirements in some Markdown tests which were
now required due to the understanding of `sys.version_info` branches.
closes#12700closes#15034
## Performance
### `tomllib`, -7%, needs to resolve one additional module (sys)
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./red_knot_main --project /home/shark/tomllib` | 22.2 ± 1.3 | 19.1 |
25.6 | 1.00 |
| `./red_knot_feature --project /home/shark/tomllib` | 23.8 ± 1.6 | 20.8
| 28.6 | 1.07 ± 0.09 |
### `black`, -6%
| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `./red_knot_main --project /home/shark/black` | 129.3 ± 5.1 | 119.0 |
137.8 | 1.00 |
| `./red_knot_feature --project /home/shark/black` | 136.5 ± 6.8 | 123.8
| 147.5 | 1.06 ± 0.07 |
## Test Plan
- New Markdown tests for the main feature in
`statically-known-branches.md`
- New Markdown tests for `sys.platform`
- Adapted tests for `EllipsisType`, `Never`, etc
## Summary
This PR fixes an issue where Ruff's `D403` rule
(`first-word-uncapitalized`) was not detecting some single-word edge
cases that are picked up by `pydocstyle`.
The change involves extracting the first word of the docstring by
identifying the first whitespace character. This is consistent with
`pydocstyle` which uses `.split()` - see
8d0cdfc93e/src/pydocstyle/checker.py (L581C13-L581C64)
## Example
Here is a playground example -
https://play.ruff.rs/eab9ea59-92cf-4e44-b1a9-b54b7f69b178
```py
def example1():
"""foo"""
def example2():
"""foo
Hello world!
"""
def example3():
"""foo bar
Hello world!
"""
def example4():
"""
foo
"""
def example5():
"""
foo bar
"""
```
`pydocstyle` detects all five cases:
```bash
$ pydocstyle test.py --select D403
dev/test.py:2 in public function `example1`:
D403: First word of the first line should be properly capitalized ('Foo', not 'foo')
dev/test.py:5 in public function `example2`:
D403: First word of the first line should be properly capitalized ('Foo', not 'foo')
dev/test.py:11 in public function `example3`:
D403: First word of the first line should be properly capitalized ('Foo', not 'foo')
dev/test.py:17 in public function `example4`:
D403: First word of the first line should be properly capitalized ('Foo', not 'foo')
dev/test.py:22 in public function `example5`:
D403: First word of the first line should be properly capitalized ('Foo', not 'foo')
```
Ruff (`0.8.4`) fails to catch example2 and example4.
## Test Plan
* Added two new test cases to cover the previously missed single-word
docstring cases.
## Summary
Refer:
https://github.com/astral-sh/ruff/issues/13773#issuecomment-2548020368
This PR adds support for unpacking union types.
Unpacking a union type requires us to first distribute the types for all
the targets that are involved in an unpacking. For example, if there are
two targets and a union type that needs to be unpacked, each target will
get a type from each element in the union type.
For example, if the type is `tuple[int, int] | tuple[int, str]` and the
target has two elements `(a, b)`, then
* The type of `a` will be a union of `int` and `int` which are at index
0 in the first and second tuple respectively which resolves to an `int`.
* Similarly, the type of `b` will be a union of `int` and `str` which
are at index 1 in the first and second tuple respectively which will be
`int | str`.
### Refactors
There are couple of refactors that are added in this PR:
* Add a `debug_assertion` to validate that the unpack target is a list
or a tuple
* Add a separate method to handle starred expression
## Test Plan
Update `unpacking.md` with additional test cases that uses union types.
This is done using parameter type hints style.
## Summary
This PR adds initial support for `type: ignore`. It doesn't do anything
fancy yet like:
* Detecting invalid type ignore comments
* Detecting type ignore comments that are part of another suppression
comment: `# fmt: skip # type: ignore`
* Suppressing specific lints `type: ignore [code]`
* Detecting unsused type ignore comments
* ...
The goal is to add this functionality in separate PRs.
## Test Plan
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
<!--
Thank you for contributing to Ruff! To help us out with reviewing,
please consider the following:
- Does this pull request include a summary of the change? (See below.)
- Does this pull request include a descriptive title?
- Does this pull request include references to any relevant issues?
-->
## Summary
<!-- What's the purpose of the change? What does it do, and why? -->
Fix#11482. Applies
https://github.com/adamchainz/flake8-comprehensions/pull/205 to ruff.
`C416` should be skipped if comprehension contains unpacking. Here's an
example:
```python
list_of_lists = [[1, 2], [3, 4]]
# ruff suggests `list(list_of_lists)` here, but that would change the result.
# `list(list_of_lists)` is not `[(1, 2), (3, 4)]`
a = [(x, y) for x, y in list_of_lists]
# This is equivalent to `list(list_of_lists)`
b = [x for x in list_of_lists]
```
## Test Plan
<!-- How was it tested? -->
Existing checks
---------
Signed-off-by: harupy <hkawamura0130@gmail.com>
## Summary
resolves#14883
This PR removes the known limitation section in the documentation of
`eq-without-hash`. That is not actually a limitation as a subclass
overriding the `__eq__` method would have its `__hash__` set to `None`
implicitly. The user should explicitly inherit the `__hash__` method
from the parent class.
## Test Plan
<img width="619" alt="Screenshot 2024-12-20 at 2 02 47 PM"
src="https://github.com/user-attachments/assets/552defcd-25e1-4153-9ab9-e5b9d5fbe8cc"
/>
---------
Co-authored-by: Dhruv Manilawala <dhruvmanila@gmail.com>
## Summary
Airflow 3.0 removes various deprecated functions, members, modules, and
other values. They have been deprecated in 2.x, but the removal causes
incompatibilities that we want to detect. This PR deprecates the
following names and add a function for removed methods
* `airflow.datasets.manager.DatasetManager.register_dataset_change` →
`airflow.assets.manager.AssetManager.register_asset_change`
* `airflow.datasets.manager.DatasetManager.create_datasets` →
`airflow.assets.manager.AssetManager.create_assets`
* `airflow.datasets.manager.DatasetManager.notify_dataset_created` →
`airflow.assets.manager.AssetManager.notify_asset_created`
* `airflow.datasets.manager.DatasetManager.notify_dataset_changed` →
`airflow.assets.manager.AssetManager.notify_asset_changed`
* `airflow.datasets.manager.DatasetManager.notify_dataset_alias_created`
→ `airflow.assets.manager.AssetManager.notify_asset_alias_created`
*
`airflow.providers.amazon.auth_manager.aws_auth_manager.AwsAuthManager.is_authorized_dataset`
→
`airflow.providers.amazon.auth_manager.aws_auth_manager.AwsAuthManager.is_authorized_asset`
* `airflow.lineage.hook.HookLineageCollector.create_dataset` →
`airflow.lineage.hook.HookLineageCollector.create_asset`
* `airflow.lineage.hook.HookLineageCollector.add_input_dataset` →
`airflow.lineage.hook.HookLineageCollector.add_input_asset`
* `airflow.lineage.hook.HookLineageCollector.add_output_dataset` →
`airflow.lineage.hook.HookLineageCollector.dd_output_asset`
* `airflow.lineage.hook.HookLineageCollector.collected_datasets` →
`airflow.lineage.hook.HookLineageCollector.collected_assets`
*
`airflow.providers_manager.ProvidersManager.initialize_providers_dataset_uri_resources`
→
`airflow.providers_manager.ProvidersManager.initialize_providers_asset_uri_resources`
## Test Plan
A test fixture is included in the PR.
When confronted with `raise from exc` the parser will now create a
`StmtRaise` that has `None` for the exception and `exc` for the cause.
Before, the parser created a `StmtRaise` with `from` for the exception,
no cause, and a spurious expression `exc` afterwards.
## Summary
A follow up PR on https://github.com/astral-sh/ruff/issues/14991
Ruff ignores hardcoded passwords for typed variables. Add a rule to
catch passwords in typed code bases
## Test Plan
Includes 2 more test typed variables
We have a handy `to_meta_type` that does the right thing for class
instances, and also works for all of the other types that are “instances
of” something. Unless I'm missing something, this should let us get rid
of the catch-all clause in one fell swoop.
cf #14548
## Summary
I'm currently on the fence about landing the #14760 PR because it's
unclear how we'd support tracking used and unused suppression comments
in a performant way:
* Salsa adds an "untracked" dependency to every query reading
accumulated values. This has the effect that the query re-runs on every
revision. For example, a possible future query
`unused_suppression_comments(db, file)` would re-run on every
incremental change and for every file. I don't expect the operation
itself to be expensive, but it all adds up in a project with 100k+ files
* Salsa collects the accumulated values by traversing the entire query
dependency graph. It can skip over sub-graphs if it is known that they
contain no accumulated values. This makes accumulators a great tool for
when they are rare; diagnostics are a good example. Unfortunately,
suppressions are more common, and they often appear in many different
files, making the "skip over subgraphs" optimization less effective.
Because of that, I want to wait to adopt salsa accumulators for type
check diagnostics (we could start using them for other diagnostics)
until we have very specific reasons that justify regressing incremental
check performance.
This PR does a "small" refactor that brings us closer to what I have in
#14760 but without using accumulators. To emit a diagnostic, a method
needs:
* Access to the db
* Access to the currently checked file
This PR introduces a new `InferContext` that holds on to the db, the
current file, and the reported diagnostics. It replaces the
`TypeCheckDiagnosticsBuilder`. We pass the `InferContext` instead of the
`db` to methods that *might* emit diagnostics. This simplifies some of
the `Outcome` methods, which can now be called with a context instead of
a `db` and the diagnostics builder. Having the `db` and the file on a
single type like this would also be useful when using accumulators.
This PR doesn't solve the issue that the `Outcome` types feel somewhat
complicated nor that it can be annoying when you need to report a
`Diagnostic,` but you don't have access to an `InferContext` (or the
file). However, I also believe that accumulators won't solve these
problems because:
* Even with accumulators, it's necessary to have a reference to the file
that's being checked. The struggle would be to get a reference to that
file rather than getting a reference to `InferContext`.
* Users of the `HasTy` trait (e.g., a linter) don't want to bother
getting the `File` when calling `Type::return_ty` because they aren't
interested in the created diagnostics. They just want to know what
calling the current expression would return (and if it even is a
callable). This is what the different methods of `Outcome` enable today.
I can ask for the return type without needing extra data that's only
relevant for emitting a diagnostic.
A shortcoming of this approach is that it is now a bit confusing when to
pass `db` and when an `InferContext`. An option is that we'd make the
`file` on `InferContext` optional (it won't collect any diagnostics if
`None`) and change all methods on `Type` to take `InferContext` as the
first argument instead of a `db`. I'm interested in your opinion on
this.
Accumulators are definitely harder to use incorrectly because they
remove the need to merge the diagnostics explicitly and there's no risk
that we accidentally merge the diagnostics twice, resulting in
duplicated diagnostics. I still value performance more over making our
life slightly easier.
Closes#14000
## Summary
For typing context bindings we know that they won't be available at
runtime. We shouldn't recommend a fix, that will result in name errors
at runtime.
## Test Plan
`cargo nextest run`
This tweaks the new semantics from #15026 a bit when a symbol could be
interpreted both as an attribute and a submodule of a package. For
`from...import`, we should actually prioritize the attribute, because of
how the statement itself is implemented [1].
> 1. check if the imported module has an attribute by that name
> 2. if not, attempt to import a submodule with that name and then check
the imported module again for that attribute
[1] https://docs.python.org/3/reference/simple_stmts.html#the-import-statement
## Summary
Fixes#14550.
Add `AlwaysTruthy` and `AlwaysFalsy` types, representing the set of objects whose `__bool__` method can only ever return `True` or `False`, respectively, and narrow `if x` and `if not x` accordingly.
## Test Plan
- New Markdown test for truthiness narrowing `narrow/truthiness.md`
- unit tests in `types.rs` and `builders.rs` (`cargo test --package
red_knot_python_semantic --lib -- types`)
## Summary
Fixes https://github.com/astral-sh/ruff/issues/15027
The `MemoryFileSystem::write_file` API automatically creates
non-existing ancestor directoryes
but we failed to update the status of the now created ancestor
directories in the `Files` data structure.
## Test Plan
Tested that the case in https://github.com/astral-sh/ruff/issues/15027
now passes regardless of whether the *Simple* case is commented out or
not
Fixes#15012.
```python
def f():
# panics when the code can't find the loop variable
values = [1, 2, 3]
result = []
for i in values:
result.append(i + 1)
del i
```
I'm not sure exactly why this test case panics, but I suspect the `del
i` removes the binding from the semantic model's symbols.
I changed the code to search for the correct binding by directly
iterating through the bindings. Since we know exactly which binding we
want, this should find the loop variable without any complications.
## Summary
This PR updates the logic when raising conflicting declarations
diagnostic to avoid the undeclared path if present.
The conflicting declaration diagnostics is added when there are two or
more declarations in the control flow path of a definition whose type
isn't equivalent to each other. This can be seen in the following
example:
```py
if flag:
x: int
x = 1 # conflicting-declarations: Unknown, int
```
After this PR, we'd avoid considering "Unknown" as part of the
conflicting declarations. This means we'd still flag it for the
following case:
```py
if flag:
x: int
else:
x: str
x = 1 # conflicting-declarations: int, str
```
A solution that's local to the exception control flow was also explored
which required updating the logic for merging the flow snapshot to avoid
considering declarations using a flag. This is preserved here:
https://github.com/astral-sh/ruff/compare/dhruv/control-flow-no-declarations?expand=1.
The main motivation to avoid that is we don't really understand what the
user experience is w.r.t. the Unknown type and the
conflicting-declaration diagnostics. This makes us unsure on what the
right semantics are as to whether that diagnostics should be raised or
not and when to raise them. For now, we've decided to move forward with
this PR and could decide to adopt another solution or remove the
conflicting-declaration diagnostics in the future.
Closes: #13966
## Test Plan
Update the existing mdtest case. Add an additional case specific to
exception control flow to verify that the diagnostic is not being raised
now.
When importing a nested module, we were correctly creating a binding for
the top-most parent, but we were binding that to the nested module, not
to that parent module. Moreover, we weren't treating those submodules as
members of their containing parents. This PR addresses both issues, so
that nested imports work as expected.
As discussed in ~Slack~ whatever chat app I find myself in these days
😄, this requires keeping track of which modules have been imported
within the current file, so that when we resolve member access on a
module reference, we can see if that member has been imported as a
submodule. If so, we return the submodule reference immediately, instead
of checking whether the parent module's definition defines the symbol.
This is currently done in a flow insensitive manner. The `SemanticIndex`
now tracks all of the modules that are imported (via `import`, not via
`from...import`). The member access logic mentioned above currently only
considers module imports in the file containing the attribute
expression.
---------
Co-authored-by: Carl Meyer <carl@astral.sh>
This PR introduces three changes to `D403`, which has to do with
capitalizing the first word in a docstring.
1. The diagnostic and fix now skip leading whitespace when determining
what counts as "the first word".
2. The name has been changed to `first-word-uncapitalized` from
`first-line-capitalized`, for both clarity and compliance with our rule
naming policy.
3. The diagnostic message and documentation has been modified slightly
to reflect this.
Closes#14890
Fixes#14969.
The issue was that this line:
```rust
let from_assign_to_loop = TextRange::new(binding_stmt.end(), for_stmt.start());
```
was not safe if the binding was after the target. The only way (at least
that I can think of) this can happen is if they are in different scopes,
so it now checks for that before checking if there are usages between
the two.
## Summary
The summary is misleading, as well as the
`whitespace-after-open-bracket` and `whitespace-before-close-bracket`
names - it's not only brackets, but also parentheses and braces. Align
the documentation with the actual behaviour.
Don't change the names, but align the documentation with the behaviour.
## Test Plan
No test (documentation).
## Summary
This change adds `name` and `default` functions to `TypeParam` to access
the corresponding attributes more conveniently. I currently have these
as helper functions in code built on top of ruff_python_ast, and they
seemed like they might be generally useful.
## Test Plan
Ran the checks listed in CONTRIBUTING.md#development.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
Co-authored-by: Alex Waygood <Alex.Waygood@Gmail.com>
## Summary
A class is an instance of its metaclass, so `ClassLiteral("ABC")` is not
disjoint from `Instance("ABCMeta")`. However, we erroneously consider
the two types disjoint on the `main` branch. This PR fixes that.
This bug was uncovered by adding some more core types to the property
tests that provide coverage for classes that have custom metaclasses.
The additions to the property tests are included in this PR.
## Test Plan
New unit tests and property tests added. Tested with:
- `cargo test -p red_knot_python_semantic`
- `QUICKCHECK_TESTS=100000 cargo test -p red_knot_python_semantic --
--ignored types::property_tests::stable`
The assignability property test fails on this branch, but that's a known
issue that exists on `main`, due to
https://github.com/astral-sh/ruff/issues/14899.
## Summary
Teach red-knot that `type[...]` is always disjoint from `None` and from
`LiteralString`. Fixes#14925.
This should properly be generalized to "all instances of final types
which are not subclasses of `type`", but until we support finality,
hardcoding `None` (which is known to be final) allows us to fix the
subtype transitivity property test.
## Test Plan
Existing tests pass, added new unit tests for `is_disjoint_from` and
`is_subtype_of`.
`QUICKCHECK_TESTS=100000 cargo test -p red_knot_python_semantic --
--ignored types::property_tests::stable` fails only the "assignability
is reflexive" test, which is known to fail on `main` (#14899).
The same command, with `property_tests.rs` edited to prevent generating
intersection tests (the cause of #14899), passes all quickcheck tests.