## Summary
This is the second PR out of three that adds support for
enabling/disabling lint rules in Red Knot. You may want to take a look
at the [first PR](https://github.com/astral-sh/ruff/pull/14869) in this
stack to familiarize yourself with the used terminology.
This PR adds a new syntax to define a lint:
```rust
declare_lint! {
/// ## What it does
/// Checks for references to names that are not defined.
///
/// ## Why is this bad?
/// Using an undefined variable will raise a `NameError` at runtime.
///
/// ## Example
///
/// ```python
/// print(x) # NameError: name 'x' is not defined
/// ```
pub(crate) static UNRESOLVED_REFERENCE = {
summary: "detects references to names that are not defined",
status: LintStatus::preview("1.0.0"),
default_level: Level::Warn,
}
}
```
A lint has a name and metadata about its status (preview, stable,
removed, deprecated), the default diagnostic level (unless the
configuration changes), and documentation. I use a macro here to derive
the kebab-case name and extract the documentation automatically.
This PR doesn't yet add any mechanism to discover all known lints. This
will be added in the next and last PR in this stack.
## Documentation
I documented some rules but then decided that it's probably not my best
use of time if I document all of them now (it also means that I play
catch-up with all of you forever). That's why I left some rules
undocumented (marked with TODO)
## Where is the best place to define all lints?
I'm not sure. I think what I have in this PR is fine but I also don't
love it because most lints are in a single place but not all of them. If
you have ideas, let me know.
## Why is the message not part of the lint, unlike Ruff's `Violation`
I understand that the main motivation for defining `message` on
`Violation` in Ruff is to remove the need to repeat the same message
over and over again. I'm not sure if this is an actual problem. Most
rules only emit a diagnostic in a single place and they commonly use
different messages if they emit diagnostics in different code paths,
requiring extra fields on the `Violation` struct.
That's why I'm not convinced that there's an actual need for it and
there are alternatives that can reduce the repetition when creating a
diagnostic:
* Create a helper function. We already do this in red knot with the
`add_xy` methods
* Create a custom `Diagnostic` implementation that tailors the entire
diagnostic and pre-codes e.g. the message
Avoiding an extra field on the `Violation` also removes the need to
allocate intermediate strings as it is commonly the place in Ruff.
Instead, Red Knot can use a borrowed string with `format_args`
## Test Plan
`cargo test`
## Summary
I used `codespell` and `gramma` to identify mispellings and grammar
errors throughout the codebase and fixed them. I tried not to make any
controversial changes, but feel free to revert as you see fit.
## Summary
It's common practice to name derive macros the same as the trait that they implement (`Debug`, `Display`, `Eq`, `Serialize`, ...).
This PR renames the `ConfigurationOptions` derive macro to `OptionsMetadata` to match the trait name.
## Test Plan
`cargo build`
* Document codes.rs
* Refactor codes.rs before merging
Helper script:
```python
# %%
from pathlib import Path
codes = Path("crates/ruff/src/codes.rs").read_text().splitlines()
rules = Path("a.txt").read_text().strip().splitlines()
rule_map = {i.split("::")[-1]: i for i in rules}
# %%
codes_new = []
for line in codes:
if ", Rule::" in line:
left, right = line.split(", Rule::")
right = right[:-2]
line = left + ", " + rule_map[right] + "),"
codes_new.append(line)
# %%
Path("crates/ruff/src/codes.rs").write_text("\n".join(codes_new))
```
Co-authored-by: Jonathan Plasse <13716151+JonathanPlasse@users.noreply.github.com>
This PR introduces a new `CacheKey` trait for types that can be used as a cache key.
I'm not entirely sure if this is worth the "overhead", but I was surprised to find `HashableHashSet` and got scared when I looked at the time complexity of the `hash` function. These implementations must be extremely slow in hashed collections.
I then searched for usages and quickly realized that only the cache uses these `Hash` implementations, where performance is less sensitive.
This PR introduces a new `CacheKey` trait to communicate the difference between a hash and computing a key for the cache. The new trait can be implemented for types that don't implement `Hash` for performance reasons, and we can define additional constraints on the implementation: For example, we'll want to enforce portability when we add remote caching support. Using a different trait further allows us not to implement it for types without stable identities (e.g. pointers) or use other implementations than the standard hash function.
Currently the define_rule_mapping! macro generates both the Rule enum as
well as the RuleCodePrefix enum and the mapping between the two. After
this commit series the macro will only generate the Rule enum and the
RuleCodePrefix enum and the mapping will be generated by a new map_codes
proc macro, so we rename the macro now to fit its new purpose.
```console
❯ cargo run rule B017
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
Running `target/debug/ruff rule B017`
no-assert-raises-exception
Code: B017 (flake8-bugbear)
### What it does
Checks for `self.assertRaises(Exception)`.
## Why is this bad?
`assertRaises(Exception)` can lead to your test passing even if the
code being tested is never executed due to a typo.
Either assert for a more specific exception (builtin or custom), use
`assertRaisesRegex` or the context manager form of `assertRaises`.
```