## Summary
Add copyright notice detection to enforce the presence of copyright
headers in Python files.
Configurable settings include: the relevant regular expression, the
author name, and the minimum file size, similar to
[flake8-copyright](https://github.com/savoirfairelinux/flake8-copyright).
Closes https://github.com/charliermarsh/ruff/issues/3579
---------
Signed-off-by: ryan <ryang@waabi.ai>
Co-authored-by: Charlie Marsh <charlie.r.marsh@gmail.com>
* Document codes.rs
* Refactor codes.rs before merging
Helper script:
```python
# %%
from pathlib import Path
codes = Path("crates/ruff/src/codes.rs").read_text().splitlines()
rules = Path("a.txt").read_text().strip().splitlines()
rule_map = {i.split("::")[-1]: i for i in rules}
# %%
codes_new = []
for line in codes:
if ", Rule::" in line:
left, right = line.split(", Rule::")
right = right[:-2]
line = left + ", " + rule_map[right] + "),"
codes_new.append(line)
# %%
Path("crates/ruff/src/codes.rs").write_text("\n".join(codes_new))
```
Co-authored-by: Jonathan Plasse <13716151+JonathanPlasse@users.noreply.github.com>
## Summary
This PR moves `Diagnostic`, `DiagnosticKind`, and `Fix` into their own crate, which will enable us to further split up Ruff, since sub-linter crates (which need to implement functions that return `Diagnostic`) can now depend on `ruff_diagnostics` rather than Ruff.
This PR introduces a new `CacheKey` trait for types that can be used as a cache key.
I'm not entirely sure if this is worth the "overhead", but I was surprised to find `HashableHashSet` and got scared when I looked at the time complexity of the `hash` function. These implementations must be extremely slow in hashed collections.
I then searched for usages and quickly realized that only the cache uses these `Hash` implementations, where performance is less sensitive.
This PR introduces a new `CacheKey` trait to communicate the difference between a hash and computing a key for the cache. The new trait can be implemented for types that don't implement `Hash` for performance reasons, and we can define additional constraints on the implementation: For example, we'll want to enforce portability when we add remote caching support. Using a different trait further allows us not to implement it for types without stable identities (e.g. pointers) or use other implementations than the standard hash function.
For example:
$ ruff check --select=EM<Tab>
EM -- flake8-errmsg
EM10 EM1 --
EM101 -- raw-string-in-exception
EM102 -- f-string-in-exception
EM103 -- dot-format-in-exception
(You will need to enable autocompletion as described
in the Autocompletion section in the README.)
Fixes#2808.
(The --help help change in the README is due to a clap bug,
for which I already submitted a fix:
https://github.com/clap-rs/clap/pull/4710.)
Currently the define_rule_mapping! macro generates both the Rule enum as
well as the RuleCodePrefix enum and the mapping between the two. After
this commit series the macro will only generate the Rule enum and the
RuleCodePrefix enum and the mapping will be generated by a new map_codes
proc macro, so we rename the macro now to fit its new purpose.
Rule::noqa_code previously return a single &'static str,
which was possible because we had one enum listing all
rule code prefixes. This commit series will however split up
the RuleCodePrefix enum into several enums ... so we'll end up
with two &'static str ... this commit wraps the return type
of Rule::noqa_code into a newtype so that we can easily change
it to return two &'static str in the 6th commit of this series.
Post this commit series several codes can be mapped to a single rule,
this commit therefore renames Rule::code to Rule::noqa_code,
which is the code that --add-noqa will add to ignore a rule.
if_all_same(codes.values().cloned()).unwrap_or_default()
was quite unreadable because it wasn't obvious that codes.values() are
the prefixes. It's better to introduce another Map rather than having
Maps within Maps.