Commit graph

11 commits

Author SHA1 Message Date
renovate[bot]
388658efdb
Update pre-commit dependencies (#10698)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Zanie Blue <contact@zanie.dev>
Co-authored-by: Alex Waygood <alex.waygood@gmail.com>
2024-04-06 23:00:41 +00:00
Charlie Marsh
7c12eaf322
Use characters instead of u32 in confusable map (#8463) 2023-11-03 09:57:47 -04:00
Christopher Covington
9f30ccc1f4
Autoformat confusable units (#4430)
I've seen errors crop up from using the different micro and mu
characters. Follow matching recommendations on which character to prefer
for micro, ohm, and angstrom. References:
* Section 22.2 Letterlike Symbols, subsection Unit Symbols, page 877 of
[The Unicode Standard, Version 15.0

](https://www.unicode.org/versions/Unicode15.0.0/UnicodeStandard-15.0.pdf)
* Section 2.5 Duplicated Characters of [Unicode Technical Report
25](https://www.unicode.org/reports/tr25/)
* [SI
brochure](https://www.bipm.org/documents/20126/41483022/SI-Brochure-9-EN.pdf)
*
https://github.com/unicode-org/icu/blob/main/icu4c/source/data/unidata/confusables.txt
2023-11-03 04:58:43 +00:00
Charlie Marsh
31286e1c95
Re-run scripts/update_ambiguous_characters.py (#8459)
These weren't formatted consistently, and when I re-ran, the formatting
changed a bit, so I'm editing the script to keep that file constant.
2023-11-03 04:50:10 +00:00
Charlie Marsh
5849a75223
Rename ruff crate to ruff_linter (#7529) 2023-09-20 08:38:27 +02:00
Charlie Marsh
574c0e0105
Use match instead of phf for confusable lookup (#5953)
I don't know whether we want to make this change but here's some data...

Binary size:

- `main`: 30,384
- `charlie/match-phf`: 30,416

llvm-lines:

- `main`: 1,784,148
- `charlie/match-phf`: 1,789,877

llvm-lines and binary size are both unchanged (or, by < 5) when moving
from `u8` to `u32` return types, and even when moving to `char` keys and
values. I didn't expect this, but I'm not very knowledgable on this
topic.

Performance:

```
Confusables/match/src   time:   [4.9102 µs 4.9352 µs 4.9777 µs]
                        change: [+1.7469% +2.2421% +2.8710%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild
  6 (6.00%) high severe
Confusables/match-with-skip/src
                        time:   [2.0676 µs 2.0945 µs 2.1317 µs]
                        change: [+0.9384% +1.6000% +2.3920%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe
Confusables/phf/src     time:   [31.087 µs 31.188 µs 31.305 µs]
                        change: [+1.9262% +2.2188% +2.5496%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  3 (3.00%) low mild
  6 (6.00%) high mild
  6 (6.00%) high severe
Confusables/phf-with-skip/src
                        time:   [2.0470 µs 2.0486 µs 2.0502 µs]
                        change: [-0.3093% -0.1446% +0.0106%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe
```

The `-with-skip` variants add our optimization which first checks
whether the character is ASCII. So `match` is way, way faster than PHF,
but it tends not to matter since almost all source code is ASCII anyway.
2023-07-24 02:23:36 +00:00
qdegraaf
93b2bd7184
[perflint] Add PERF401 and PERF402 rules (#5298)
## Summary

Adds `PERF401` and `PERF402` mirroring `W8401` and `W8402` from
https://github.com/tonybaloney/perflint

Implementation is not super smart but should be at parity with upstream
implementation judging by:
c07391c176/perflint/comprehension_checker.py (L42-L73)

It essentially checks:

- If the body of a for-loop is just one statement
- If that statement is an `if` and the if-statement contains a call to
`append()` we flag `PERF401` and suggest a list comprehension
- If that statement is a plain call to `append()` or `insert()` we flag
`PERF402` and suggest `list()` or `list.copy()`

I've set the violation to only flag the first append call in a long
`if-else` statement for `PERF401`. Happy to change this to some other
location or make it multiple violations if that makes more sense.

## Test Plan

Fixtures were added with the relevant scenarios for both rules

## Issue Links

Refers: https://github.com/astral-sh/ruff/issues/4789
2023-07-03 04:03:09 +00:00
Charlie Marsh
f9f0cf7524
Use __future__ imports in scripts (#5301) 2023-06-22 11:40:16 -04:00
konstin
651d89794c
Use phf for confusables to reduce llvm lines (#4926)
* Use phf for confusables to reduce llvm lines

## Summary

This replaces FxHashMap for the confusables with a perfect hash map from the [phf crate](https://github.com/rust-phf/rust-phf) to reduce the generated llvm instructions.

A perfect hash function is one that doesn't have any collisions. We can build one because we know all keys at compile time. This improves hashmap efficiency, even though this is likely not noticeable in our case (except someone has a large non-english crate to test on).

The original hashmap contained a lot of duplicates, which i had to remove when phf_map complained, i did so by sorting the keys.

The important part that it reduces the llvm instructions generated (#3808, `RUSTFLAGS="-Csymbol-mangling-version=v0" cargo llvm-lines -p ruff --lib | head -20`):

```
  Lines                  Copies               Function name
  -----                  ------               -------------
  1740502                38973                (TOTAL)
    27423 (1.6%,  1.6%)      1 (0.0%,  0.0%)  ruff[cef4c65d96248843]::rules::ruff::rules::confusables::CONFUSABLES::{closure#0}
    10193 (0.6%,  2.2%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::RuleCodePrefix>::iter
     8107 (0.5%,  2.6%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::Rule>::noqa_code
     7345 (0.4%,  3.0%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_stmt
     6412 (0.4%,  3.4%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
     6412 (0.4%,  3.8%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
     6409 (0.4%,  4.2%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
     5696 (0.3%,  4.5%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_expr
     4448 (0.3%,  4.7%)      1 (0.0%,  0.0%)  ruff[cef4c65d96248843]::flake8_to_ruff::converter::convert
     3702 (0.2%,  4.9%)      1 (0.0%,  0.0%)  <&ruff[cef4c65d96248843]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
     3349 (0.2%,  5.1%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::registry::Linter>::code_for_rule
     3132 (0.2%,  5.3%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
     3130 (0.2%,  5.5%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<&ruff[cef4c65d96248843]::codes::Rule>>::from
     3130 (0.2%,  5.7%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<ruff[cef4c65d96248843]::codes::Rule>>::from
     3130 (0.2%,  5.9%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
     3128 (0.2%,  6.0%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::RuleIter>::get
     2669 (0.2%,  6.2%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
```
After:
```
  Lines                  Copies               Function name
  -----                  ------               -------------
  1710487                38900                (TOTAL)
    10193 (0.6%,  0.6%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::RuleCodePrefix>::iter
     8107 (0.5%,  1.1%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::Rule>::noqa_code
     7345 (0.4%,  1.5%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_stmt
     6412 (0.4%,  1.9%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
     6412 (0.4%,  2.2%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
     6409 (0.4%,  2.6%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
     5696 (0.3%,  3.0%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_expr
     4448 (0.3%,  3.2%)      1 (0.0%,  0.0%)  ruff[52408f46d2058296]::flake8_to_ruff::converter::convert
     3702 (0.2%,  3.4%)      1 (0.0%,  0.0%)  <&ruff[52408f46d2058296]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
     3349 (0.2%,  3.6%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::registry::Linter>::code_for_rule
     3132 (0.2%,  3.8%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
     3130 (0.2%,  4.0%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<&ruff[52408f46d2058296]::codes::Rule>>::from
     3130 (0.2%,  4.2%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<ruff[52408f46d2058296]::codes::Rule>>::from
     3130 (0.2%,  4.4%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
     3128 (0.2%,  4.5%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::RuleIter>::get
     2669 (0.2%,  4.7%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
     2659 (0.2%,  4.9%)      1 (0.0%,  0.0%)  <&ruff[52408f46d2058296]::codes::Pylint as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
```

I'd assume this has a positive effect both on compile time and on runtime, but i don't know the actual effect on compile times and can't really measure.

## Test plan

Check CI for any performance regressions.

This should fix #3808 if we merge it.

* clippy

* Update update_ambiguous_characters.py
2023-06-08 08:13:20 +02:00
konstin
550b643e33
Add script for ecosystem wide checks of all rules and fixes (#4326)
* Add script for ecosystem wide checks of all rules and fixes

This adds my personal script for checking an entire checkout of ~2.1k packages for
panics, autofix errors and similar problems. It's not really meant to be used by anybody else but i thought it's better if it lives in the repo than if it doesn't.

For reference, this is the current output of failing autofixes: https://gist.github.com/konstin/c3fada0135af6cacec74f166adf87a00. Trimmed down to the useful information: https://gist.github.com/konstin/c864f4c300c7903a24fdda49635c5da9

* Keep github template intact

* Remove the need for ripgrep

* sort output
2023-05-22 15:23:25 +02:00
Aarni Koskela
d7a369e7dc
Update confusable character mapping (#4274) 2023-05-08 14:20:44 -04:00