ruff/crates
konstin 651d89794c
Use phf for confusables to reduce llvm lines (#4926)
* Use phf for confusables to reduce llvm lines

## Summary

This replaces FxHashMap for the confusables with a perfect hash map from the [phf crate](https://github.com/rust-phf/rust-phf) to reduce the generated llvm instructions.

A perfect hash function is one that doesn't have any collisions. We can build one because we know all keys at compile time. This improves hashmap efficiency, even though this is likely not noticeable in our case (except someone has a large non-english crate to test on).

The original hashmap contained a lot of duplicates, which i had to remove when phf_map complained, i did so by sorting the keys.

The important part that it reduces the llvm instructions generated (#3808, `RUSTFLAGS="-Csymbol-mangling-version=v0" cargo llvm-lines -p ruff --lib | head -20`):

```
  Lines                  Copies               Function name
  -----                  ------               -------------
  1740502                38973                (TOTAL)
    27423 (1.6%,  1.6%)      1 (0.0%,  0.0%)  ruff[cef4c65d96248843]::rules::ruff::rules::confusables::CONFUSABLES::{closure#0}
    10193 (0.6%,  2.2%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::RuleCodePrefix>::iter
     8107 (0.5%,  2.6%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::Rule>::noqa_code
     7345 (0.4%,  3.0%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_stmt
     6412 (0.4%,  3.4%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
     6412 (0.4%,  3.8%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
     6409 (0.4%,  4.2%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
     5696 (0.3%,  4.5%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::checkers::ast::Checker as ruff_python_ast[3778b140caf21545]::visitor::Visitor>::visit_expr
     4448 (0.3%,  4.7%)      1 (0.0%,  0.0%)  ruff[cef4c65d96248843]::flake8_to_ruff::converter::convert
     3702 (0.2%,  4.9%)      1 (0.0%,  0.0%)  <&ruff[cef4c65d96248843]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
     3349 (0.2%,  5.1%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::registry::Linter>::code_for_rule
     3132 (0.2%,  5.3%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
     3130 (0.2%,  5.5%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<&ruff[cef4c65d96248843]::codes::Rule>>::from
     3130 (0.2%,  5.7%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<ruff[cef4c65d96248843]::codes::Rule>>::from
     3130 (0.2%,  5.9%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
     3128 (0.2%,  6.0%)      1 (0.0%,  0.0%)  <ruff[cef4c65d96248843]::codes::RuleIter>::get
     2669 (0.2%,  6.2%)      1 (0.0%,  0.0%)  <<ruff[cef4c65d96248843]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
```
After:
```
  Lines                  Copies               Function name
  -----                  ------               -------------
  1710487                38900                (TOTAL)
    10193 (0.6%,  0.6%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::RuleCodePrefix>::iter
     8107 (0.5%,  1.1%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::Rule>::noqa_code
     7345 (0.4%,  1.5%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_stmt
     6412 (0.4%,  1.9%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:spanned::SpannedDeserializer<toml_edit[7e3a6c5e67260672]:🇩🇪:value::ValueDeserializer>>
     6412 (0.4%,  2.2%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:table::TableMapAccess>
     6409 (0.4%,  2.6%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_map::<toml_edit[7e3a6c5e67260672]:🇩🇪:datetime::DatetimeDeserializer>
     5696 (0.3%,  3.0%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::checkers::ast::Checker as ruff_python_ast[5588cd60041c8605]::visitor::Visitor>::visit_expr
     4448 (0.3%,  3.2%)      1 (0.0%,  0.0%)  ruff[52408f46d2058296]::flake8_to_ruff::converter::convert
     3702 (0.2%,  3.4%)      1 (0.0%,  0.0%)  <&ruff[52408f46d2058296]::registry::Linter as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
     3349 (0.2%,  3.6%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::registry::Linter>::code_for_rule
     3132 (0.2%,  3.8%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::fmt::Debug>::fmt
     3130 (0.2%,  4.0%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<&ruff[52408f46d2058296]::codes::Rule>>::from
     3130 (0.2%,  4.2%)      1 (0.0%,  0.0%)  <&str as core[da82827a87f140f9]::convert::From<ruff[52408f46d2058296]::codes::Rule>>::from
     3130 (0.2%,  4.4%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::Rule as core[da82827a87f140f9]::convert::AsRef<str>>::as_ref
     3128 (0.2%,  4.5%)      1 (0.0%,  0.0%)  <ruff[52408f46d2058296]::codes::RuleIter>::get
     2669 (0.2%,  4.7%)      1 (0.0%,  0.0%)  <<ruff[52408f46d2058296]::settings::options::Options as serde[d89b1b632568f5a3]:🇩🇪:Deserialize>::deserialize::__Visitor as serde[d89b1b632568f5a3]:🇩🇪:Visitor>::visit_seq::<toml_edit[7e3a6c5e67260672]:🇩🇪:array::ArraySeqAccess>
     2659 (0.2%,  4.9%)      1 (0.0%,  0.0%)  <&ruff[52408f46d2058296]::codes::Pylint as core[da82827a87f140f9]::iter::traits::collect::IntoIterator>::into_iter
```

I'd assume this has a positive effect both on compile time and on runtime, but i don't know the actual effect on compile times and can't really measure.

## Test plan

Check CI for any performance regressions.

This should fix #3808 if we merge it.

* clippy

* Update update_ambiguous_characters.py
2023-06-08 08:13:20 +02:00
..
flake8_to_ruff Bump version to 0.0.272 (#4948) 2023-06-08 02:17:29 +00:00
ruff Use phf for confusables to reduce llvm lines (#4926) 2023-06-08 08:13:20 +02:00
ruff_benchmark Add Formatter benchmark (#4860) 2023-06-05 21:05:42 +02:00
ruff_cache Rename ruff_python_semantic's Context struct to SemanticModel (#4565) 2023-05-22 02:35:03 +00:00
ruff_cli Bump version to 0.0.272 (#4948) 2023-06-08 02:17:29 +00:00
ruff_dev Add a ruff_textwrap crate (#4731) 2023-05-31 16:35:23 +00:00
ruff_diagnostics Use a separate fix-isolation group for every parent node (#4774) 2023-06-02 03:07:55 +00:00
ruff_formatter Rename ruff_formatter::builders::BestFitting to FormatBestFitting (#4841) 2023-06-04 00:13:51 +02:00
ruff_index Introduce ruff_index crate (#4597) 2023-05-23 17:40:35 +02:00
ruff_macros Merge registry into codes (#4651) 2023-06-02 10:33:01 +00:00
ruff_newlines Add a ruff_textwrap crate (#4731) 2023-05-31 16:35:23 +00:00
ruff_python_ast Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
ruff_python_formatter Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
ruff_python_semantic Upgrade RustPython (#4900) 2023-06-08 05:53:14 +00:00
ruff_python_stdlib Lint pyproject.toml (#4496) 2023-05-25 12:05:28 +00:00
ruff_rustpython Re-integrate RustPython parser repository (#4359) 2023-05-11 07:47:17 +00:00
ruff_testing_macros testing_macros: Add missing full feature to syn dependency (#4722) 2023-05-30 07:42:06 +00:00
ruff_textwrap Add a ruff_textwrap crate (#4731) 2023-05-31 16:35:23 +00:00
ruff_wasm Add pyflakes.extend-generics setting (#4677) 2023-06-01 22:19:37 +00:00