## Summary
I'll write up a more detailed description tomorrow, but in short, this
PR removes our regex-based implementation in favor of "manual" parsing.
I tried a couple different implementations. In the benchmarks below:
- `Directive/Regex` is our implementation on `main`.
- `Directive/Find` just uses `text.find("noqa")`, which is insufficient,
since it doesn't cover case-insensitive variants like `NOQA`, and
doesn't handle multiple `noqa` matches in a single like, like ` # Here's
a noqa comment # noqa: F401`. But it's kind of a baseline.
- `Directive/Memchr` uses three `memchr` iterative finders (one for
`noqa`, `NOQA`, and `NoQA`).
- `Directive/AhoCorasick` is roughly the variant checked-in here.
The raw results:
```
Directive/Regex/# noqa: F401
time: [273.69 ns 274.71 ns 276.03 ns]
change: [+1.4467% +1.8979% +2.4243%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
3 (3.00%) low mild
8 (8.00%) high mild
4 (4.00%) high severe
Directive/Find/# noqa: F401
time: [66.972 ns 67.048 ns 67.132 ns]
change: [+2.8292% +2.9377% +3.0540%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
1 (1.00%) low severe
3 (3.00%) low mild
8 (8.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa: F401
time: [76.922 ns 77.189 ns 77.536 ns]
change: [+0.4265% +0.6862% +0.9871%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Memchr/# noqa: F401
time: [62.627 ns 62.654 ns 62.679 ns]
change: [-0.1780% -0.0887% -0.0120%] (p = 0.03 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
1 (1.00%) low severe
5 (5.00%) low mild
3 (3.00%) high mild
2 (2.00%) high severe
Directive/Regex/# noqa: F401, F841
time: [321.83 ns 322.39 ns 322.93 ns]
change: [+8602.4% +8623.5% +8644.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
1 (1.00%) low severe
2 (2.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# noqa: F401, F841
time: [78.618 ns 78.758 ns 78.896 ns]
change: [+1.6909% +1.8771% +2.0628%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
3 (3.00%) high mild
Directive/AhoCorasick/# noqa: F401, F841
time: [87.739 ns 88.057 ns 88.468 ns]
change: [+0.1843% +0.4685% +0.7854%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 11 outliers among 100 measurements (11.00%)
5 (5.00%) low mild
3 (3.00%) high mild
3 (3.00%) high severe
Directive/Memchr/# noqa: F401, F841
time: [80.674 ns 80.774 ns 80.860 ns]
change: [-0.7343% -0.5633% -0.4031%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) low severe
9 (9.00%) low mild
1 (1.00%) high mild
Directive/Regex/# noqa time: [194.86 ns 195.93 ns 196.97 ns]
change: [+11973% +12039% +12103%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) low mild
1 (1.00%) high mild
Directive/Find/# noqa time: [25.327 ns 25.354 ns 25.383 ns]
change: [+3.8524% +4.0267% +4.1845%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
6 (6.00%) high mild
3 (3.00%) high severe
Directive/AhoCorasick/# noqa
time: [34.267 ns 34.368 ns 34.481 ns]
change: [+0.5646% +0.8505% +1.1281%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
5 (5.00%) high mild
Directive/Memchr/# noqa time: [21.770 ns 21.818 ns 21.874 ns]
change: [-0.0990% +0.1464% +0.4046%] (p = 0.26 > 0.05)
No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
4 (4.00%) low mild
4 (4.00%) high mild
2 (2.00%) high severe
Directive/Regex/# type: ignore # noqa: E501
time: [278.76 ns 279.69 ns 280.72 ns]
change: [+7449.4% +7469.8% +7490.5%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) low mild
1 (1.00%) high mild
1 (1.00%) high severe
Directive/Find/# type: ignore # noqa: E501
time: [67.791 ns 67.976 ns 68.184 ns]
change: [+2.8321% +3.1735% +3.5418%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# type: ignore # noqa: E501
time: [75.908 ns 76.055 ns 76.210 ns]
change: [+0.9269% +1.1427% +1.3955%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high severe
Directive/Memchr/# type: ignore # noqa: E501
time: [72.549 ns 72.723 ns 72.957 ns]
change: [+1.5881% +1.9660% +2.3974%] (p = 0.00 < 0.05)
Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
10 (10.00%) high mild
5 (5.00%) high severe
Directive/Regex/# type: ignore # nosec
time: [66.967 ns 67.075 ns 67.207 ns]
change: [+1713.0% +1715.8% +1718.9%] (p = 0.00 < 0.05)
Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low severe
3 (3.00%) low mild
2 (2.00%) high mild
4 (4.00%) high severe
Directive/Find/# type: ignore # nosec
time: [18.505 ns 18.548 ns 18.597 ns]
change: [+1.3520% +1.6976% +2.0333%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild
Directive/AhoCorasick/# type: ignore # nosec
time: [16.162 ns 16.206 ns 16.252 ns]
change: [+1.2919% +1.5587% +1.8430%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
3 (3.00%) high mild
1 (1.00%) high severe
Directive/Memchr/# type: ignore # nosec
time: [39.192 ns 39.233 ns 39.276 ns]
change: [+0.5164% +0.7456% +0.9790%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 13 outliers among 100 measurements (13.00%)
2 (2.00%) low severe
4 (4.00%) low mild
3 (3.00%) high mild
4 (4.00%) high severe
Directive/Regex/# some very long comment that # is interspersed with characters but # no directive
time: [81.460 ns 81.578 ns 81.703 ns]
change: [+2093.3% +2098.8% +2104.2%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) low mild
2 (2.00%) high mild
Directive/Find/# some very long comment that # is interspersed with characters but # no directive
time: [26.284 ns 26.331 ns 26.387 ns]
change: [+0.7554% +1.1027% +1.3832%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
5 (5.00%) high mild
1 (1.00%) high severe
Directive/AhoCorasick/# some very long comment that # is interspersed with characters but # no direc...
time: [28.643 ns 28.714 ns 28.787 ns]
change: [+1.3774% +1.6780% +2.0028%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild
Directive/Memchr/# some very long comment that # is interspersed with characters but # no directive
time: [55.766 ns 55.831 ns 55.897 ns]
change: [+1.5802% +1.7476% +1.9021%] (p = 0.00 < 0.05)
Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) low mild
```
While memchr is faster than aho-corasick in some of the common cases
(like `# noqa: F401`), the latter is way, way faster when there _isn't_
a match (like 2x faster -- see the last two cases). Since most comments
_aren't_ `noqa` comments, this felt like the right tradeoff. Note that
all implementations are significantly faster than the regex version.
(I know I originally reported a 10x speedup, but I ended up improving
the regex version a bit in some prior PRs, so it got unintentionally
faster via some refactors.)
There's also one behavior change in here, which is that we now allow
variable spaces, e.g., `#noqa` or `# noqa`. Previously, we required
exactly one space. This thus closes #5177.
|
||
|---|---|---|
| .cargo | ||
| .devcontainer | ||
| .github | ||
| assets | ||
| crates | ||
| docs | ||
| fuzz | ||
| playground | ||
| python/ruff | ||
| scripts | ||
| .editorconfig | ||
| .gitattributes | ||
| .gitignore | ||
| .markdownlint.yaml | ||
| .pre-commit-config.yaml | ||
| _typos.toml | ||
| BREAKING_CHANGES.md | ||
| Cargo.lock | ||
| Cargo.toml | ||
| clippy.toml | ||
| CODE_OF_CONDUCT.md | ||
| CONTRIBUTING.md | ||
| foo.py | ||
| LICENSE | ||
| mkdocs.insiders.yml | ||
| mkdocs.template.yml | ||
| pyproject.toml | ||
| README.md | ||
| ruff.schema.json | ||
| rust-toolchain | ||
Ruff
Discord | Docs | Playground
An extremely fast Python linter, written in Rust.
Linting the CPython codebase from scratch.
- ⚡️ 10-100x faster than existing linters
- 🐍 Installable via
pip - 🛠️
pyproject.tomlsupport - 🤝 Python 3.11 compatibility
- 📦 Built-in caching, to avoid re-analyzing unchanged files
- 🔧 Autofix support, for automatic error correction (e.g., automatically remove unused imports)
- 📏 Over 500 built-in rules
- ⚖️ Near-parity with the built-in Flake8 rule set
- 🔌 Native re-implementations of dozens of Flake8 plugins, like flake8-bugbear
- ⌨️ First-party editor integrations for VS Code and more
- 🌎 Monorepo-friendly, with hierarchical and cascading configuration
Ruff aims to be orders of magnitude faster than alternative tools while integrating more functionality behind a single, common interface.
Ruff can be used to replace Flake8 (plus dozens of plugins), isort, pydocstyle, yesqa, eradicate, pyupgrade, and autoflake, all while executing tens or hundreds of times faster than any individual tool.
Ruff is extremely actively developed and used in major open-source projects like:
...and many more.
Ruff is backed by Astral. Read the launch post, or the original project announcement.
Testimonials
Sebastián Ramírez, creator of FastAPI:
Ruff is so fast that sometimes I add an intentional bug in the code just to confirm it's actually running and checking the code.
Nick Schrock, founder of Elementl, co-creator of GraphQL:
Why is Ruff a gamechanger? Primarily because it is nearly 1000x faster. Literally. Not a typo. On our largest module (dagster itself, 250k LOC) pylint takes about 2.5 minutes, parallelized across 4 cores on my M1. Running ruff against our entire codebase takes .4 seconds.
Bryan Van de Ven, co-creator of Bokeh, original author of Conda:
Ruff is ~150-200x faster than flake8 on my machine, scanning the whole repo takes ~0.2s instead of ~20s. This is an enormous quality of life improvement for local dev. It's fast enough that I added it as an actual commit hook, which is terrific.
Timothy Crosley, creator of isort:
Just switched my first project to Ruff. Only one downside so far: it's so fast I couldn't believe it was working till I intentionally introduced some errors.
Tim Abbott, lead developer of Zulip:
This is just ridiculously fast...
ruffis amazing.
Table of Contents
For more, see the documentation.
Getting Started
For more, see the documentation.
Installation
Ruff is available as ruff on PyPI:
pip install ruff
You can also install Ruff via Homebrew, Conda, and with a variety of other package managers.
Usage
To run Ruff, try any of the following:
ruff check . # Lint all files in the current directory (and any subdirectories)
ruff check path/to/code/ # Lint all files in `/path/to/code` (and any subdirectories)
ruff check path/to/code/*.py # Lint all `.py` files in `/path/to/code`
ruff check path/to/code/to/file.py # Lint `file.py`
Ruff can also be used as a pre-commit hook:
- repo: https://github.com/astral-sh/ruff-pre-commit
# Ruff version.
rev: v0.0.277
hooks:
- id: ruff
Ruff can also be used as a VS Code extension or alongside any other editor through the Ruff LSP.
Ruff can also be used as a GitHub Action via
ruff-action:
name: Ruff
on: [ push, pull_request ]
jobs:
ruff:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: chartboost/ruff-action@v1
Configuration
Ruff can be configured through a pyproject.toml, ruff.toml, or .ruff.toml file (see:
Configuration, or Settings
for a complete list of all configuration options).
If left unspecified, the default configuration is equivalent to:
[tool.ruff]
# Enable pycodestyle (`E`) and Pyflakes (`F`) codes by default.
select = ["E", "F"]
ignore = []
# Allow autofix for all enabled rules (when `--fix`) is provided.
fixable = ["A", "B", "C", "D", "E", "F", "G", "I", "N", "Q", "S", "T", "W", "ANN", "ARG", "BLE", "COM", "DJ", "DTZ", "EM", "ERA", "EXE", "FBT", "ICN", "INP", "ISC", "NPY", "PD", "PGH", "PIE", "PL", "PT", "PTH", "PYI", "RET", "RSE", "RUF", "SIM", "SLF", "TCH", "TID", "TRY", "UP", "YTT"]
unfixable = []
# Exclude a variety of commonly ignored directories.
exclude = [
".bzr",
".direnv",
".eggs",
".git",
".git-rewrite",
".hg",
".mypy_cache",
".nox",
".pants.d",
".pytype",
".ruff_cache",
".svn",
".tox",
".venv",
"__pypackages__",
"_build",
"buck-out",
"build",
"dist",
"node_modules",
"venv",
]
# Same as Black.
line-length = 88
# Allow unused variables when underscore-prefixed.
dummy-variable-rgx = "^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$"
# Assume Python 3.10.
target-version = "py310"
[tool.ruff.mccabe]
# Unlike Flake8, default to a complexity level of 10.
max-complexity = 10
Some configuration options can be provided via the command-line, such as those related to rule enablement and disablement, file discovery, logging level, and more:
ruff check path/to/code/ --select F401 --select F403 --quiet
See ruff help for more on Ruff's top-level commands, or ruff help check for more on the
linting command.
Rules
Ruff supports over 500 lint rules, many of which are inspired by popular tools like Flake8, isort, pyupgrade, and others. Regardless of the rule's origin, Ruff re-implements every rule in Rust as a first-party feature.
By default, Ruff enables Flake8's E and F rules. Ruff supports all rules from the F category,
and a subset of the E category, omitting those
stylistic rules made obsolete by the use of an autoformatter, like
Black.
If you're just getting started with Ruff, the default rule set is a great place to start: it catches a wide variety of common errors (like unused imports) with zero configuration.
Beyond the defaults, Ruff re-implements some of the most popular Flake8 plugins and related code quality tools, including:
- autoflake
- eradicate
- flake8-2020
- flake8-annotations
- flake8-async
- flake8-bandit (#1646)
- flake8-blind-except
- flake8-boolean-trap
- flake8-bugbear
- flake8-builtins
- flake8-commas
- flake8-comprehensions
- flake8-copyright
- flake8-datetimez
- flake8-debugger
- flake8-django
- flake8-docstrings
- flake8-eradicate
- flake8-errmsg
- flake8-executable
- flake8-future-annotations
- flake8-gettext
- flake8-implicit-str-concat
- flake8-import-conventions
- flake8-logging-format
- flake8-no-pep420
- flake8-pie
- flake8-print
- flake8-pyi
- flake8-pytest-style
- flake8-quotes
- flake8-raise
- flake8-return
- flake8-self
- flake8-simplify
- flake8-slots
- flake8-super
- flake8-tidy-imports
- flake8-todos
- flake8-type-checking
- flake8-use-pathlib
- flynt (#2102)
- isort
- mccabe
- pandas-vet
- pep8-naming
- pydocstyle
- pygrep-hooks
- pylint-airflow
- pyupgrade
- tryceratops
- yesqa
For a complete enumeration of the supported rules, see Rules.
Contributing
Contributions are welcome and highly appreciated. To get started, check out the contributing guidelines.
You can also join us on Discord.
Support
Having trouble? Check out the existing issues on GitHub, or feel free to open a new one.
You can also ask for help on Discord.
Acknowledgements
Ruff's linter draws on both the APIs and implementation details of many other tools in the Python ecosystem, especially Flake8, Pyflakes, pycodestyle, pydocstyle, pyupgrade, and isort.
In some cases, Ruff includes a "direct" Rust port of the corresponding tool. We're grateful to the maintainers of these tools for their work, and for all the value they've provided to the Python community.
Ruff's autoformatter is built on a fork of Rome's rome_formatter,
and again draws on both API and implementation details from Rome,
Prettier, and Black.
Ruff's import resolver is based on the import resolution algorithm from Pyright.
Ruff is also influenced by a number of tools outside the Python ecosystem, like Clippy and ESLint.
Ruff is the beneficiary of a large number of contributors.
Ruff is released under the MIT license.
Who's Using Ruff?
Ruff is used by a number of major open-source projects and companies, including:
- Amazon (AWS SAM)
- Anthropic (Python SDK)
- Apache Airflow
- AstraZeneca (Magnus)
- Benchling (Refac)
- Babel
- Bokeh
- Cryptography (PyCA)
- DVC
- Dagger
- Dagster
- Databricks (MLflow)
- FastAPI
- Gradio
- Great Expectations
- HTTPX
- Hugging Face (Transformers, Datasets, Diffusers)
- Hatch
- Home Assistant
- ING Bank (popmon, probatus)
- Ibis
- Jupyter
- LangChain
- LlamaIndex
- Matrix (Synapse)
- MegaLinter
- Meltano (Meltano CLI, Singer SDK)
- Microsoft (Semantic Kernel, ONNX Runtime, LightGBM)
- Modern Treasury (Python SDK)
- Mozilla (Firefox)
- Mypy
- Netflix (Dispatch)
- Neon
- ONNX
- OpenBB
- PDM
- PaddlePaddle
- Pandas
- Poetry
- Polars
- PostHog
- Prefect (Python SDK, Marvin)
- PyInstaller
- PyTorch
- Pydantic
- Pylint
- Pynecone
- Robyn
- Scale AI (Launch SDK)
- Snowflake (SnowCLI)
- Saleor
- SciPy
- Sphinx
- Stable Baselines3
- Litestar
- The Algorithms
- Vega-Altair
- WordPress (Openverse)
- ZenML
- Zulip
- build (PyPA)
- cibuildwheel (PyPA)
- delta-rs
- featuretools
- meson-python
- nox
- pip
Show Your Support
If you're using Ruff, consider adding the Ruff badge to project's README.md:
[](https://github.com/astral-sh/ruff)
...or README.rst:
.. image:: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json
:target: https://github.com/astral-sh/ruff
:alt: Ruff
...or, as HTML:
<a href="https://github.com/astral-sh/ruff"><img src="https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json" alt="Ruff" style="max-width:100%;"></a>
License
MIT