Commit graph

476 commits

Author SHA1 Message Date
konsti
3ccd1d580d
Use crates.io unicode_names2 0.6.0 (#6478)
Update `unicode_names2` to the crates.io release 0.6.0, removing a git
dependency.
2023-10-02 18:17:38 -04:00
Charlie Marsh
75f759ed55
Upgrade LibCST to support Python 3.12 (#7764)
## Summary

We'll revert back to the crates.io release once it's up-to-date, but
better to get this out now that Python 3.12 is released.

## Test Plan

`cargo test`
2023-10-02 12:14:35 -04:00
dependabot[bot]
0df27375ba
Bump memchr from 2.6.3 to 2.6.4 (#7758) 2023-10-02 09:50:22 -04:00
dependabot[bot]
c82d0503a8
Bump thiserror from 1.0.48 to 1.0.49 (#7757) 2023-10-02 09:50:06 -04:00
dependabot[bot]
9ba5bc26f6
Bump insta from 1.32.0 to 1.33.0 (#7754) 2023-10-02 09:49:43 -04:00
dependabot[bot]
2cb5e43dd7
Bump clap from 4.4.4 to 4.4.5 (#7666)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-26 13:02:16 +02:00
Charlie Marsh
93b5d8a0fb
Implement our own small-integer optimization (#7584)
## Summary

This is a follow-up to #7469 that attempts to achieve similar gains, but
without introducing malachite. Instead, this PR removes the `BigInt`
type altogether, instead opting for a simple enum that allows us to
store small integers directly and only allocate for values greater than
`i64`:

```rust
/// A Python integer literal. Represents both small (fits in an `i64`) and large integers.
#[derive(Clone, PartialEq, Eq, Hash)]
pub struct Int(Number);

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum Number {
    /// A "small" number that can be represented as an `i64`.
    Small(i64),
    /// A "large" number that cannot be represented as an `i64`.
    Big(Box<str>),
}

impl std::fmt::Display for Number {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Number::Small(value) => write!(f, "{value}"),
            Number::Big(value) => write!(f, "{value}"),
        }
    }
}
```

We typically don't care about numbers greater than `isize` -- our only
uses are comparisons against small constants (like `1`, `2`, `3`, etc.),
so there's no real loss of information, except in one or two rules where
we're now a little more conservative (with the worst-case being that we
don't flag, e.g., an `itertools.pairwise` that uses an extremely large
value for the slice start constant). For simplicity, a few diagnostics
now show a dedicated message when they see integers that are out of the
supported range (e.g., `outdated-version-block`).

An additional benefit here is that we get to remove a few dependencies,
especially `num-bigint`.

## Test Plan

`cargo test`
2023-09-25 15:13:21 +00:00
dependabot[bot]
ab643017f9
Bump smallvec from 1.11.0 to 1.11.1 (#7563)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-21 11:53:27 +02:00
dependabot[bot]
6eb9a9a633
Bump insta from 1.31.0 to 1.32.0 (#7565)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-21 11:53:14 +02:00
dependabot[bot]
c43542896f
Bump unicode-width from 0.1.10 to 0.1.11 (#7534)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-20 10:26:54 +02:00
konsti
ef34c5cbec
Update itertools to 0.11 (#7513)
Preparation for #7469.

Changelog:
https://github.com/rust-itertools/itertools/blob/master/CHANGELOG.md#0110
2023-09-19 09:53:14 +00:00
dependabot[bot]
fdbefd777c
Bump syn from 2.0.33 to 2.0.37 (#7512)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-19 11:08:56 +02:00
dependabot[bot]
078547adbb
Bump clap from 4.4.3 to 4.4.4 (#7511)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-19 11:08:39 +02:00
dependabot[bot]
42a0bec146
Bump schemars from 0.8.13 to 0.8.15 (#7510)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-19 11:08:22 +02:00
dependabot[bot]
a902d14c31
Bump chrono from 0.4.30 to 0.4.31 (#7481) 2023-09-18 13:11:51 -04:00
dependabot[bot]
70ea49bf72
Bump test-case from 3.1.0 to 3.2.1 (#7484)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-18 11:31:41 +02:00
dependabot[bot]
8e255974bc
Bump unicode-ident from 1.0.11 to 1.0.12 (#7482)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-18 11:31:06 +02:00
dependabot[bot]
d358604464
Bump serde_json from 1.0.106 to 1.0.107 (#7480)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-18 11:30:06 +02:00
Charlie Marsh
9b43162cc4
Move documentation to docs.astral.sh/ruff (#7419)
## Summary

We're planning to move the documentation from
[https://beta.ruff.rs/docs](https://beta.ruff.rs/docs) to
[https://docs.astral.sh/ruff](https://docs.astral.sh/ruff), for a few
reasons:

1. We want to remove the `beta` from the domain, as Ruff is no longer
considered beta software.
2. We want to migrate to a structure that could accommodate multiple
future tools living under one domain.

The docs are actually already live at
[https://docs.astral.sh/ruff](https://docs.astral.sh/ruff), but later
today, I'll add a permanent redirect from the previous to the new
domain. **All existing links will continue to work, now and in
perpetuity.**

This PR contains the code changes necessary for the updated
documentation. As part of this effort, I moved the playground and
documentation from my personal Cloudflare account to our team Cloudflare
account (hence the new `--project-name` references). After merging, I'll
also update the secrets on this repo.
2023-09-15 22:49:42 -04:00
dependabot[bot]
f936d319cc
Bump toml from 0.7.6 to 0.7.8 (#7405)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-15 09:14:58 +00:00
dependabot[bot]
85d8b6228f
Bump chrono from 0.4.28 to 0.4.30 (#7406)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-15 09:00:17 +00:00
dependabot[bot]
7594dadc1d
Bump syn from 2.0.29 to 2.0.33 (#7404)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-15 08:49:52 +00:00
dependabot[bot]
2ddea7c657
Bump regex from 1.9.4 to 1.9.5 (#7384)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-14 14:58:24 +00:00
dependabot[bot]
1f8e2b8f14
Bump proc-macro2 from 1.0.66 to 1.0.67 (#7383)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-14 09:49:57 +00:00
dependabot[bot]
6a9b8aede1
Bump thiserror from 1.0.47 to 1.0.48 (#7385)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-14 09:21:47 +00:00
dependabot[bot]
f48126ad00
Bump clap from 4.4.1 to 4.4.3 (#7382)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-14 09:13:12 +00:00
Micha Reiser
4f26002dd5
chore: Upgrade strum (#7337)
## Summary

This PR upgrades `strum` from 0.24.x to 0.25.x. 

The breaking changes are: 
* strum macros now uses syn2
* The `to_string` behavior changed when using `default`. I did a quick search, we aren't using `strum(default)` 


`strum` now has a `#[derive(EnumIs)]` macro that generates `is_` methods. 

## Test Plan

cargo test
2023-09-13 18:33:27 +02:00
dependabot[bot]
d1a9c198e3
Bump path-absolutize from 3.1.0 to 3.1.1 (#7345)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-13 16:21:51 +00:00
dependabot[bot]
3fb5418c2c
Bump memchr from 2.6.2 to 2.6.3 (#7343)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-13 18:14:42 +02:00
dependabot[bot]
9fcc009a0c
Bump serde_json from 1.0.105 to 1.0.106 (#7342)
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-09-13 18:13:24 +02:00
Micha Reiser
7531bb3b21
Upgrade is-macros to 0.3.0 (#7334) 2023-09-13 15:27:40 +02:00
Micha Reiser
e376c3ff7e
Split implicit concatenated strings before binary expressions (#7145) 2023-09-08 06:51:26 +00:00
Micha Reiser
f1a4eb9c28
Use the unicode-ident crate (#7212) 2023-09-07 08:19:25 +00:00
Victor Hugo Gomes
041cdb95e0
Update identifier Unicode character validation to match Python spec (#7209)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-09-07 07:08:42 +00:00
Manuel Martinez
2e58ad437e
build: add libcst from crates (#7179)
Co-authored-by: Micha Reiser <micha@reiser.io>
2023-09-06 08:24:28 +00:00
Dhruv Manilawala
4eaa412370
Update LibCST (#7062)
## Summary

This PR updates the revision of `LibCST` dependency to 9c263aa897
inorder to fix https://github.com/astral-sh/ruff/issues/4899

## Test Plan

The test case including the carriage return (`\r`) character was added for
`F504` and then `cargo test`.

fixes: #4899
2023-09-03 09:11:24 +05:30
konsti
0b6dab5e3f
Add jupyter notebook cell ids in 4.5+ if missing (#6853)
**Summary** See
https://github.com/astral-sh/ruff/issues/6834#issuecomment-1691202417

**Test Plan** Added a new notebook
2023-08-25 08:34:42 +00:00
Micha Reiser
232b44a8ca
Indent statements in suppressed ranges (#6507) 2023-08-15 08:00:35 +02:00
Zanie Blue
9ae498595c
Upgrade Rust to 1.71 (#6323)
Addresses
[CVE-2023-38497](https://blog.rust-lang.org/2023/08/03/cve-2023-38497.html)

See also the [version release
post](https://blog.rust-lang.org/2023/08/03/Rust-1.71.1.html)
2023-08-03 21:08:39 -04:00
konsti
0fddb31235
Use tracing for format_dev (#6177)
## Summary

[tracing](https://github.com/tokio-rs/tracing) is library for logging,
tracing and related features that has a large ecosystem. Using
[tracing-subscriber](https://docs.rs/tracing-subscriber) and
[tracing-indicatif](https://github.com/emersonford/tracing-indicatif),
we get a nice logging output that you can configure with `RUST_LOG`
(e.g. `RUST_LOG=debug`) and a live look into the formatter progress.

Default:
![Screenshot from 2023-07-30
13-59-53](6432f835-9ff1-4771-955b-398e54c406dc)

`RUST_LOG=debug`:
![Screenshot from 2023-07-30
14-01-32](5f2c87da-0867-4159-82e7-b5757eebb8eb)

It's easy to see in this output which files take a disproportionate
amount of time.

[Peek 2023-07-30
14-35.webm](2c92db5c-1354-465b-a6bc-ddfb281d6f9d)

It opens up further integration with the tracing ecosystem,
[tracing-timing](https://docs.rs/tracing-timing/latest/tracing_timing/)
and [tokio-console](https://github.com/tokio-rs/console) can e.g. show
histograms and the json output allows us building better pipelines than
grepping a log file.

One caveat is using `parent: None` for the logging statements because
tracing subscriber does not allow deactivating the span without
reimplementing all the other log message formatting, too, and we don't
need span information, esp. since it would currently show the progress
bar span.

## Test Plan

n/a
2023-07-31 19:14:01 +00:00
Micha Reiser
40f54375cb
Pull in RustPython parser (#6099) 2023-07-27 09:29:11 +00:00
Micha Reiser
16e1737d1b
Use cursor based lexer (#6012) 2023-07-26 11:32:26 +02:00
Dhruv Manilawala
025fa4eba8
Integrate the new Jupyter AST nodes in Ruff (#6086)
## Summary

This PR adds the implementation for the new Jupyter AST nodes i.e.,
`ExprLineMagic` and `StmtLineMagic`.

## Test Plan

Add test cases for `unparse` containing magic commands

resolves: #6087
2023-07-26 08:20:30 +00:00
Charlie Marsh
ed72c027a3
Replace NoHashHasher usages with FxHashMap (#6049)
## Summary

I had always assumed that `NoHashHasher` would be faster when using
integer keys, but benchmarking shows otherwise:

```
linter/default-rules/numpy/globals.py
                        time:   [66.544 µs 66.606 µs 66.678 µs]
                        thrpt:  [44.253 MiB/s 44.300 MiB/s 44.342 MiB/s]
                 change:
                        time:   [-0.1843% +0.1087% +0.3718%] (p = 0.46 > 0.05)
                        thrpt:  [-0.3704% -0.1086% +0.1847%]
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
linter/default-rules/pydantic/types.py
                        time:   [1.3787 ms 1.3811 ms 1.3837 ms]
                        thrpt:  [18.431 MiB/s 18.466 MiB/s 18.498 MiB/s]
                 change:
                        time:   [-0.4827% -0.1074% +0.1927%] (p = 0.56 > 0.05)
                        thrpt:  [-0.1924% +0.1075% +0.4850%]
                        No change in performance detected.
linter/default-rules/numpy/ctypeslib.py
                        time:   [624.82 µs 625.96 µs 627.17 µs]
                        thrpt:  [26.550 MiB/s 26.601 MiB/s 26.650 MiB/s]
                 change:
                        time:   [-0.7071% -0.4908% -0.2736%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2744% +0.4932% +0.7122%]
                        Change within noise threshold.
linter/default-rules/large/dataset.py
                        time:   [3.1585 ms 3.1634 ms 3.1685 ms]
                        thrpt:  [12.840 MiB/s 12.861 MiB/s 12.880 MiB/s]
                 change:
                        time:   [-1.5338% -1.3463% -1.1476%] (p = 0.00 < 0.05)
                        thrpt:  [+1.1610% +1.3647% +1.5577%]
                        Performance has improved.

linter/all-rules/numpy/globals.py
                        time:   [140.17 µs 140.37 µs 140.58 µs]
                        thrpt:  [20.989 MiB/s 21.020 MiB/s 21.051 MiB/s]
                 change:
                        time:   [-0.1066% +0.3140% +0.7479%] (p = 0.14 > 0.05)
                        thrpt:  [-0.7423% -0.3130% +0.1067%]
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
linter/all-rules/pydantic/types.py
                        time:   [2.7030 ms 2.7069 ms 2.7112 ms]
                        thrpt:  [9.4064 MiB/s 9.4216 MiB/s 9.4351 MiB/s]
                 change:
                        time:   [-0.6721% -0.4874% -0.2974%] (p = 0.00 < 0.05)
                        thrpt:  [+0.2982% +0.4898% +0.6766%]
                        Change within noise threshold.
Found 14 outliers among 100 measurements (14.00%)
  12 (12.00%) high mild
  2 (2.00%) high severe
linter/all-rules/numpy/ctypeslib.py
                        time:   [1.4709 ms 1.4727 ms 1.4749 ms]
                        thrpt:  [11.290 MiB/s 11.306 MiB/s 11.320 MiB/s]
                 change:
                        time:   [-1.1617% -0.9766% -0.8094%] (p = 0.00 < 0.05)
                        thrpt:  [+0.8160% +0.9862% +1.1754%]
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe
linter/all-rules/large/dataset.py
                        time:   [5.8086 ms 5.8163 ms 5.8240 ms]
                        thrpt:  [6.9854 MiB/s 6.9946 MiB/s 7.0038 MiB/s]
                 change:
                        time:   [-1.5651% -1.3536% -1.1584%] (p = 0.00 < 0.05)
                        thrpt:  [+1.1720% +1.3721% +1.5900%]
                        Performance has improved.
```

My guess is that `NoHashHasher` underperforms because the keys are not
randomly distributed...

Anyway, it's a ~1% (significant) performance gain on some of the above,
plus we get to remove a dependency.
2023-07-24 23:41:57 +00:00
konsti
196cc9b655
Fix RustPython rev to main branch (#5950)
**Summary** I accidentally merged earlier while the RustPython parser
rev was still pointing to the feature branch instead of to the merged
main. This make the rev point to the RustPython parser repo main again
2023-07-21 15:53:14 +00:00
konsti
972f9a9c15
Fix formatting lambda with empty arguments (#5944)
**Summary** Fix implemented in
https://github.com/astral-sh/RustPython-Parser/pull/35: Previously,
empty lambda arguments (e.g. `lambda: 1`) would get the range of the
entire expression, which leads to incorrect comment placement. Now empty
lambda arguments get an empty range between the `lambda` and the `:`
tokens.

**Test Plan** Added a regression test.

149 instances of unstable formatting remaining.

```
$ cargo run --bin ruff_dev --release -- format-dev --stability-check --error-file formatter-ecosystem-errors.txt --multi-project target/checkouts > formatter-ecosystem-progress.txt
$ rg "Unstable formatting" target/formatter-ecosystem-errors.txt | wc -l
149
```
2023-07-21 15:48:45 +02:00
konsti
92f471a666
Handle io errors gracefully (#5611)
## Summary

It can happen that we can't read a file (a python file, a jupyter
notebook or pyproject.toml), which needs to be handled and handled
consistently for all file types. Instead of using `Err` or `error!`, we
emit E602 with the io error as message and continue. This PR makes sure
we handle all three cases consistently, emit E602.

I'm not convinced that it should be possible to disable io errors, but
we now handle the regular case consistently and at least print warning
consistently.

I went with `warn!` but i can change them all to `error!`, too.

It also checks the error case when a pyproject.toml is not readable. The
error message is not very helpful, but it's now a bit clearer that
actually ruff itself failed instead vs this being a diagnostic.

## Examples

This is how an Err of `run` looks now:


![image](890f7ab2-2309-4b6f-a4b3-67161947cc83)

With an unreadable file and `IOError` disabled:


![image](fd3d6959-fa23-4ddf-b2e5-8d6022df54b1)

(we lint zero files but count files before linting not during so we exit
0)

I'm not sure if it should (or if we should take a different path with
manual ExitStatus), but this currently also triggers when `files` is
empty:


![image](f7ede301-41b5-4743-97fd-49149f750337)

## Test Plan

Unix only: Create a temporary directory with files with permissions
`000` (not readable by the owner) and run on that directory. Since this
breaks the assumptions of most of the test code (single file, `ruff`
instead of `ruff_cli`), the test code is rather cumbersome and looks a
bit misplaced; i'm happy about suggestions to fit it in closer with the
other tests or streamline it in other ways. I added another test for
when the entire directory is not readable.
2023-07-20 11:30:14 +02:00
Micha Reiser
cda90d071c
Upgrade cargo insta (#5872) 2023-07-19 12:56:32 +02:00
Zanie Blue
f47443014e
Remove tag information from RustPython-Parser dependency (#5861)
Following https://github.com/astral-sh/RustPython-Parser/pull/27 we now
cherry-pick commits onto our fork instead of rebasing our fork on top of
the upstream which means we do not overwrite history and a tag is not
necessary to preserve the pinned commit.

In the future, we may rewrite the history in our fork. If we do, we
should return to tagging the commits.
2023-07-18 10:48:51 -05:00
konsti
730e6b2b4c
Refactor StmtIf: Formatter and Linter (#5459)
## Summary

Previously, `StmtIf` was defined recursively as
```rust
pub struct StmtIf {
    pub range: TextRange,
    pub test: Box<Expr>,
    pub body: Vec<Stmt>,
    pub orelse: Vec<Stmt>,
}
```
Every `elif` was represented as an `orelse` with a single `StmtIf`. This
means that this representation couldn't differentiate between
```python
if cond1:
    x = 1
else:
    if cond2:
        x = 2
```
and 
```python
if cond1:
    x = 1
elif cond2:
    x = 2
```
It also makes many checks harder than they need to be because we have to
recurse just to iterate over an entire if-elif-else and because we're
lacking nodes and ranges on the `elif` and `else` branches.

We change the representation to a flat

```rust
pub struct StmtIf {
    pub range: TextRange,
    pub test: Box<Expr>,
    pub body: Vec<Stmt>,
    pub elif_else_clauses: Vec<ElifElseClause>,
}

pub struct ElifElseClause {
    pub range: TextRange,
    pub test: Option<Expr>,
    pub body: Vec<Stmt>,
}
```
where `test: Some(_)` represents an `elif` and `test: None` an else.

This representation is different tradeoff, e.g. we need to allocate the
`Vec<ElifElseClause>`, the `elif`s are now different than the `if`s
(which matters in rules where want to check both `if`s and `elif`s) and
the type system doesn't guarantee that the `test: None` else is actually
last. We're also now a bit more inconsistent since all other `else`,
those from `for`, `while` and `try`, still don't have nodes. With the
new representation some things became easier, e.g. finding the `elif`
token (we can use the start of the `ElifElseClause`) and formatting
comments for if-elif-else (no more dangling comments splitting, we only
have to insert the dangling comment after the colon manually and set
`leading_alternate_branch_comments`, everything else is taken of by
having nodes for each branch and the usual placement.rs fixups).

## Merge Plan

This PR requires coordination between the parser repo and the main ruff
repo. I've split the ruff part, into two stacked PRs which have to be
merged together (only the second one fixes all tests), the first for the
formatter to be reviewed by @michareiser and the second for the linter
to be reviewed by @charliermarsh.

* MH: Review and merge
https://github.com/astral-sh/RustPython-Parser/pull/20
* MH: Review and merge or move later in stack
https://github.com/astral-sh/RustPython-Parser/pull/21
* MH: Review and approve
https://github.com/astral-sh/RustPython-Parser/pull/22
* MH: Review and approve formatter PR
https://github.com/astral-sh/ruff/pull/5459
* CM: Review and approve linter PR
https://github.com/astral-sh/ruff/pull/5460
* Merge linter PR in formatter PR, fix ecosystem checks (ecosystem
checks can't run on the formatter PR and won't run on the linter PR, so
we need to merge them first)
 * Merge https://github.com/astral-sh/RustPython-Parser/pull/22
 * Create tag in the parser, update linter+formatter PR
 * Merge linter+formatter PR https://github.com/astral-sh/ruff/pull/5459

---------

Co-authored-by: Micha Reiser <micha@reiser.io>
2023-07-18 13:40:15 +02:00