## Summary
This character counts as whitespace per `is_python_whitespace`, yet right now
it tends to lead to panics in the formatter. It seems reasonable to treat it
as whitespace in the `SimpleTokenizer` too.
Closes https://github.com/astral-sh/ruff/issues/7624.
## Summary
Given:
```python
if True:
    if True:
        pass
    else:
        pass
        # a
        # b
        # c

else:
    pass
```
We want to preserve the newline after the `# c` (before the `else`).
However, the `last_node` ends at the `pass`, and the comments are trailing
comments on the `pass`, not on the `last_node` (the inner `if`). As such,
when counting the trailing newlines on the outer `if`, we abort as soon as
we see the first comment (`# a`).
This PR changes the logic to skip _all_ comments (even those with
newlines between them). This is safe as we know that there are no
"leading" comments on the `else`, so there's no risk of skipping those
accidentally.
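A minimal sketch of the new counting logic (my own illustration in Python, not
the formatter's actual Rust code):
```python
# Hypothetical model: count the newlines that follow the last node,
# skipping over *all* trailing comments instead of aborting at the first.
def trailing_newlines(trivia):
    """`trivia` is the token stream after the last node, as ("comment", text)
    and ("newline",) tuples; returns the newline count after the last comment.
    """
    count = 0
    for token in trivia:
        if token[0] == "comment":
            count = 0  # the old logic aborted here; now we reset and continue
        else:  # ("newline",)
            count += 1
    return count

# For `pass`, then `# a` / `# b` / `# c`, then a blank line before `else`,
# the two newlines after `# c` are what preserve the blank line:
trivia = [
    ("newline",),
    ("comment", "# a"), ("newline",),
    ("comment", "# b"), ("newline",),
    ("comment", "# c"), ("newline",), ("newline",),
]
assert trailing_newlines(trivia) == 2
```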
Closes https://github.com/astral-sh/ruff/issues/7602.
## Test Plan
No change in compatibility.
Before:
| project | similarity index | total files | changed files |
|--------------|------------------:|------------------:|------------------:|
| cpython | 0.76083 | 1789 | 1631 |
| django | 0.99983 | 2760 | 36 |
| transformers | 0.99963 | 2587 | 319 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99979 | 3496 | 22 |
| warehouse | 0.99967 | 648 | 15 |
| zulip | 0.99972 | 1437 | 21 |
After:
| project | similarity index | total files | changed files |
|--------------|------------------:|------------------:|------------------:|
| cpython | 0.76083 | 1789 | 1631 |
| django | 0.99983 | 2760 | 36 |
| transformers | 0.99963 | 2587 | 319 |
| twine | 1.00000 | 33 | 0 |
| typeshed | 0.99983 | 3496 | 18 |
| warehouse | 0.99967 | 648 | 15 |
| zulip | 0.99972 | 1437 | 21 |
**Summary** Instead of emitting a bogus token per character, we now emit only
a single trailing bogus token. This leads to much more concise output.
**Test Plan** Updated fixtures.
## Summary
The tokenizer was split into a forward and a backwards tokenizer. The
backwards tokenizer uses the same method names as the forward one (e.g.
`next_token`), and it takes the comment ranges we've already built so that it
can skip comments.
---------
Co-authored-by: Micha Reiser <micha@reiser.io>
## Summary
The motivation here is that this enables us to implement `Ranged` in
crates that don't depend on `ruff_python_ast`.
Largely a mechanical refactor with a lot of regex, Clippy help, and
manual fixups.
## Test Plan
`cargo test`
## Summary
This PR modifies our formatting of comments around the `.` in an
attribute. Specifically, the goal here is to avoid _reordering_
comments, and the net effect is that we generally leave comments
where they are when dealing with comments around the dot (which
you can also think of as comments between attributes).
All comments around the dot are now treated as dangling and formatted
manually, with the exception of end-of-line or parenthesized comments on
the value, like those marked as trailing here, which remain trailing:
```python
(
    (
        a  # trailing end-of-line
        # trailing own-line
    )  # dangling before dot end-of-line
    .b  # trailing end-of-line
)
```
Closes https://github.com/astral-sh/ruff/issues/6823.
## Test Plan
`cargo test`
Before:
| project | similarity index |
|--------------|------------------|
| cpython | 0.76050 |
| django | 0.99820 |
| transformers | 0.99800 |
| twine | 0.99876 |
| typeshed | 0.99953 |
| warehouse | 0.99615 |
| zulip | 0.99729 |
After:
| project | similarity index |
|--------------|------------------|
| cpython | 0.76050 |
| django | 0.99820 |
| transformers | 0.99800 |
| twine | 0.99876 |
| typeshed | 0.99953 |
| warehouse | 0.99615 |
| zulip | 0.99729 |
## Summary
Allows for proper lexing of tokens like `->`.
The main challenge is to ensure that our forward and backwards
representations are the same for cases like `===`. Specifically, we want
that to lex as `==` followed by `=` regardless of whether it's a
forwards or backwards lex. To do so, we identify the range of the
sequential characters (the full span of `===`), lex it forwards, then
return the last token.
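A minimal sketch of that idea (an illustrative Python model, not the actual
Rust implementation):
```python
# Hypothetical model: to lex backwards over a run of operator characters
# consistently with the forward lexer, find the full run, lex it
# forwards, and return the last token.
OPERATOR_CHARS = set("=<>!+-*/%&|^~")
TWO_CHAR_TOKENS = {"==", "!=", "<=", ">=", "->"}

def lex_forward(run):
    """Greedily lex a run of operator characters, longest match first."""
    tokens, i = [], 0
    while i < len(run):
        if run[i : i + 2] in TWO_CHAR_TOKENS:
            tokens.append(run[i : i + 2])
            i += 2
        else:
            tokens.append(run[i])
            i += 1
    return tokens

def last_token_backwards(source, end):
    """The token a backwards lexer should emit when it reaches `end`."""
    start = end
    while start > 0 and source[start - 1] in OPERATOR_CHARS:
        start -= 1
    return lex_forward(source[start:end])[-1]

# `===` lexes forwards as `==` followed by `=` ...
assert lex_forward("===") == ["==", "="]
# ... and the backwards lexer agrees: the token ending at index 4 is `=`.
assert last_token_backwards("a===b", 4) == "="
```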
## Test Plan
`cargo test`
## Summary
This PR adds support for parenthesized comments. A parenthesized comment
is a comment that appears within a parenthesis, but not within the range
of the expression enclosed by the parenthesis. For example, the comment
here is a parenthesized comment:
```python
if (
    # comment
    True
):
    ...
The parentheses enclose the `True`, but the range of `True` doesn’t
include the `# comment`.
There are at least two problems associated with parenthesized comments:
(1) associating the comment with the correct (i.e., enclosed) node; and
(2) formatting the comment correctly, once it has been associated with
the enclosed node.
The solution proposed here for (1) is to search for parentheses between the
preceding and following nodes, and to use open and close parentheses to break
ties, rather than always assigning the comment to the preceding node.
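For example (an illustrative snippet of mine; the implementation's placement
rules are more involved), the parentheses tell us which node a comment between
two nodes belongs to:
```python
x = (
    # Sits between `x` (preceding) and `1` (following), but the open
    # parenthesis before it marks it as parenthesized: it attaches to
    # the enclosed `1` as a leading comment.
    1
    # Precedes the close parenthesis, so it attaches to the enclosed
    # `1` as a trailing comment rather than to whatever follows.
)
```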
For (2), we handle these special parenthesized comments in `FormatExpr`.
The biggest risk with this approach is that we forget some codepath that
force-disables parenthesization (by passing in `Parentheses::Never`).
I've audited all usages of that enum and added additional handling +
test coverage for such cases.
Closes https://github.com/astral-sh/ruff/issues/6390.
## Test Plan
`cargo test` with new cases.
Before:
| project | similarity index |
|--------------|------------------|
| build | 0.75623 |
| cpython | 0.75472 |
| django | 0.99804 |
| transformers | 0.99618 |
| typeshed | 0.74233 |
| warehouse | 0.99601 |
| zulip | 0.99727 |
After:
| project | similarity index |
|--------------|------------------|
| build | 0.75623 |
| cpython | 0.75472 |
| django | 0.99804 |
| transformers | 0.99618 |
| typeshed | 0.74237 |
| warehouse | 0.99601 |
| zulip | 0.99727 |
## Summary
For #6485, I need to be able to use the `SimpleTokenizer` to lex the
space between any two adjacent expressions (i.e., the space between a
preceding and following node). This requires that we support a wider
range of keywords (like `and`, to connect the pieces of `x and y`), and
some additional single-character tokens (like `-` and `>`, to support
`->`). Note that the `SimpleTokenizer` does not support multi-character
tokens, so the `->` in a function signature is lexed as a `-` followed
by a `>` -- but this is fine for our purposes.
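A rough model of that behavior (an illustrative Python sketch with made-up
token names, not the crate's actual API):
```python
# Hypothetical model of single-character lexing plus keyword support.
SINGLE_CHAR_TOKENS = {"-": "Minus", ">": "Greater", "(": "LParen", ")": "RParen"}
KEYWORDS = {"and", "or", "not", "in", "is", "if", "else"}

def simple_lex(text):
    tokens, i = [], 0
    while i < len(text):
        c = text[i]
        if c.isspace():
            i += 1  # skip whitespace between tokens
        elif c.isalpha() or c == "_":
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            word = text[i:j]
            tokens.append(("Keyword", word) if word in KEYWORDS else ("Name", word))
            i = j
        else:
            tokens.append((SINGLE_CHAR_TOKENS.get(c, "Other"), c))
            i += 1
    return tokens

# Keywords like `and` are recognized, so `x and y` lexes in full ...
assert simple_lex("x and y") == [("Name", "x"), ("Keyword", "and"), ("Name", "y")]
# ... and `->` comes out as `-` followed by `>` (no multi-character tokens).
assert [kind for kind, _ in simple_lex("->")] == ["Minus", "Greater"]
```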
## Summary
This PR protects against code like:
```python
from typing import Optional

import bar  # ruff: noqa
import baz


class Foo:
    x: Optional[str] = None
```
In which the user wrote `# ruff: noqa` to ignore a specific error, not
realizing that it was a file-level exemption that thus turned off all
lint rules.
Specifically, if a `# ruff: noqa` directive is not at the start of a
line, we now ignore it and warn, since this is almost certainly a
mistake.
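For reference (per ruff's documented `noqa` handling), the two forms look like
this:
```python
# A file-level exemption must start its own line:
# ruff: noqa

# Per-line suppression uses plain `# noqa`, optionally with rule codes:
import bar  # noqa: F401
```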
## Summary
This crate now contains utilities for dealing with trivia more broadly:
whitespace, newlines, "simple" trivia lexing, etc. So we're renaming it to
reflect its increased responsibilities.
To avoid conflicts, I've also renamed `Token` and `TokenKind` to
`SimpleToken` and `SimpleTokenKind`.