Summary
--
I've been noticing this failure in the formatter ecosystem check and decided
to look into it. We fail to parse the
[notebook](https://github.com/openai/openai-cookbook/blob/main/examples/mcp/databricks_mcp_cookbook.ipynb)
because some of the `code` cells have non-Python code in them. `ruff format`
only reports one of these, corresponding to a shell snippet, but `ruff check`
emits some additional errors about JS code later in the file too:
```
databricks_mcp_cookbook.ipynb:cell 21:1:11: SyntaxError: Simple statements must be separated by newlines or semicolons
databricks_mcp_cookbook.ipynb:cell 21:1:19: SyntaxError: Simple statements must be separated by newlines or semicolons
databricks_mcp_cookbook.ipynb:cell 21:1:50: SyntaxError: Simple statements must be separated by newlines or semicolons
databricks_mcp_cookbook.ipynb:cell 30:4:7: SyntaxError: Simple statements must be separated by newlines or semicolons
databricks_mcp_cookbook.ipynb:cell 30:4:41: E703 Statement ends with an unnecessary semicolon
databricks_mcp_cookbook.ipynb:cell 30:5:14: SyntaxError: Expected ':', found '{'
databricks_mcp_cookbook.ipynb:cell 30:6:9: SyntaxError: Expected ',', found '{'
databricks_mcp_cookbook.ipynb:cell 30:6:25: SyntaxError: Expected ',', found '='
databricks_mcp_cookbook.ipynb:cell 30:6:46: SyntaxError: Expected ',', found ';'
databricks_mcp_cookbook.ipynb:cell 30:6:47: SyntaxError: Expected '}', found newline
databricks_mcp_cookbook.ipynb:cell 30:7:1: SyntaxError: Unexpected indentation
databricks_mcp_cookbook.ipynb:cell 30:7:13: SyntaxError: Expected ':', found 'break'
databricks_mcp_cookbook.ipynb:cell 30:7:18: E703 Statement ends with an unnecessary semicolon
databricks_mcp_cookbook.ipynb:cell 30:8:28: SyntaxError: Simple statements must be separated by newlines or semicolons
databricks_mcp_cookbook.ipynb:cell 30:8:55: E703 Statement ends with an unnecessary semicolon
databricks_mcp_cookbook.ipynb:cell 30:9:18: SyntaxError: Expected an expression
databricks_mcp_cookbook.ipynb:cell 30:10:11: SyntaxError: Expected ',', found name
databricks_mcp_cookbook.ipynb:cell 30:10:16: SyntaxError: Expected ',', found '='
databricks_mcp_cookbook.ipynb:cell 30:10:22: SyntaxError: Expected ',', found name
databricks_mcp_cookbook.ipynb:cell 30:10:24: SyntaxError: Expected ',', found ';'
databricks_mcp_cookbook.ipynb:cell 30:11:27: SyntaxError: Expected ',', found '='
databricks_mcp_cookbook.ipynb:cell 30:11:34: SyntaxError: Expected ',', found name
databricks_mcp_cookbook.ipynb:cell 30:11:48: SyntaxError: Expected ',', found ';'
databricks_mcp_cookbook.ipynb:cell 30:11:49: SyntaxError: Expected '}', found NonLogicalNewline
databricks_mcp_cookbook.ipynb:cell 30:12:1: SyntaxError: Unexpected indentation
databricks_mcp_cookbook.ipynb:cell 30:12:16: E703 Statement ends with an unnecessary semicolon
databricks_mcp_cookbook.ipynb:cell 30:13:3: SyntaxError: Expected a statement
databricks_mcp_cookbook.ipynb:cell 30:13:4: SyntaxError: Expected a statement
databricks_mcp_cookbook.ipynb:cell 30:13:5: SyntaxError: Expected a statement
databricks_mcp_cookbook.ipynb:cell 30:13:5: E703 Statement ends with an unnecessary semicolon
databricks_mcp_cookbook.ipynb:cell 30:13:6: SyntaxError: Expected a statement
databricks_mcp_cookbook.ipynb:cell 30:14:1: SyntaxError: Expected a statement
databricks_mcp_cookbook.ipynb:cell 30:14:2: SyntaxError: Expected a statement
```
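
For anyone wanting to reproduce this outside of Ruff, here is a rough sketch of how such cells could be located. It only uses the standard library; `load_notebook` and `find_unparsable_cells` are hypothetical helper names. `ast.parse` is only an approximation of Ruff's parser: it will also flag IPython magics such as `!pip install`, and the returned indices are 0-based positions in the notebook's `cells` list, which may not match the `cell N` numbers in the output above.

```python
import ast
import json


def load_notebook(path: str) -> dict:
    """Read an .ipynb file, which is plain JSON on disk."""
    with open(path, encoding="utf-8") as file:
        return json.load(file)


def find_unparsable_cells(notebook: dict) -> list[int]:
    """Return 0-based indices of code cells whose source fails `ast.parse`."""
    bad = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        try:
            ast.parse(source)
        except SyntaxError:
            bad.append(index)
    return bad
```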
Test Plan
--
This PR
Summary
--
This should resolve the formatter ecosystem errors we've been seeing
lately. https://github.com/mesonbuild/meson-python/pull/728 added the
links, which I think are intentionally broken for testing purposes.
Test Plan
--
Ecosystem check on this PR
## Summary
Follow-up to https://github.com/astral-sh/ruff/pull/12129 to remove
`demisto/content` from ecosystem checks. The previous PR only removed it from
the deprecated script, which I didn't notice until recently.
## Test Plan
Ecosystem comment
## Summary
Something's up with this repo -- they added a post-checkout hook? So
let's just remove it for now. We should go through and add a new batch
of repositories some time.
It's a pretty big codebase using lots of different stuff, so a good
candidate for finding obscure problems.
I didn't look closely at which options are used. I have the feeling
`--select ALL` is not implied, since I see you adding it via
`check_options` for certain entries but not for others. The repo itself
has a pretty large ruff.toml, but assuming ecosystem just cares about
differences between base and head of a PR, `ALL` most likely makes
sense.
Adds the ability to override `ruff.toml` or `pyproject.toml` settings
per-project during ecosystem checks.
Exploring this as a fix for the `setuptools` project error.
Also useful for including Jupyter Notebooks in the ecosystem checks, see
#9293
Note the remaining `sphinx` project error is resolved in #9294
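
As an illustration of the idea only (not the code in this PR), a per-project override can be thought of as a recursive merge of a small override table into the project's parsed settings. `apply_overrides` and the example keys below are assumptions made for the sketch.

```python
def apply_overrides(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` merged in, table by table."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested tables instead of replacing them wholesale.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical settings, as they might look after parsing a project's config.
project_settings = {"line-length": 88, "lint": {"select": ["E"], "ignore": ["E501"]}}
overridden = apply_overrides(project_settings, {"lint": {"select": ["ALL"]}})
```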
Failing due to
> error: Failed to read tests/roots/test-pycode/cp_1251_coded.py: stream
> did not contain valid UTF-8
Unclear to me if ignoring is the correct response.
Closes #7239
- Refactors `scripts/check_ecosystem.py` into a new Python project at
`python/ruff-ecosystem`
- Includes
[documentation](https://github.com/astral-sh/ruff/blob/zanie/ecosystem-format/python/ruff-ecosystem/README.md)
now
- Provides a `ruff-ecosystem` CLI
- Fixes bug where `ruff check` report included "fixable" summary line
- Adds truncation to `ruff check` reports
  - Otherwise we often won't see the `ruff format` reports
  - The truncation uses some very simple heuristics and could be improved
    in the future
- Identifies diagnostic changes that occur just because the availability of a
violation's fix changes
  - We still show the diff for the line because it could matter _where_
    this changes, but we could improve this
  - Similarly, we could improve detection of diagnostic changes where just
    the message changes
- Adds support for JSON ecosystem check output
  - I added this primarily for development purposes
- If there are no changes, only errors while processing projects, we
display a different summary message
- When caching repositories, we now checkout the requested ref
- Adds `ruff format` reports, which format with the baseline and then
use `format --diff` to generate a report
- Runs all CI jobs when the CI workflow is changed
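
A minimal sketch of the fix-availability detection mentioned in the list above, assuming the text output marks fixable violations with ` [*]` after the rule code. The helper names are hypothetical and the implementation in this PR may differ.

```python
FIXABLE_MARKER = " [*]"  # assumed marker for "fix available" in the text output


def strip_fix_marker(line: str) -> str:
    """Drop the first fixable marker so lines compare equal regardless of it."""
    return line.replace(FIXABLE_MARKER, "", 1)


def only_fix_availability_changed(before: str, after: str) -> bool:
    """True when two diagnostic lines differ solely in the fixable marker."""
    return before != after and strip_fix_marker(before) == strip_fix_marker(after)
```

For example, `a.py:1:8: F401 ...` versus `a.py:1:8: F401 [*] ...` with the same message would count as a fix-availability change, while a changed message would not.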
## Known problems
- Since we must format the project to get a baseline, the permalink line
numbers do not exactly correspond to the correct range
  - This looks... hard. I tried using `git diff` and some wonky hunk
    matching to recover the original line numbers but it doesn't seem worth
    it. I think we should probably commit the formatted changes to a fork or
    something if we want great results here. Consequently, I've just used
    the start line instead of a range for now.
- I don't love the comment structure — it'd be nice, perhaps, to have
separate headings for the linter and formatter.
  - However, the `pr-comment` workflow is an absolute pain to change
    because it runs _separately_ from this pull request, so if I want to
    make edits to it I can only test it via manual workflow dispatch.
- Lines are not printed "as we go", which means they're all held in
memory. Presumably this would be a problem for large-scale ecosystem
checks.
- We are encountering a hard limit with the maximum comment length
supported by GitHub. We will need to move the bulk of the report
elsewhere.
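
Until the report lives elsewhere, one possible stopgap for the comment length limit is to cut the report at a character budget. This is only a sketch: `truncate_to_budget` is a hypothetical helper, the exact limit is left to the caller as `max_chars`, and the trailing notice is not counted against the budget, so the caller should leave some headroom.

```python
def truncate_to_budget(lines: list[str], max_chars: int) -> str:
    """Join lines until the budget is reached, then note how many were dropped."""
    kept = []
    used = 0
    for index, line in enumerate(lines):
        cost = len(line) + 1  # +1 for the newline that joins the lines
        if used + cost > max_chars:
            omitted = len(lines) - index
            kept.append(f"... {omitted} more lines omitted")
            break
        kept.append(line)
        used += cost
    return "\n".join(kept)
```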
## Future work
- Update `ruff-ecosystem` to support non-default projects and
`check_ecosystem_all.py` behavior
- Remove existing ecosystem check scripts
- Add preview mode toggle (#8076)
- Add a toggle for truncation
- Add hints for quick reproduction of runs locally
- Consider parsing JSON output of Ruff instead of using regex to parse
the text output
- Links to project repositories should use the commit hash we checked
against
- When caching repositories, we should pull the latest changes for the
ref
- Sort check diffs by path and rule code only (changes in messages
should not change order)
- Update check diffs to distinguish between new violations and changes
in messages
- Add "fix" diffs
- Remove existing formatter similarity reports
- On release pull request, compare to the previous tag instead
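
To make the JSON and sorting items above a bit more concrete, here is a small sketch. It assumes the diagnostics come from `ruff check --output-format json` as a JSON array of objects with `filename` and `code` fields; those field names and the `sort_diagnostics` helper are assumptions for illustration, not part of this PR.

```python
import json


def sort_diagnostics(raw_json: str) -> list[dict]:
    """Parse JSON diagnostics and order them by path, then rule code."""
    diagnostics = json.loads(raw_json)
    # The message is deliberately not part of the key, so wording changes
    # do not reorder the report. A missing or null code sorts first.
    return sorted(
        diagnostics,
        key=lambda item: (item.get("filename", ""), item.get("code") or ""),
    )
```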
---------
Co-authored-by: konsti <konstin@mailbox.org>