Avoid indexing the workspace for single-file mode (#13770)

## Summary

This PR updates the language server to avoid indexing the workspace for
single-file mode.

**What is single-file mode?**

When a user opens a file directly in an editor, rather than the folder
that represents the workspace, the editor usually can't determine the
workspace root. This means that when the server is initialized, the
`workspaceFolders` field will be empty / nil.

In this case, the server defaults to using the current working
directory, which is a reasonable default assuming that it points to the
directory containing the open file. This allows the server to index
that directory for any config file, if present.
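
The fallback described above could be sketched roughly as follows. This is a minimal illustration, not the server's actual code: the function name `workspace_roots` and the use of plain `PathBuf` values in place of the `workspaceFolders` entries are assumptions.

```rust
use std::env;
use std::io;
use std::path::PathBuf;

/// Returns the directories the server should treat as workspace roots.
///
/// `folders` stands in for the `workspaceFolders` field of the LSP
/// `initialize` request (hypothetical simplification). A missing or empty
/// list means the editor is in single-file mode.
fn workspace_roots(folders: Option<Vec<PathBuf>>) -> io::Result<Vec<PathBuf>> {
    match folders {
        // The client told us which folders are open: use them as-is.
        Some(folders) if !folders.is_empty() => Ok(folders),
        // Single-file mode: fall back to the current working directory.
        _ => Ok(vec![env::current_dir()?]),
    }
}
```

If the editor's working directory happens to be `/`, this fallback returns the filesystem root, which is what makes the subsequent indexing expensive.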

It turns out that in VS Code the current working directory in the above
scenario is the system root directory `/`, so the server tries to index
the entire filesystem, which takes a long time. This is the issue
described in https://github.com/astral-sh/ruff-vscode/issues/627. To
reproduce, refer to
https://github.com/astral-sh/ruff-vscode/issues/627#issuecomment-2401440767.

This PR updates the indexer so that, in single-file mode, it no longer
traverses the workspace to read any config file that might be present.
The first commit (8dd2a31eef) refactors the initialization and
introduces two structs, `Workspaces` and `Workspace`. The latter
includes a field that indicates whether it's the default workspace. The
second commit (61fc39bdb6) uses this field to skip the traversal.
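
A rough sketch of what the two structs might look like is shown below. The method names `new`, `url`, `settings`, and `is_default` appear in the diff; everything else is an assumption made to keep the example dependency-free: the `String` URL, the placeholder `ClientSettings`, the `default_at` constructor name, and the `Deref` target.

```rust
use std::ops::Deref;

/// Placeholder for the server's per-workspace client settings.
#[derive(Debug, Default, Clone, PartialEq)]
struct ClientSettings;

/// A single workspace root. The URL is a plain `String` in this sketch.
#[derive(Debug)]
struct Workspace {
    url: String,
    settings: Option<ClientSettings>,
    /// `true` when the client sent no workspace folder (single-file mode).
    is_default: bool,
}

impl Workspace {
    /// A workspace explicitly provided by the client.
    fn new(url: String) -> Self {
        Self { url, settings: None, is_default: false }
    }

    /// The fallback workspace (hypothetical constructor name).
    fn default_at(url: String) -> Self {
        Self { url, settings: None, is_default: true }
    }

    fn url(&self) -> &str {
        &self.url
    }

    fn settings(&self) -> Option<&ClientSettings> {
        self.settings.as_ref()
    }

    fn is_default(&self) -> bool {
        self.is_default
    }
}

/// Newtype over the list of workspaces. Implementing `Deref` lets callers
/// iterate it like a slice, e.g. `for workspace in &**workspaces`.
struct Workspaces(Vec<Workspace>);

impl Deref for Workspaces {
    type Target = [Workspace];

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
```

Carrying the flag on the workspace itself means the settings index can decide whether to walk the directory tree without needing to know how the workspace was discovered.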

Closes: #11366

## Editor behavior

This section documents the behavior observed in different editors. The
test scenario uses the following directory tree:
```
.
├── nested
│   ├── nested.py
│   └── pyproject.toml
└── test.py
```

where the contents of the files are:

**test.py**
```py
import os
```

**nested/nested.py**
```py
import os
import math
```

**nested/pyproject.toml**
```toml
[tool.ruff.lint]
select = ["I"]
```

Steps:
1. Open `test.py` directly in the editor
2. Validate that it raises the `F401` violation
3. Open `nested/nested.py` in the same editor instance
4. This file would raise only `I001` if the `nested/pyproject.toml` was
indexed

### VS Code

When step (1) above is performed, the current working directory is `/`,
which means the server will try to index the entire system to build up
the settings index. This will include the `nested/pyproject.toml` file
as well. This leads to a bad user experience because the user has to
wait for minutes for the server to finish indexing.

This PR avoids that by not traversing the workspace directory in
single-file mode. In VS Code, however, this means that in step (4) the
file wouldn't raise `I001`; it would raise only two `F401` violations
because `nested/pyproject.toml` was never resolved.
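
To make the consequence concrete, here is a minimal sketch of which directories are checked for a config file in each mode. The function name `directories_to_resolve` is hypothetical; only the `ancestors().skip(usize::from(!is_default_workspace))` pattern is taken from this PR's diff.

```rust
use std::path::{Path, PathBuf};

/// Directories whose config files are resolved *before* any tree walk.
///
/// For a regular workspace the root itself is skipped here, because the
/// directory walk (not shown) visits it. For the default workspace there
/// is no walk, so the root is included in this list instead.
fn directories_to_resolve(root: &Path, is_default_workspace: bool) -> Vec<PathBuf> {
    // `usize::from(bool)` is 1 for `true` and 0 for `false`.
    let skip = usize::from(!is_default_workspace);
    root.ancestors().skip(skip).map(Path::to_path_buf).collect()
}
```

With a default workspace rooted at `/`, the only directory in this list is `/` itself, so a config file under `nested/` is never considered.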

One solution would be to fix this in the extension itself: it could
detect this scenario and pass in, as the workspace directory, the
directory containing the file opened in step (1) above.

### Neovim

**tl;dr** it works as expected because the client considers the presence
of certain files (depending on the server) as the root of the workspace.
For Ruff, they are `pyproject.toml`, `ruff.toml`, and `.ruff.toml`. This
means that the client notifies us as the user moves between single-file
mode and workspace mode.

https://github.com/astral-sh/ruff/pull/13770#issuecomment-2416608055
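
One way such marker-based root detection could work is sketched below. This is an illustration of the general idea only, not Neovim's actual implementation; the function name `find_workspace_root` is hypothetical, and the marker list is the one given above.

```rust
use std::path::{Path, PathBuf};

/// Marker files that identify a workspace root for Ruff.
const ROOT_MARKERS: [&str; 3] = ["pyproject.toml", "ruff.toml", ".ruff.toml"];

/// Returns the closest ancestor directory of `file` that contains a marker
/// file, or `None` if there is none, in which case the client would stay
/// in single-file mode.
fn find_workspace_root(file: &Path) -> Option<PathBuf> {
    file.ancestors()
        .skip(1) // skip the file path itself, start at its parent directory
        .find(|dir| ROOT_MARKERS.iter().any(|marker| dir.join(marker).is_file()))
        .map(Path::to_path_buf)
}
```

In the test scenario, `nested/nested.py` would resolve to `nested/` because of `nested/pyproject.toml`, while `test.py` has no marker next to it.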

### Helix

Same as Neovim; additional context in
https://github.com/astral-sh/ruff/pull/13770#issuecomment-2417362097

### Sublime Text

**tl;dr** It works similarly to VS Code, except that the current working
directory of the server process is different and thus the config file
is never read. So, the behavior remains unchanged with this PR.

https://github.com/astral-sh/ruff/pull/13770#issuecomment-2417362097

### Zed

Zed seems to start a separate language server instance for each file
when the editor is running in single-file mode, even though all files
have been opened in a single editor instance.

(The logs below are split into sections by a single blank line; each
section corresponds to one of the 3 server instances that the editor
started for 3 files.)

```
   0.000053375s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp, using default settings
   0.009448792s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp
   0.009906334s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/test.py
   0.011775917s  INFO ruff:main ruff_server::server: Configuration file watcher successfully registered

   0.000060583s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp/nested, using default settings
   0.010387125s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp/nested
   0.011061875s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/nested.py
   0.011545208s  INFO ruff:main ruff_server::server: Configuration file watcher successfully registered

   0.000059125s  INFO main ruff_server::server: No workspace settings found for file:///Users/dhruv/projects/ruff-temp/nested, using default settings
   0.010857583s  INFO main ruff_server::session::index: Registering workspace: /Users/dhruv/projects/ruff-temp/nested
   0.011428958s DEBUG ruff:main ruff_server::resolve: Included path via `include`: /Users/dhruv/projects/ruff-temp/nested/other.py
   0.011893792s  INFO ruff:main ruff_server::server: Configuration file watcher successfully registered
```

## Test Plan

When using the `ruff` server built from this PR, the server starts
quickly, as seen in the logs. When I switch to the release binary, it
starts indexing the root directory.

For more details, refer to the "Editor behavior" section above.
Dhruv Manilawala 2024-10-18 10:51:43 +05:30 committed by GitHub
parent 3d0bdb426a
commit 040a591cad
9 changed files with 211 additions and 53 deletions


@@ -10,6 +10,7 @@ use rustc_hash::FxHashMap;
 pub(crate) use ruff_settings::RuffSettings;
 use crate::edit::LanguageId;
+use crate::server::{Workspace, Workspaces};
 use crate::{
     edit::{DocumentKey, DocumentVersion, NotebookDocument},
     PositionEncoding, TextDocument,
@@ -67,12 +68,12 @@ pub enum DocumentQuery {
 impl Index {
     pub(super) fn new(
-        workspace_folders: Vec<(Url, ClientSettings)>,
+        workspaces: &Workspaces,
         global_settings: &ClientSettings,
     ) -> crate::Result<Self> {
         let mut settings = WorkspaceSettingsIndex::default();
-        for (url, workspace_settings) in workspace_folders {
-            settings.register_workspace(&url, Some(workspace_settings), global_settings)?;
+        for workspace in &**workspaces {
+            settings.register_workspace(workspace, global_settings)?;
         }
         Ok(Self {
@@ -167,11 +168,12 @@ impl Index {
     pub(super) fn open_workspace_folder(
         &mut self,
-        url: &Url,
+        url: Url,
         global_settings: &ClientSettings,
     ) -> crate::Result<()> {
         // TODO(jane): Find a way for workspace client settings to be added or changed dynamically.
-        self.settings.register_workspace(url, None, global_settings)
+        self.settings
+            .register_workspace(&Workspace::new(url), global_settings)
     }
     pub(super) fn num_documents(&self) -> usize {
@@ -284,6 +286,7 @@ impl Index {
             settings.ruff_settings = ruff_settings::RuffSettingsIndex::new(
                 root,
                 settings.client_settings.editor_settings(),
+                false,
             );
         }
     }
@@ -398,10 +401,10 @@ impl WorkspaceSettingsIndex {
     /// workspace. Otherwise, the global settings are used exclusively.
     fn register_workspace(
         &mut self,
-        workspace_url: &Url,
-        workspace_settings: Option<ClientSettings>,
+        workspace: &Workspace,
         global_settings: &ClientSettings,
     ) -> crate::Result<()> {
+        let workspace_url = workspace.url();
         if workspace_url.scheme() != "file" {
             tracing::info!("Ignoring non-file workspace URL: {workspace_url}");
             show_warn_msg!("Ruff does not support non-file workspaces; Ignoring {workspace_url}");
@@ -411,8 +414,8 @@ impl WorkspaceSettingsIndex {
             anyhow!("Failed to convert workspace URL to file path: {workspace_url}")
         })?;
-        let client_settings = if let Some(workspace_settings) = workspace_settings {
-            ResolvedClientSettings::with_workspace(&workspace_settings, global_settings)
+        let client_settings = if let Some(workspace_settings) = workspace.settings() {
+            ResolvedClientSettings::with_workspace(workspace_settings, global_settings)
         } else {
             ResolvedClientSettings::global(global_settings)
         };
@@ -420,8 +423,10 @@ impl WorkspaceSettingsIndex {
         let workspace_settings_index = ruff_settings::RuffSettingsIndex::new(
             &workspace_path,
             client_settings.editor_settings(),
+            workspace.is_default(),
         );
+        tracing::info!("Registering workspace: {}", workspace_path.display());
         self.insert(
             workspace_path,
             WorkspaceSettings {


@@ -100,13 +100,33 @@ impl RuffSettings {
 }
 impl RuffSettingsIndex {
-    pub(super) fn new(root: &Path, editor_settings: &ResolvedEditorSettings) -> Self {
+    /// Create the settings index for the given workspace root.
+    ///
+    /// This will create the index in the following order:
+    /// 1. Resolve any settings from above the workspace root
+    /// 2. Resolve any settings from the workspace root itself
+    /// 3. Resolve any settings from within the workspace directory tree
+    ///
+    /// If this is the default workspace i.e., the client did not specify any workspace and so the
+    /// server will be running in a single file mode, then only (1) and (2) will be resolved,
+    /// skipping (3).
+    pub(super) fn new(
+        root: &Path,
+        editor_settings: &ResolvedEditorSettings,
+        is_default_workspace: bool,
+    ) -> Self {
         let mut has_error = false;
         let mut index = BTreeMap::default();
         let mut respect_gitignore = None;
-        // Add any settings from above the workspace root, excluding the workspace root itself.
-        for directory in root.ancestors().skip(1) {
+        // If this is *not* the default workspace, then we should skip the workspace root itself
+        // because it will be resolved when walking the workspace directory tree. This is done by
+        // the `WalkBuilder` below.
+        let should_skip_workspace = usize::from(!is_default_workspace);
+        // Add any settings from above the workspace root, skipping the workspace root itself if
+        // this is *not* the default workspace.
+        for directory in root.ancestors().skip(should_skip_workspace) {
             match settings_toml(directory) {
                 Ok(Some(pyproject)) => {
                     match ruff_workspace::resolver::resolve_root_settings(
@@ -156,6 +176,26 @@ impl RuffSettingsIndex {
         let fallback = Arc::new(RuffSettings::fallback(editor_settings, root));
+        // If this is the default workspace, the server is running in single-file mode. What this
+        // means is that the user opened a file directly (not the folder) in the editor and the
+        // server didn't receive a workspace folder during initialization. In this case, we default
+        // to the current working directory and skip walking the workspace directory tree for any
+        // settings.
+        //
+        // Refer to https://github.com/astral-sh/ruff/pull/13770 to understand what this behavior
+        // means for different editors.
+        if is_default_workspace {
+            if has_error {
+                let root = root.display();
+                show_err_msg!(
+                    "Error while resolving settings from workspace {root}. \
+                    Please refer to the logs for more details.",
+                );
+            }
+            return RuffSettingsIndex { index, fallback };
+        }
         // Add any settings within the workspace itself
         let mut builder = WalkBuilder::new(root);
         builder.standard_filters(
@@ -266,7 +306,7 @@ impl RuffSettingsIndex {
             );
         }
-        Self {
+        RuffSettingsIndex {
             index: index.into_inner().unwrap(),
             fallback,
         }