diff --git a/ARCHITECTURE_INSIGHTS.md b/ARCHITECTURE_INSIGHTS.md
new file mode 100644
index 0000000..b3dcdf0
--- /dev/null
+++ b/ARCHITECTURE_INSIGHTS.md
@@ -0,0 +1,96 @@
+# Architecture Insights from Ruff Investigation

+## Key Discovery: Two-Layer Architecture

+### The Problem

- LSP documents change frequently (every keystroke)
- Salsa invalidation is expensive
- Tower-lsp requires Send+Sync, but Salsa's Database contains RefCell/UnsafeCell

+### The Solution (Ruff Pattern)

+#### Layer 1: LSP Document Management (Outside Salsa)

- Store overlays in `Session` using `Arc<DashMap<Url, TextDocument>>`
- TextDocument contains the actual content, version, and language_id
- Changes are immediate, with no Salsa invalidation

+#### Layer 2: Salsa Incremental Computation

- Database is pure Salsa, with no file storage
- Queries read through the FileSystem trait
- LspFileSystem intercepts reads and returns overlay or disk content

+### Critical Insights

1. **Overlays NEVER become Salsa inputs directly**
   - They're intercepted at FileSystem::read_to_string() time
   - Salsa only knows "something changed" and reads content lazily

2. **StorageHandle Pattern (for tower-lsp)**
   - Session stores a `StorageHandle<Database>`, not the Database directly
   - StorageHandle IS Send+Sync even though Database isn't
   - Create Database instances on demand: `session.db()`

3. **File Management Location**
   - WRONG: Store files in the Database (what we initially did)
   - RIGHT: Store overlays in the Session; the Database is pure Salsa

4. **The Bridge**
   - LspFileSystem holds an Arc to the same overlays as Session
   - When Salsa queries need content, they call the FileSystem
   - The FileSystem checks overlays first and falls back to disk

+### Implementation Flow

1. **did_open/did_change/did_close** → Update overlays in Session
2. **notify_file_changed()** → Tell Salsa something changed
3. **Salsa query executes** → Calls FileSystem::read_to_string()
4. **LspFileSystem intercepts** → Returns overlay if it exists, else disk
5. **Query gets content** → Without knowing about LSP/overlays

+### Why This Works

- Fast: Overlay updates don't trigger a Salsa invalidation cascade
- Thread-safe: DashMap for overlays, StorageHandle for the Database
- Clean separation: LSP concerns vs computation concerns
- Efficient: Salsa caching still works; it just reads through the FileSystem

+### Tower-lsp vs lsp-server

- **Ruff uses lsp-server**: No Send+Sync requirement, can store the Database directly
- **We use tower-lsp**: Requires Send+Sync, must use the StorageHandle pattern
- Both achieve the same result through different mechanisms

+## Critical Implementation Details (From Ruff Expert)

+### The Revision Dependency Trick

**THE MOST CRITICAL INSIGHT**: In the `source_text` tracked function, calling `file.revision(db)` is what creates the Salsa dependency chain:

```rust
#[salsa::tracked]
pub fn source_text(db: &dyn Db, file: SourceFile) -> Arc<str> {
    // THIS LINE IS CRITICAL - Creates a Salsa dependency on the revision!
    let _ = file.revision(db);

    // Now read from the FileSystem (checks overlays first)
    db.read_file_content(file.path(db))
}
```

Without that `file.revision(db)` call, revision changes won't trigger invalidation!

+### Key Implementation Points

1. **Files have no text**: SourceFile inputs only have `path` and `revision`, never `text`
2. **Revision bumping triggers invalidation**: Change revision → source_text invalidated → dependent queries invalidated
3. **Files created lazily**: Don't pre-create them; let them be created on first access
4. **Simple counters work**: The revision can be a simple u64 counter; it doesn't need timestamps
5. 
**StorageHandle update required**: After DB modifications in LSP handlers, must update the handle + +### Common Pitfalls + +- **Forgetting the revision dependency** - Without `file.revision(db)`, nothing invalidates +- **Storing text in Salsa inputs** - Breaks the entire pattern +- **Not bumping revision on overlay changes** - Queries won't see new content +- **Creating files eagerly** - Unnecessary and inefficient + diff --git a/REVISION_TRACKING_ARCHITECTURE.md b/REVISION_TRACKING_ARCHITECTURE.md new file mode 100644 index 0000000..202c118 --- /dev/null +++ b/REVISION_TRACKING_ARCHITECTURE.md @@ -0,0 +1,341 @@ +# Revision Tracking Architecture for Django Language Server + +## Overview + +This document captures the complete understanding of how to implement revision tracking for task-112, based on extensive discussions with a Ruff architecture expert. The goal is to connect the Session's overlay system with Salsa's query invalidation mechanism through per-file revision tracking. + +## The Critical Breakthrough: Dual-Layer Architecture + +### The Confusion We Had + +We conflated two different concepts: +1. **Database struct** - The Rust struct that implements the Salsa database trait +2. **Salsa database** - The actual Salsa storage system with inputs/queries + +### The Key Insight + +**Database struct ≠ Salsa database** + +The Database struct can contain: +- Salsa storage (the actual Salsa database) +- Additional non-Salsa data structures (like file tracking) + +## The Architecture Pattern (From Ruff) + +### Ruff's Implementation + +```rust +// Ruff's Database contains BOTH Salsa and non-Salsa data +pub struct ProjectDatabase { + storage: salsa::Storage, // Salsa's data + files: Files, // NOT Salsa data, but in Database struct! 
+}

// Files is Arc-wrapped for cheap cloning
#[derive(Clone)]
pub struct Files {
    inner: Arc<FilesInner>, // Shared across clones
}

struct FilesInner {
    system_by_path: FxDashMap<SystemPathBuf, File>, // Thread-safe
}
```

### Our Implementation

```rust
// Django LS Database structure
#[derive(Clone)]
pub struct Database {
    storage: salsa::Storage<Self>,
    files: Arc<DashMap<PathBuf, SourceFile>>, // Arc makes cloning cheap!
}

// Session still uses StorageHandle for tower-lsp
pub struct Session {
    db_handle: StorageHandle<Database>, // Still needed!
    overlays: Arc<DashMap<Url, TextDocument>>, // LSP document state
}
```

## Why This Works with Send+Sync Requirements

1. **Arc is Send+Sync** - Thread-safe by design
2. **Cloning is cheap** - Only clones the Arc pointer (8 bytes)
3. **Persistence across clones** - All clones share the same DashMap
4. **StorageHandle compatible** - Database remains clonable and Send+Sync

## Implementation Details

### 1. Database Implementation

```rust
impl Database {
    pub fn get_or_create_file(&mut self, path: PathBuf) -> SourceFile {
        self.files
            .entry(path.clone())
            .or_insert_with(|| {
                // Create a Salsa input with initial revision 0
                SourceFile::new(self, path, 0)
            })
            .clone()
    }
}

impl Clone for Database {
    fn clone(&self) -> Self {
        Self {
            storage: self.storage.clone(), // Salsa handles this
            files: self.files.clone(),     // Just clones the Arc!
        }
    }
}
```

### 2. The Critical Pattern for Every Overlay Change

```rust
pub fn handle_overlay_change(session: &mut Session, url: Url, content: String) {
    // 1. Extract the database from the StorageHandle
    let mut db = session.db_handle.get();

    // 2. Update the overlay in the Session
    session.overlays.insert(url.clone(), TextDocument::new(content));

    // 3. Get or create the file in the Database
    let path = path_from_url(&url);
    let file = db.get_or_create_file(path);

    // 4. Bump the revision (simple incrementing counter)
    let current_rev = file.revision(&db);
    file.set_revision(&mut db).to(current_rev + 1);

    // 5. Update the StorageHandle with the modified database
    session.db_handle.update(db); // CRITICAL!
}
```

### 3. LSP Handler Updates

#### did_open

```rust
pub fn did_open(&mut self, params: DidOpenTextDocumentParams) {
    let mut db = self.session.db_handle.get();

    // Set the overlay
    self.session.overlays.insert(
        params.text_document.uri.clone(),
        TextDocument::new(params.text_document.text),
    );

    // Create the file with initial revision 0
    let path = path_from_url(&params.text_document.uri);
    db.get_or_create_file(path); // Creates with revision 0

    self.session.db_handle.update(db);
}
```

#### did_change

```rust
pub fn did_change(&mut self, params: DidChangeTextDocumentParams) {
    let mut db = self.session.db_handle.get();

    // Update the overlay
    let new_content = params.content_changes[0].text.clone();
    self.session.overlays.insert(
        params.text_document.uri.clone(),
        TextDocument::new(new_content),
    );

    // Bump the revision
    let path = path_from_url(&params.text_document.uri);
    let file = db.get_or_create_file(path);
    let new_rev = file.revision(&db) + 1;
    file.set_revision(&mut db).to(new_rev);

    self.session.db_handle.update(db);
}
```

#### did_close

```rust
pub fn did_close(&mut self, params: DidCloseTextDocumentParams) {
    let mut db = self.session.db_handle.get();

    // Remove the overlay
    self.session.overlays.remove(&params.text_document.uri);

    // Bump the revision to trigger a re-read from disk
    // (Salsa input handles are Copy, so copy it out before mutating `db`)
    let path = path_from_url(&params.text_document.uri);
    if let Some(file) = db.files.get(&path).map(|entry| *entry) {
        let new_rev = file.revision(&db) + 1;
        file.set_revision(&mut db).to(new_rev);
    }

    self.session.db_handle.update(db);
}
```

## Key Implementation Guidelines from Ruff Expert

### 1. File Tracking Location

- Store it in the Database struct (not the Session)
- Use Arc for thread-safety and cheap cloning
- This keeps file tracking close to where it's used

### 2. 
Revision Management + +- Use simple incrementing counter per file (not timestamps) +- Each file has independent revision tracking +- Revision just needs to change, doesn't need to be monotonic +- Example: `file.set_revision(&mut db).to(current + 1)` + +### 3. Lazy File Creation + +Files should be created: +- On did_open (via get_or_create_file) +- On first query access if needed +- NOT eagerly for all possible files + +### 4. File Lifecycle + +- **On open**: Create file with revision 0 +- **On change**: Bump revision to trigger invalidation +- **On close**: Keep file alive, bump revision for re-read from disk +- **Never remove**: Files stay in tracking even after close + +### 5. Batch Changes for Performance + +When possible, batch multiple changes: + +```rust +pub fn apply_batch_changes(&mut self, changes: Vec) { + let mut db = self.session.db_handle.get(); + + for change in changes { + // Process each change + let file = db.get_or_create_file(change.path); + file.set_revision(&mut db).to(file.revision(&db) + 1); + } + + // Single StorageHandle update at the end + self.session.db_handle.update(db); +} +``` + +### 6. Thread Safety with DashMap + +Use DashMap's atomic entry API: + +```rust +self.files.entry(path.clone()) + .and_modify(|file| { + // Modify existing + file.set_revision(db).to(new_rev); + }) + .or_insert_with(|| { + // Create new + SourceFile::builder(path) + .revision(0) + .new(db) + }); +``` + +## Critical Pitfalls to Avoid + +1. **NOT BUMPING REVISION** - Every overlay change MUST bump revision or Salsa won't invalidate +2. **FORGETTING STORAGEHANDLE UPDATE** - Must call `session.db_handle.update(db)` after changes +3. **CREATING FILES EAGERLY** - Let files be created lazily on first access +4. **USING TIMESTAMPS** - Simple incrementing counter is sufficient +5. 
**REMOVING FILES** - Keep files alive even after close; just bump the revision

## The Two-Layer Model

### Layer 1: Non-Salsa (but in the Database struct)

- `Arc<DashMap<PathBuf, SourceFile>>` - File tracking
- Thread-safe via Arc+DashMap
- Cheap to clone via Arc
- Acts as a lookup table

### Layer 2: Salsa Inputs

- `SourceFile` entities created via `SourceFile::new(db)`
- Have revision fields for invalidation
- Tracked by Salsa's dependency system
- Invalidation cascades through dependent queries

## Complete Architecture Summary

| Component | Contains | Purpose |
|-----------|----------|---------|
| **Database** | `storage` + `Arc<DashMap<PathBuf, SourceFile>>` | Salsa queries + file tracking |
| **Session** | `StorageHandle<Database>` + `Arc<DashMap<Url, TextDocument>>` | LSP state + overlays |
| **StorageHandle** | Thread-safe, `Arc`-based handle to the Salsa storage | Bridge for tower-lsp lifetime requirements |
| **SourceFile** | Salsa input with path + revision | Triggers query invalidation |

## The Flow

1. **LSP request arrives** → tower-lsp handler
2. **Extract database** → `db = session.db_handle.get()`
3. **Update overlay** → `session.overlays.insert(url, content)`
4. **Get/create file** → `db.get_or_create_file(path)`
5. **Bump revision** → `file.set_revision(&mut db).to(current + 1)`
6. **Update handle** → `session.db_handle.update(db)`
7. **Salsa invalidates** → the `source_text` query re-executes
8. **Queries see new content** → through the overlay-aware FileSystem

## Why StorageHandle is Still Essential

1. **tower-lsp requirement**: Needs a 'static lifetime for async handlers
2. **Snapshot management**: Safe extraction and update of the database
3. **Thread safety**: Bridges async boundaries safely
4. **Atomic updates**: Ensures consistent state transitions

## Testing Strategy

1. **Revision bumping**: Verify each overlay operation bumps the revision
2. **Invalidation cascade**: Ensure source_text re-executes after a revision bump
3. **Thread safety**: Concurrent overlay updates work correctly
4. **Clone behavior**: Database clones share the same file tracking
5. **Lazy creation**: Files are only created when accessed

## Implementation Checklist

- [ ] Add `Arc<DashMap<PathBuf, SourceFile>>` to the Database struct
- [ ] Implement Clone for Database (clone both storage and the Arc)
- [ ] Create a `get_or_create_file` method using the atomic entry API
- [ ] Update did_open to create files with revision 0
- [ ] Update did_change to bump the revision after the overlay update
- [ ] Update did_close to bump the revision (keeping the file alive)
- [ ] Ensure StorageHandle updates after all database modifications
- [ ] Add tests for revision tracking and invalidation

## Questions That Were Answered

1. **Q: Files in Database or Session?**
   A: In the Database, but Arc-wrapped for cheap cloning

2. **Q: How does this work with Send+Sync?**
   A: Arc is Send+Sync, making the Database clonable and thread-safe

3. **Q: Do we still need StorageHandle?**
   A: YES! It bridges tower-lsp's lifetime requirements

4. **Q: Timestamp or counter for revisions?**
   A: A simple incrementing counter per file

5. **Q: Remove files on close?**
   A: No, keep them alive and bump the revision for re-read

## The Key Insight

The Database struct is a container that holds BOTH:
- Salsa storage (for queries and inputs)
- Non-Salsa data (file tracking via Arc)

Arc makes the non-Salsa data cheap to clone while maintaining Send+Sync compatibility. This is the pattern Ruff uses and what we should implement. 
\ No newline at end of file diff --git a/RUFF_ARCHITECTURE_INSIGHTS.md b/RUFF_ARCHITECTURE_INSIGHTS.md new file mode 100644 index 0000000..f4a0bf1 --- /dev/null +++ b/RUFF_ARCHITECTURE_INSIGHTS.md @@ -0,0 +1,77 @@ +# OUTDATED - See ARCHITECTURE_INSIGHTS.md for current solution + +## This document is preserved for historical context but is OUTDATED +## We found the StorageHandle solution that solves the Send+Sync issue + +# Critical Discovery: The Tower-LSP vs lsp-server Architectural Mismatch + +## The Real Problem + +Your Ruff expert friend is correct. The fundamental issue is: + +### What We Found: + +1. **Salsa commit a3ffa22 uses `RefCell` and `UnsafeCell`** - These are inherently not `Sync` +2. **Tower-LSP requires `Sync`** - Because handlers take `&self` in async contexts +3. **Ruff uses `lsp-server`** - Which doesn't require `Sync` on the server struct + +### The Mystery: + +Your expert suggests Ruff's database IS `Send + Sync`, but our testing shows that with the same Salsa commit, the database contains: +- `RefCell` (not Sync) +- `UnsafeCell>` (not Sync) + +## Possible Explanations: + +### Theory 1: Ruff Has Custom Patches +Ruff might have additional patches or workarounds not visible in the commit hash. + +### Theory 2: Different Usage Pattern +Ruff might structure their database differently to avoid the Sync requirement entirely. + +### Theory 3: lsp-server Architecture +Since Ruff uses `lsp-server` (not `tower-lsp`), they might never need the database to be Sync: +- They clone the database for background work (requires Send only) +- The main thread owns the database, background threads get clones +- No shared references across threads + +## Verification Needed: + +1. **Check if Ruff's database is actually Sync**: + - Look for unsafe impl Sync in their codebase + - Check if they wrap the database differently + +2. **Understand lsp-server's threading model**: + - How does it handle async without requiring Sync? + - What's the message passing pattern? 
## Solution Decision Matrix (Updated):

| Solution | Effort | Performance | Risk | Compatibility |
|----------|--------|-------------|------|---------------|
| **Switch to lsp-server** | High | High | Medium | Perfect Ruff parity |
| **Actor Pattern** | Medium | Medium | Low | Works with tower-lsp |
| **`Arc<Mutex<Database>>`** | Low | Poor | Low | Works but slow |
| **Unsafe Sync wrapper** | Low | High | Very High | Dangerous |
| **Database per request** | Medium | Poor | Low | Loses memoization |

## Recommended Action Plan:

### Immediate (Today):
1. Verify that Salsa a3ffa22 truly has RefCell/UnsafeCell
2. Check whether there are any Ruff-specific patches to Salsa
3. Test the actor pattern as a better alternative to `Arc<Mutex<Database>>`

### Short-term (This Week):
1. Implement the actor pattern if Salsa can't be made Sync
2. OR investigate an unsafe Sync wrapper with careful single-threaded access guarantees

### Long-term (This Month):
1. Consider migrating to lsp-server for full Ruff compatibility
2. OR contribute Sync support to Salsa upstream

## The Key Insight:

**Tower-LSP's architecture is fundamentally incompatible with Salsa's current design.**

Ruff avoided this by using `lsp-server`, which has a different threading model that doesn't require Sync on the database.

diff --git a/check_ruff_pattern.md b/check_ruff_pattern.md
new file mode 100644
index 0000000..5253b69
--- /dev/null
+++ b/check_ruff_pattern.md
@@ -0,0 +1,94 @@
+# OUTDATED - See ARCHITECTURE_INSIGHTS.md for current solution

## This document is preserved for historical context but is OUTDATED
## We found the StorageHandle solution that solves the Send+Sync issue

# Key Findings from Ruff's Architecture

Based on the exploration, here's what we discovered:

## Current Django LS Architecture

### What We Have:
1. `Database` struct with `#[derive(Clone)]` and Salsa storage
2. `WorkspaceDatabase` that wraps `Database` and uses `DashMap` for thread-safe file storage
3. 
`Session` that owns `WorkspaceDatabase` directly (not wrapped in Arc) +4. Tower-LSP server that requires `Send + Sync` for async handlers + +### The Problem: +- `Database` is not `Sync` due to `RefCell` and `UnsafeCell` in Salsa's `ZalsaLocal` +- This prevents `Session` from being `Sync`, which breaks tower-lsp async handlers + +## Ruff's Solution (From Analysis) + +### They Don't Make Database Sync! +The key insight is that Ruff **doesn't actually make the database Send + Sync**. Instead: + +1. **Clone for Background Work**: They clone the database for each background task +2. **Move Not Share**: The cloned database is *moved* into the task (requires Send, not Sync) +3. **Message Passing**: Results are sent back via channels + +### Critical Difference: +- Ruff uses a custom LSP implementation that doesn't require `Sync` on the session +- Tower-LSP *does* require `Sync` because handlers take `&self` + +## The Real Problem + +Tower-LSP's `LanguageServer` trait requires: +```rust +async fn initialize(&self, ...) -> ... +// ^^^^^ This requires self to be Sync! +``` + +But with Salsa's current implementation, the Database can never be Sync. + +## Solution Options + +### Option 1: Wrap Database in Arc (Current Workaround) +```rust +pub struct Session { + database: Arc>, + // ... 
+} +``` +Downsides: Lock contention, defeats purpose of Salsa's internal optimization + +### Option 2: Move Database Out of Session +```rust +pub struct Session { + // Don't store database here + file_index: Arc>, + settings: Settings, +} + +// Create database on demand for each request +impl LanguageServer for Server { + async fn some_handler(&self) { + let db = create_database_from_index(&self.session.file_index); + // Use db for this request + } +} +``` + +### Option 3: Use Actor Pattern +```rust +pub struct DatabaseActor { + database: WorkspaceDatabase, + rx: mpsc::Receiver, +} + +pub struct Session { + db_tx: mpsc::Sender, +} +``` + +### Option 4: Custom unsafe Send/Sync implementation +This is risky but possible if we ensure single-threaded access patterns. + +## The Salsa Version Mystery + +We're using the exact same Salsa commit as Ruff, with the same features. The issue is NOT the Salsa version, but how tower-lsp forces us to use it. + +Ruff likely either: +1. Doesn't use tower-lsp (has custom implementation) +2. 
Or structures their server differently to avoid needing Sync on the database

diff --git a/crates/djls-server/src/server.rs b/crates/djls-server/src/server.rs
index e002658..daa64ac 100644
--- a/crates/djls-server/src/server.rs
+++ b/crates/djls-server/src/server.rs
@@ -6,6 +6,7 @@ use tower_lsp_server::jsonrpc::Result as LspResult;
 use tower_lsp_server::lsp_types;
 use tower_lsp_server::LanguageServer;
 use tracing_appender::non_blocking::WorkerGuard;
+use url::Url;

 use crate::queue::Queue;
 use crate::session::Session;
@@ -202,13 +203,19 @@ impl LanguageServer for DjangoLanguageServer {
     async fn did_open(&self, params: lsp_types::DidOpenTextDocumentParams) {
         tracing::info!("Opened document: {:?}", params.text_document.uri);

-        self.with_session_mut(|_session| {
-            // TODO: Handle document open after refactoring
-            let _uri = params.text_document.uri.clone();
-            let _version = params.text_document.version;
-            let _language_id =
+        self.with_session_mut(|session| {
+            // Convert LSP types to our types
+            let url =
+                Url::parse(&params.text_document.uri.to_string()).expect("Valid URI from LSP");
+            let language_id =
                 djls_workspace::LanguageId::from(params.text_document.language_id.as_str());
-            let _text = params.text_document.text.clone();
+            let document = djls_workspace::TextDocument::new(
+                params.text_document.text,
+                params.text_document.version,
+                language_id,
+            );
+
+            session.open_document(url, document);
         })
         .await;
     }
@@ -216,11 +223,29 @@
     async fn did_change(&self, params: lsp_types::DidChangeTextDocumentParams) {
         tracing::info!("Changed document: {:?}", params.text_document.uri);

-        self.with_session_mut(|_session| {
-            // TODO: Handle document change after refactoring
-            let _uri = &params.text_document.uri;
-            let _version = params.text_document.version;
-            let _changes = params.content_changes.clone();
+        self.with_session_mut(|session| {
+            let url =
+                Url::parse(&params.text_document.uri.to_string()).expect("Valid URI from LSP");
+            let new_version = params.text_document.version;
+            let changes = params.content_changes;
+
+            if let Some(mut document) = session.get_overlay(&url) {
+                document.update(changes, new_version);
+                session.update_document(url, document);
+            } else {
+                // No existing overlay - shouldn't normally happen
+                tracing::warn!("Received change for document without overlay: {}", url);
+
+                // Handle full content changes only for recovery
+                if let Some(change) = changes.into_iter().next() {
+                    let document = djls_workspace::TextDocument::new(
+                        change.text,
+                        new_version,
+                        djls_workspace::LanguageId::Other,
+                    );
+                    session.update_document(url, document);
+                }
+            }
         })
         .await;
     }
@@ -228,19 +253,60 @@
     async fn did_close(&self, params: lsp_types::DidCloseTextDocumentParams) {
         tracing::info!("Closed document: {:?}", params.text_document.uri);

-        self.with_session_mut(|_session| {
-            // TODO: Handle document close after refactoring
-            let _uri = &params.text_document.uri;
+        self.with_session_mut(|session| {
+            let url =
+                Url::parse(&params.text_document.uri.to_string()).expect("Valid URI from LSP");
+
+            if session.close_document(&url).is_none() {
+                tracing::warn!("Attempted to close document without overlay: {}", url);
+            }
         })
         .await;
     }

     async fn completion(
         &self,
-        _params: lsp_types::CompletionParams,
+        params: lsp_types::CompletionParams,
     ) -> LspResult<Option<lsp_types::CompletionResponse>> {
-        // TODO: Handle completion after refactoring
-        Ok(None)
+        let response = self
+            .with_session(|session| {
+                let lsp_uri = &params.text_document_position.text_document.uri;
+                let url = Url::parse(&lsp_uri.to_string()).expect("Valid URI from LSP");
+                let position = params.text_document_position.position;
+
+                tracing::debug!("Completion requested for {} at {:?}", url, position);

+                // Check if we have an overlay for this document
+                if let Some(document) = session.get_overlay(&url) {
+                    tracing::debug!("Using overlay content for completion in {}", url);

+                    // Use the overlay content for completion
+                    // For now, 
we'll return None, but this is where completion logic would go + // The key point is that we're using overlay content, not disk content + let _content = document.content(); + let _version = document.version(); + + // TODO: Implement actual completion logic using overlay content + // This would involve: + // 1. Getting context around the cursor position + // 2. Analyzing the Django template or Python content + // 3. Returning appropriate completions + + None + } else { + tracing::debug!("No overlay found for {}, using disk content", url); + + // No overlay - would use disk content via the file system + // The LspFileSystem will automatically fall back to disk + // when no overlay is available + + // TODO: Implement completion using file system content + None + } + }) + .await; + + Ok(response) } async fn did_change_configuration(&self, _params: lsp_types::DidChangeConfigurationParams) { diff --git a/crates/djls-server/src/session.rs b/crates/djls-server/src/session.rs index beae4ec..6b4adbb 100644 --- a/crates/djls-server/src/session.rs +++ b/crates/djls-server/src/session.rs @@ -1,15 +1,75 @@ -use std::collections::HashMap; -use std::path::PathBuf; +//! # Salsa StorageHandle Pattern for LSP +//! +//! This module implements a thread-safe Salsa database wrapper for use with +//! tower-lsp's async runtime. The key challenge is that tower-lsp requires +//! `Send + Sync + 'static` bounds, but Salsa's `Storage` contains thread-local +//! state and is not `Send`. +//! +//! ## The Solution: StorageHandle +//! +//! Salsa provides `StorageHandle` which IS `Send + Sync` because it contains +//! no thread-local state. We store the handle and create `Storage`/`Database` +//! instances on-demand. +//! +//! ## The Mutation Challenge +//! +//! When mutating Salsa inputs (e.g., updating file revisions), Salsa must +//! ensure exclusive access to prevent race conditions. It does this via +//! `cancel_others()` which: +//! +//! 1. 
Sets a cancellation flag (causes other threads to panic with `Cancelled`) +//! 2. Waits for all `StorageHandle` clones to drop +//! 3. Proceeds with the mutation +//! +//! If we accidentally clone the handle instead of taking ownership, step 2 +//! never completes → deadlock! +//! +//! ## The Pattern +//! +//! - **Reads**: Clone the handle freely (`with_db`) +//! - **Mutations**: Take exclusive ownership (`with_db_mut` via `take_db_handle_for_mutation`) +//! +//! The explicit method names make the intent clear and prevent accidental misuse. + +use std::path::{Path, PathBuf}; use std::sync::Arc; +use dashmap::DashMap; use djls_conf::Settings; use djls_project::DjangoProject; -use djls_workspace::{FileSystem, StdFileSystem, db::Database}; +use djls_workspace::{ + db::{Database, SourceFile}, + FileSystem, OsFileSystem, TextDocument, WorkspaceFileSystem, +}; use percent_encoding::percent_decode_str; -use salsa::StorageHandle; +use salsa::{Setter, StorageHandle}; use tower_lsp_server::lsp_types; use url::Url; +/// LSP Session with thread-safe Salsa database access. +/// +/// Uses Salsa's `StorageHandle` pattern to maintain `Send + Sync + 'static` +/// compatibility required by tower-lsp. The handle can be safely shared +/// across threads and async boundaries. +/// +/// See [this Salsa Zulip discussion](https://salsa.zulipchat.com/#narrow/channel/145099-Using-Salsa/topic/.E2.9C.94.20Advice.20on.20using.20salsa.20from.20Sync.20.2B.20Send.20context/with/495497515) +/// for more information about `StorageHandle`. +/// +/// ## Architecture +/// +/// Two-layer system inspired by Ruff/Ty: +/// - **Layer 1**: In-memory overlays (LSP document edits) +/// - **Layer 2**: Salsa database (incremental computation cache) +/// +/// ## Salsa Mutation Protocol +/// +/// When mutating Salsa inputs (like changing file revisions), we must ensure +/// exclusive access to prevent race conditions. 
Salsa enforces this through +/// its `cancel_others()` mechanism, which waits for all `StorageHandle` clones +/// to drop before allowing mutations. +/// +/// We use explicit methods (`take_db_handle_for_mutation`/`restore_db_handle`) +/// to make this ownership transfer clear and prevent accidental deadlocks. pub struct Session { /// The Django project configuration project: Option, @@ -17,48 +77,82 @@ pub struct Session { /// LSP server settings settings: Settings, - /// A thread-safe Salsa database handle that can be shared between threads. + /// Layer 1: Thread-safe overlay storage (Arc>) /// - /// This implements the insight from [this Salsa Zulip discussion](https://salsa.zulipchat.com/#narrow/channel/145099-Using-Salsa/topic/.E2.9C.94.20Advice.20on.20using.20salsa.20from.20Sync.20.2B.20Send.20context/with/495497515) - /// where we're using the `StorageHandle` to create a thread-safe handle that can be - /// shared between threads. When we need to use it, we clone the handle to get a new reference. + /// This implements Ruff's two-layer architecture where Layer 1 contains + /// LSP overlays that take precedence over disk files. The overlays map + /// document URLs to TextDocuments containing current in-memory content. /// - /// This handle allows us to create database instances as needed. - /// Even though we're using a single-threaded runtime, we still need - /// this to be thread-safe because of LSP trait requirements. 
- /// - /// Usage: - /// ```rust,ignore - /// // Clone the StorageHandle for use in an async context - /// let db_handle = session.db_handle.clone(); - /// - /// // Use it in an async context - /// async_fn(move || { - /// // Get a database from the handle - /// let storage = db_handle.into_storage(); - /// let db = Database::from_storage(storage); - /// - /// // Use the database - /// db.some_query(args) - /// }); - /// ``` - db_handle: StorageHandle, + /// Key properties: + /// - Thread-safe via Arc for Send+Sync requirements + /// - Contains full TextDocument with content, version, and metadata + /// - Never becomes Salsa inputs - only intercepted at read time + overlays: Arc>, - /// File system abstraction for reading files + /// File system abstraction with overlay interception + /// + /// This LspFileSystem bridges Layer 1 (overlays) and Layer 2 (Salsa). + /// It intercepts FileSystem::read_to_string() calls to return overlay + /// content when available, falling back to disk otherwise. file_system: Arc, - /// Index of open documents with overlays (in-memory changes) - /// Maps document URL to its current content - overlays: HashMap, - - /// Tracks the session revision for change detection - revision: u64, + /// Shared file tracking across all Database instances + /// + /// This is the canonical Salsa pattern from the lazy-input example. + /// The DashMap provides O(1) lookups and is shared via Arc across + /// all Database instances created from StorageHandle. + files: Arc>, #[allow(dead_code)] client_capabilities: lsp_types::ClientCapabilities, + + /// Layer 2: Thread-safe Salsa database handle for pure computation + /// + /// where we're using the `StorageHandle` to create a thread-safe handle that can be + /// shared between threads. + /// + /// The database receives file content via the FileSystem trait, which + /// is intercepted by our LspFileSystem to provide overlay content. + /// This maintains proper separation between Layer 1 and Layer 2. 
+ db_handle: StorageHandle<Database>, } impl Session { + pub fn new(params: &lsp_types::InitializeParams) -> Self { + let project_path = Self::get_project_path(params); + + let (project, settings) = if let Some(path) = &project_path { + let settings = + djls_conf::Settings::new(path).unwrap_or_else(|_| djls_conf::Settings::default()); + + let project = Some(djls_project::DjangoProject::new(path.clone())); + + (project, settings) + } else { + (None, Settings::default()) + }; + + let overlays = Arc::new(DashMap::new()); + let files = Arc::new(DashMap::new()); + let file_system = Arc::new(WorkspaceFileSystem::new( + overlays.clone(), + Arc::new(OsFileSystem), + )); + let db_handle = Database::new(file_system.clone(), files.clone()) + .storage() + .clone() + .into_zalsa_handle(); + + Self { + project, + settings, + overlays, + file_system, + files, + client_capabilities: params.capabilities.clone(), + db_handle, + } + } /// Determines the project root path from initialization parameters. /// /// Tries the current directory first, then falls back to the first workspace folder. 
@@ -97,31 +191,6 @@ impl Session { Some(PathBuf::from(path_str)) } - pub fn new(params: &lsp_types::InitializeParams) -> Self { - let project_path = Self::get_project_path(params); - - let (project, settings) = if let Some(path) = &project_path { - let settings = - djls_conf::Settings::new(path).unwrap_or_else(|_| djls_conf::Settings::default()); - - let project = Some(djls_project::DjangoProject::new(path.clone())); - - (project, settings) - } else { - (None, Settings::default()) - }; - - Self { - client_capabilities: params.capabilities.clone(), - project, - settings, - db_handle: StorageHandle::new(None), - file_system: Arc::new(StdFileSystem), - overlays: HashMap::new(), - revision: 0, - } - } - pub fn project(&self) -> Option<&DjangoProject> { self.project.as_ref() } @@ -130,8 +199,6 @@ impl Session { &mut self.project } - - pub fn settings(&self) -> &Settings { &self.settings } @@ -144,23 +211,457 @@ impl Session { /// /// This creates a usable database from the handle, which can be used /// to query and update data. The database itself is not Send/Sync, - /// but the StorageHandle is, allowing us to work with tower-lsp. + /// but the `StorageHandle` is, allowing us to work with tower-lsp-server. + /// + /// The database will read files through the LspFileSystem, which + /// automatically returns overlay content when available. + /// + /// CRITICAL: We pass the shared files Arc to preserve file tracking + /// across Database reconstructions from StorageHandle. 
+ #[allow(dead_code)] pub fn db(&self) -> Database { let storage = self.db_handle.clone().into_storage(); - Database::from_storage(storage) + Database::from_storage(storage, self.file_system.clone(), self.files.clone()) + } + + /// Get access to the file system (for Salsa integration) + #[allow(dead_code)] + pub fn file_system(&self) -> Arc<dyn FileSystem> { + self.file_system.clone() + } + + /// Set or update an overlay for the given document URL + /// + /// This implements Layer 1 of Ruff's architecture - storing in-memory + /// document changes that take precedence over disk content. + #[allow(dead_code)] // Used in tests + pub fn set_overlay(&self, url: Url, document: TextDocument) { + self.overlays.insert(url, document); + } + + /// Remove an overlay for the given document URL + /// + /// After removal, file reads will fall back to disk content. + #[allow(dead_code)] // Used in tests + pub fn remove_overlay(&self, url: &Url) -> Option<TextDocument> { + self.overlays.remove(url).map(|(_, doc)| doc) + } + + /// Check if an overlay exists for the given URL + #[allow(dead_code)] + pub fn has_overlay(&self, url: &Url) -> bool { + self.overlays.contains_key(url) + } + + /// Get a copy of an overlay document + pub fn get_overlay(&self, url: &Url) -> Option<TextDocument> { + self.overlays.get(url).map(|doc| doc.clone()) + } + + /// Takes exclusive ownership of the database handle for mutation operations. + /// + /// This method extracts the `StorageHandle` from the session, replacing it + /// with a temporary placeholder. This ensures there's exactly one handle + /// active during mutations, preventing deadlocks in Salsa's `cancel_others()`. + /// + /// # Why Not Clone? + /// + /// Cloning would create multiple handles. When Salsa needs to mutate inputs, + /// it calls `cancel_others()` which waits for all handles to drop. With + /// multiple handles, this wait would never complete → deadlock. + /// + /// # Panics + /// + /// This is an internal method that should only be called by `with_db_mut`. 
+ /// Multiple concurrent calls would panic when trying to take an already-taken handle. + fn take_db_handle_for_mutation(&mut self) -> StorageHandle<Database> { + std::mem::replace(&mut self.db_handle, StorageHandle::new(None)) + } + + /// Restores the database handle after a mutation operation completes. + /// + /// This should be called with the handle extracted from the database + /// after mutations are complete. It updates the session's handle to + /// reflect any changes made during the mutation. + fn restore_db_handle(&mut self, handle: StorageHandle<Database>) { + self.db_handle = handle; + } + + /// Execute a closure with mutable access to the database. + /// + /// This method implements Salsa's required protocol for mutations: + /// 1. Takes exclusive ownership of the StorageHandle (no clones exist) + /// 2. Creates a temporary Database for the operation + /// 3. Executes your closure with `&mut Database` + /// 4. Extracts and restores the updated handle + /// + /// # Example + /// + /// ```rust,ignore + /// session.with_db_mut(|db| { + /// let file = db.get_or_create_file(path); + /// file.set_revision(db).to(new_revision); // Mutation requires exclusive access + /// }); + /// ``` + /// + /// # Why This Pattern? + /// + /// This ensures that when Salsa needs to modify inputs (via setters like + /// `set_revision`), it has exclusive access. The internal `cancel_others()` + /// call will succeed because we guarantee only one handle exists. 
+ pub fn with_db_mut<F, R>(&mut self, f: F) -> R + where + F: FnOnce(&mut Database) -> R, + { + let handle = self.take_db_handle_for_mutation(); + + let storage = handle.into_storage(); + let mut db = Database::from_storage(storage, self.file_system.clone(), self.files.clone()); + + let result = f(&mut db); + + // The database may have changed during mutations, so we need + // to extract its current handle state + let new_handle = db.storage().clone().into_zalsa_handle(); + self.restore_db_handle(new_handle); + + result + } + + /// Execute a closure with read-only access to the database. + /// + /// For read-only operations, we can safely clone the `StorageHandle` + /// since Salsa allows multiple concurrent readers. This is more + /// efficient than taking exclusive ownership. + /// + /// # Example + /// + /// ```rust,ignore + /// let content = session.with_db(|db| { + /// let file = db.get_file(path)?; + /// source_text(db, file).to_string() // Read-only query + /// }); + /// ``` + pub fn with_db<F, R>(&self, f: F) -> R + where + F: FnOnce(&Database) -> R, + { + // For reads, cloning is safe and efficient + let storage = self.db_handle.clone().into_storage(); + let db = Database::from_storage(storage, self.file_system.clone(), self.files.clone()); + f(&db) + } + + /// Convert a URL to a PathBuf for file operations. + /// + /// This is needed to convert between LSP URLs and file paths for + /// SourceFile creation and tracking. 
+ pub fn url_to_path(&self, url: &Url) -> Option<PathBuf> { + // Only handle file:// URLs + if url.scheme() != "file" { + return None; + } + + // Decode and convert to PathBuf + let path = percent_decode_str(url.path()).decode_utf8().ok()?; + + #[cfg(windows)] + let path = path.strip_prefix('/').unwrap_or(&path); + + Some(PathBuf::from(path.as_ref())) + } + + // ===== Document Lifecycle Management ===== + // These methods encapsulate the two-layer architecture coordination: + // Layer 1 (overlays) and Layer 2 (Salsa revision tracking) + + /// Handle opening a document - sets overlay and creates file. + /// + /// This method coordinates both layers: + /// - Layer 1: Stores the document content in overlays + /// - Layer 2: Creates the SourceFile in Salsa (if path is resolvable) + pub fn open_document(&mut self, url: Url, document: TextDocument) { + tracing::debug!("Opening document: {}", url); + + // Layer 1: Set overlay + self.overlays.insert(url.clone(), document); + + // Layer 2: Create file if needed (starts at revision 0) + if let Some(path) = self.url_to_path(&url) { + self.with_db_mut(|db| { + let file = db.get_or_create_file(path.clone()); + tracing::debug!( + "Created/retrieved SourceFile for {}: revision {}", + path.display(), + file.revision(db) + ); + }); + } + } + + /// Handle document changes - updates overlay and bumps revision. 
+ /// + /// This method coordinates both layers: + /// - Layer 1: Updates the document content in overlays + /// - Layer 2: Bumps the file revision to trigger Salsa invalidation + pub fn update_document(&mut self, url: Url, document: TextDocument) { + let version = document.version(); + tracing::debug!("Updating document: {} (version {})", url, version); + + // Layer 1: Update overlay + self.overlays.insert(url.clone(), document); + + // Layer 2: Bump revision to trigger invalidation + if let Some(path) = self.url_to_path(&url) { + self.notify_file_changed(path); + } + } + + /// Handle closing a document - removes overlay and bumps revision. + /// + /// This method coordinates both layers: + /// - Layer 1: Removes the overlay (falls back to disk) + /// - Layer 2: Bumps revision to trigger re-read from disk + /// + /// Returns the removed document if it existed. + pub fn close_document(&mut self, url: &Url) -> Option<TextDocument> { + tracing::debug!("Closing document: {}", url); + + // Layer 1: Remove overlay + let removed = self.overlays.remove(url).map(|(_, doc)| { + tracing::debug!( + "Removed overlay for closed document: {} (was version {})", + url, + doc.version() + ); + doc + }); + + // Layer 2: Bump revision to trigger re-read from disk + // We keep the file alive for potential re-opening + if let Some(path) = self.url_to_path(url) { + self.notify_file_changed(path); + } + + removed + } + + /// Internal: Notify that a file's content has changed. + /// + /// This bumps the file's revision number in Salsa, which triggers + /// invalidation of any queries that depend on the file's content. 
+ fn notify_file_changed(&mut self, path: PathBuf) { + self.with_db_mut(|db| { + // Only bump revision if file is already being tracked + // We don't create files just for notifications + if db.has_file(&path) { + let file = db.get_or_create_file(path.clone()); + let current_rev = file.revision(db); + let new_rev = current_rev + 1; + file.set_revision(db).to(new_rev); + tracing::debug!( + "Bumped revision for {}: {} -> {}", + path.display(), + current_rev, + new_rev + ); + } else { + tracing::debug!( + "File {} not tracked, skipping revision bump", + path.display() + ); + } + }); + } + + // ===== Safe Query API ===== + // These methods encapsulate all Salsa interactions, preventing the + // "mixed database instance" bug by never exposing SourceFile or Database. + + /// Get the current content of a file (from overlay or disk). + /// + /// This is the safe way to read file content through the system. + /// The file is created if it doesn't exist, and content is read + /// through the FileSystem abstraction (overlay first, then disk). + pub fn file_content(&mut self, path: PathBuf) -> String { + use djls_workspace::db::source_text; + + self.with_db_mut(|db| { + let file = db.get_or_create_file(path); + source_text(db, file).to_string() + }) + } + + /// Get the current revision of a file, if it's being tracked. + /// + /// Returns None if the file hasn't been created yet. + pub fn file_revision(&mut self, path: &Path) -> Option<u64> { + self.with_db_mut(|db| { + db.has_file(path).then(|| { + let file = db.get_or_create_file(path.to_path_buf()); + file.revision(db) + }) + }) + } + + /// Check if a file is currently being tracked in Salsa. 
+ pub fn has_file(&mut self, path: &Path) -> bool { + self.with_db(|db| db.has_file(path)) } } impl Default for Session { fn default() -> Self { + let overlays = Arc::new(DashMap::new()); + let files = Arc::new(DashMap::new()); + let file_system = Arc::new(WorkspaceFileSystem::new( + overlays.clone(), + Arc::new(OsFileSystem), + )); + let db_handle = Database::new(file_system.clone(), files.clone()) + .storage() + .clone() + .into_zalsa_handle(); + Self { project: None, settings: Settings::default(), - db_handle: StorageHandle::new(None), - file_system: Arc::new(StdFileSystem), - overlays: HashMap::new(), - revision: 0, + db_handle, + file_system, + files, + overlays, client_capabilities: lsp_types::ClientCapabilities::default(), } } } + +#[cfg(test)] +mod tests { + use super::*; + use djls_workspace::LanguageId; + + #[test] + fn test_session_overlay_management() { + let session = Session::default(); + + let url = Url::parse("file:///test/file.py").unwrap(); + let document = TextDocument::new("print('hello')".to_string(), 1, LanguageId::Python); + + // Initially no overlay + assert!(!session.has_overlay(&url)); + assert!(session.get_overlay(&url).is_none()); + + // Set overlay + session.set_overlay(url.clone(), document.clone()); + assert!(session.has_overlay(&url)); + + let retrieved = session.get_overlay(&url).unwrap(); + assert_eq!(retrieved.content(), document.content()); + assert_eq!(retrieved.version(), document.version()); + + // Remove overlay + let removed = session.remove_overlay(&url).unwrap(); + assert_eq!(removed.content(), document.content()); + assert!(!session.has_overlay(&url)); + } + + #[test] + fn test_session_two_layer_architecture() { + let session = Session::default(); + + // Verify we have both layers + let _filesystem = session.file_system(); // Layer 2: FileSystem bridge + let _db = session.db(); // Layer 2: Salsa database + + // Verify overlay operations work (Layer 1) + let url = Url::parse("file:///test/integration.py").unwrap(); + let 
document = TextDocument::new("# Layer 1 content".to_string(), 1, LanguageId::Python); + + session.set_overlay(url.clone(), document); + assert!(session.has_overlay(&url)); + + // FileSystem should now return overlay content through LspFileSystem + // (This would be tested more thoroughly in integration tests) + } + + #[test] + fn test_revision_invalidation_chain() { + use std::path::PathBuf; + + let mut session = Session::default(); + + // Create a test file path + let path = PathBuf::from("/test/template.html"); + let url = Url::parse("file:///test/template.html").unwrap(); + + // Open document with initial content + println!("**[test]** open document with initial content"); + let document = TextDocument::new( + "<html>\n<body>\nOriginal Content\n</body>\n</html>".to_string(), + 1, + LanguageId::Other, + ); + session.open_document(url.clone(), document); + + // Try to read content - this might be where it hangs + println!("**[test]** try to read content - this might be where it hangs"); + let content1 = session.file_content(path.clone()); + assert_eq!(content1, "<html>\n<body>\nOriginal Content\n</body>\n</html>"); + + // Update document with new content + println!("**[test]** Update document with new content"); + let updated_document = + TextDocument::new("<html>\n<body>\nUpdated Content\n</body>\n</html>".to_string(), 2, LanguageId::Other); + session.update_document(url.clone(), updated_document); + + // Read content again (should get new overlay content due to invalidation) + println!( + "**[test]** Read content again (should get new overlay content due to invalidation)" + ); + let content2 = session.file_content(path.clone()); + assert_eq!(content2, "<html>\n<body>\nUpdated Content\n</body>\n</html>
"); + assert_ne!(content1, content2); + + // Close document (removes overlay, bumps revision) + println!("**[test]** Close document (removes overlay, bumps revision)"); + session.close_document(&url); + + // Read content again (should now read from disk, which returns empty for missing files) + println!( + "**[test]** Read content again (should now read from disk, which returns empty for missing files)" + ); + let content3 = session.file_content(path.clone()); + assert_eq!(content3, ""); // No file on disk, returns empty + } + + #[test] + fn test_with_db_mut_preserves_files() { + use std::path::PathBuf; + + let mut session = Session::default(); + + // Create multiple files + let path1 = PathBuf::from("/test/file1.py"); + let path2 = PathBuf::from("/test/file2.py"); + + // Create files through safe API + session.file_content(path1.clone()); // Creates file1 + session.file_content(path2.clone()); // Creates file2 + + // Verify files are preserved across operations + assert!(session.has_file(&path1)); + assert!(session.has_file(&path2)); + + // Files should persist even after multiple operations + let content1 = session.file_content(path1.clone()); + let content2 = session.file_content(path2.clone()); + + // Both should return empty (no disk content) + assert_eq!(content1, ""); + assert_eq!(content2, ""); + + // One more verification + assert!(session.has_file(&path1)); + assert!(session.has_file(&path2)); + } +} diff --git a/crates/djls-workspace/src/bridge.rs b/crates/djls-workspace/src/bridge.rs deleted file mode 100644 index 2da695e..0000000 --- a/crates/djls-workspace/src/bridge.rs +++ /dev/null @@ -1,126 +0,0 @@ -//! Bridge between VFS snapshots and Salsa inputs. -//! -//! The bridge module isolates Salsa input mutation behind a single, idempotent API. -//! It ensures we only touch Salsa when content or classification changes, maximizing -//! incremental performance. 
- -use std::collections::HashMap; -use std::sync::Arc; - -use salsa::Setter; - -use super::db::parse_template; -use super::db::template_errors; -use super::db::Database; -use super::db::SourceFile; -use super::db::TemplateAst; -use super::db::TemplateLoaderOrder; -use super::FileId; -use super::FileKind; - -/// Owner of the Salsa [`Database`] plus the handles for updating inputs. -/// -/// [`FileStore`] serves as the bridge between the VFS (with [`FileId`]s) and Salsa (with entities). -/// It maintains a mapping from [`FileId`]s to [`SourceFile`] entities and manages the global -/// [`TemplateLoaderOrder`] input. The [`FileStore`] ensures that Salsa inputs are only mutated -/// when actual changes occur, preserving incremental computation efficiency. -pub struct FileStore { - /// The Salsa DB instance - pub db: Database, - /// Map from [`FileId`] to its Salsa input entity - files: HashMap<FileId, SourceFile>, - /// Handle to the global template loader configuration input - template_loader: Option<TemplateLoaderOrder>, -} - -impl FileStore { - /// Construct an empty store and DB. - #[must_use] - pub fn new() -> Self { - Self { - db: Database::new(), - files: HashMap::new(), - template_loader: None, - } - } - - /// Create or update the global template loader order input. - /// - /// Sets the ordered list of template root directories that Django will search - /// when resolving template names. If the input already exists, it updates the - /// existing value; otherwise, it creates a new [`TemplateLoaderOrder`] input. 
- pub fn set_template_loader_order(&mut self, ordered_roots: Vec<String>) { - let roots = Arc::from(ordered_roots.into_boxed_slice()); - if let Some(tl) = self.template_loader { - tl.set_roots(&mut self.db).to(roots); - } else { - self.template_loader = Some(TemplateLoaderOrder::new(&self.db, roots)); - } - } - - // TODO: This will be replaced with direct file management - // pub(crate) fn apply_vfs_snapshot(&mut self, snap: &VfsSnapshot) { - // for (id, rec) in &snap.files { - // let new_text = snap.get_text(*id).unwrap_or_else(|| Arc::<str>::from("")); - // let new_kind = rec.meta.kind; - // if let Some(sf) = self.files.get(id) { - // // Update if changed — avoid touching Salsa when not needed - // if sf.kind(&self.db) != new_kind { - // sf.set_kind(&mut self.db).to(new_kind); - // } - // if sf.text(&self.db).as_ref() != &*new_text { - // sf.set_text(&mut self.db).to(new_text.clone()); - // } - // } else { - // let sf = SourceFile::new(&self.db, new_kind, new_text); - // self.files.insert(*id, sf); - // } - // } - // } - - /// Get the text content of a file by its [`FileId`]. - /// - /// Returns `None` if the file is not tracked in the [`FileStore`]. - pub(crate) fn file_text(&self, id: FileId) -> Option<Arc<str>> { - self.files.get(&id).map(|sf| sf.text(&self.db).clone()) - } - - /// Get the file kind classification by its [`FileId`]. - /// - /// Returns `None` if the file is not tracked in the [`FileStore`]. - pub(crate) fn file_kind(&self, id: FileId) -> Option<FileKind> { - self.files.get(&id).map(|sf| sf.kind(&self.db)) - } - - /// Get the parsed template AST for a file by its [`FileId`]. - /// - /// This method leverages Salsa's incremental computation to cache parsed ASTs. - /// The AST is only re-parsed when the file's content changes in the VFS. - /// Returns `None` if the file is not tracked or is not a template file. 
- pub(crate) fn get_template_ast(&self, id: FileId) -> Option<Arc<TemplateAst>> { - let source_file = self.files.get(&id)?; - parse_template(&self.db, *source_file) - } - - /// Get template parsing errors for a file by its [`FileId`]. - /// - /// This method provides quick access to template errors without needing the full AST. - /// Useful for diagnostics and error reporting. Returns an empty slice for - /// non-template files or files not tracked in the store. - pub(crate) fn get_template_errors(&self, id: FileId) -> Arc<[String]> { - self.files - .get(&id) - .map_or_else(|| Arc::from(vec![]), |sf| template_errors(&self.db, *sf)) - } -} - -impl Default for FileStore { - fn default() -> Self { - Self::new() - } -} - -// TODO: Re-enable tests after VFS removal is complete -// #[cfg(test)] -// mod tests { diff --git a/crates/djls-workspace/src/db.rs b/crates/djls-workspace/src/db.rs index c542ea4..0105616 100644 --- a/crates/djls-workspace/src/db.rs +++ b/crates/djls-workspace/src/db.rs @@ -1,39 +1,100 @@ //! Salsa database and input entities for workspace. //! -//! This module defines the Salsa world—what can be set and tracked incrementally. -//! Inputs are kept minimal to avoid unnecessary recomputation. +//! This module implements a two-layer architecture inspired by Ruff's design pattern +//! for efficient LSP document management with Salsa incremental computation. +//! +//! # Two-Layer Architecture +//! +//! ## Layer 1: LSP Document Management (in Session) +//! - Stores overlays in `Session` using `Arc<DashMap<Url, TextDocument>>` +//! - TextDocument contains actual content, version, language_id +//! - Changes are immediate, no Salsa invalidation on every keystroke +//! - Thread-safe via DashMap for tower-lsp's Send+Sync requirements +//! +//! ## Layer 2: Salsa Incremental Computation (in Database) +//! - Database is pure Salsa, no file content storage +//! - Files tracked via `Arc<DashMap<PathBuf, SourceFile>>` for O(1) lookups +//! - SourceFile inputs only have path and revision (no text) +//! 
- Content read lazily through FileSystem trait +//! - LspFileSystem intercepts reads, returns overlay or disk content +//! +//! # Critical Implementation Details +//! +//! ## The Revision Dependency Trick +//! The `source_text` tracked function MUST call `file.revision(db)` to create +//! the Salsa dependency chain. Without this, revision changes won't trigger +//! invalidation of dependent queries. +//! +//! ## StorageHandle Pattern (for tower-lsp) +//! - Database itself is NOT Send+Sync (due to RefCell in Salsa's Storage) +//! - StorageHandle IS Send+Sync, enabling use across threads +//! - Session stores StorageHandle, creates Database instances on-demand +//! +//! ## Why Files are in Database, Overlays in Session +//! - Files need persistent tracking across all queries (thus in Database) +//! - Overlays are LSP-specific and change frequently (thus in Session) +//! - This separation prevents Salsa invalidation cascades on every keystroke +//! - Both are accessed via Arc for thread safety and cheap cloning +//! +//! # Data Flow +//! +//! 1. **did_open/did_change** → Update overlays in Session +//! 2. **notify_file_changed()** → Bump revision, tell Salsa something changed +//! 3. **Salsa query executes** → Calls source_text() +//! 4. **source_text() calls file.revision(db)** → Creates dependency +//! 5. **source_text() calls db.read_file_content()** → Goes through FileSystem +//! 6. **LspFileSystem intercepts** → Returns overlay if exists, else disk +//! 7. **Query gets content** → Without knowing about LSP/overlays +//! +//! This design achieves: +//! - Fast overlay updates (no Salsa invalidation) +//! - Proper incremental computation (via revision tracking) +//! - Thread safety (via Arc and StorageHandle) +//! 
- Clean separation of concerns (LSP vs computation) +use std::path::{Path, PathBuf}; use std::sync::Arc; -use std::sync::atomic::{AtomicU32, Ordering}; #[cfg(test)] use std::sync::Mutex; use dashmap::DashMap; -use url::Url; -use crate::{FileId, FileKind}; +use crate::{FileKind, FileSystem}; + +/// Database trait that provides file system access for Salsa queries +#[salsa::db] +pub trait Db: salsa::Database { + /// Get the file system for reading files (with overlay support) + fn fs(&self) -> Option<Arc<dyn FileSystem>>; + + /// Read file content through the file system + /// This is the primary way Salsa queries should read files, as it + /// automatically checks overlays before falling back to disk. + fn read_file_content(&self, path: &Path) -> std::io::Result<String>; +} /// Salsa database root for workspace /// /// The [`Database`] provides default storage and, in tests, captures Salsa events for /// reuse/diagnostics. It serves as the core incremental computation engine, tracking /// dependencies and invalidations across all inputs and derived queries. -/// -/// This database also manages the file system overlay for the workspace, -/// mapping URLs to FileIds and storing file content. +/// +/// The database integrates with the FileSystem abstraction to read files through +/// the LspFileSystem, which automatically checks overlays before falling back to disk. #[salsa::db] #[derive(Clone)] pub struct Database { storage: salsa::Storage<Self>, - - /// Map from file URL to FileId (thread-safe) - files: DashMap<Url, FileId>, - - /// Map from FileId to file content (thread-safe) - content: DashMap<FileId, Arc<str>>, - - /// Next FileId to allocate (thread-safe counter) - next_file_id: Arc<AtomicU32>, + + /// FileSystem integration for reading files (with overlay support) + /// This allows the database to read files through LspFileSystem, which + /// automatically checks for overlays before falling back to disk files. + fs: Option<Arc<dyn FileSystem>>, + + /// File tracking outside of Salsa but within Database (Arc for cheap cloning). 
+ /// This follows Ruff's pattern where files are tracked in the Database struct + /// but not as part of Salsa's storage, enabling cheap clones via Arc. + files: Arc<DashMap<PathBuf, SourceFile>>, // The logs are only used for testing and demonstrating reuse: #[cfg(test)] @@ -58,101 +119,148 @@ impl Default for Database { } } }))), - files: DashMap::new(), - content: DashMap::new(), - next_file_id: Arc::new(AtomicU32::new(0)), + fs: None, + files: Arc::new(DashMap::new()), logs, } } } impl Database { - /// Create a new database instance - pub fn new() -> Self { + /// Create a new database with fresh storage. + pub fn new(file_system: Arc<dyn FileSystem>, files: Arc<DashMap<PathBuf, SourceFile>>) -> Self { Self { storage: salsa::Storage::new(None), - files: DashMap::new(), - content: DashMap::new(), - next_file_id: Arc::new(AtomicU32::new(0)), + fs: Some(file_system), + files, #[cfg(test)] logs: Arc::new(Mutex::new(None)), } } - - /// Create a new database instance from a storage handle. - /// This is used by Session::db() to create databases from the StorageHandle. - pub fn from_storage(storage: salsa::Storage<Self>) -> Self { + + /// Create a database instance from an existing storage. + /// This preserves both the file system and files Arc across database operations. + pub fn from_storage( + storage: salsa::Storage<Self>, + file_system: Arc<dyn FileSystem>, + files: Arc<DashMap<PathBuf, SourceFile>>, + ) -> Self { Self { storage, - files: DashMap::new(), - content: DashMap::new(), - next_file_id: Arc::new(AtomicU32::new(0)), + fs: Some(file_system), + files, #[cfg(test)] logs: Arc::new(Mutex::new(None)), } } - - /// Add or update a file in the workspace - pub fn set_file(&mut self, url: Url, content: String, _kind: FileKind) { - let file_id = if let Some(existing_id) = self.files.get(&url) { - *existing_id + + /// Read file content through the file system + /// This is the primary way Salsa queries should read files, as it + /// automatically checks overlays before falling back to disk. 
+ pub fn read_file_content(&self, path: &Path) -> std::io::Result<String> { + if let Some(fs) = &self.fs { + fs.read_to_string(path) } else { - let new_id = FileId::from_raw(self.next_file_id.fetch_add(1, Ordering::SeqCst)); - self.files.insert(url.clone(), new_id); - new_id - }; - - let content = Arc::<str>::from(content); - self.content.insert(file_id, content.clone()); - - // TODO: Update Salsa inputs here when we connect them - } - - /// Remove a file from the workspace - pub fn remove_file(&mut self, url: &Url) { - if let Some((_, file_id)) = self.files.remove(url) { - self.content.remove(&file_id); - // TODO: Remove from Salsa when we connect inputs + std::fs::read_to_string(path) } } - - /// Get the content of a file by URL - pub fn get_file_content(&self, url: &Url) -> Option<Arc<str>> { - let file_id = self.files.get(url)?; - self.content.get(&*file_id).map(|content| content.clone()) + + /// Get or create a SourceFile for the given path. + /// + /// This method implements Ruff's pattern for lazy file creation. Files are created + /// with an initial revision of 0 and tracked in the Database's DashMap. The Arc + /// ensures cheap cloning while maintaining thread safety. + pub fn get_or_create_file(&mut self, path: PathBuf) -> SourceFile { + if let Some(file_ref) = self.files.get(&path) { + // Copy the value (SourceFile is Copy) and drop the guard immediately + let file = *file_ref; + drop(file_ref); // Explicitly drop the guard to release the lock + return file; + } + + // File doesn't exist, so we need to create it + let kind = FileKind::from_path(&path); + let file = SourceFile::new(self, kind, Arc::from(path.to_string_lossy().as_ref()), 0); + + self.files.insert(path.clone(), file); + file } - - /// Get the content of a file by FileId - pub(crate) fn get_content_by_id(&self, file_id: FileId) -> Option<Arc<str>> { - self.content.get(&file_id).map(|content| content.clone()) + + /// Check if a file is being tracked without creating it. 
+ /// + /// This is primarily used for testing to verify that files have been + /// created without affecting the database state. + pub fn has_file(&self, path: &Path) -> bool { + self.files.contains_key(path) } - - /// Check if a file exists in the workspace - pub fn has_file(&self, url: &Url) -> bool { - self.files.contains_key(url) + + /// Get a reference to the storage for handle extraction. + /// + /// This is used by Session to extract the StorageHandle after mutations. + pub fn storage(&self) -> &salsa::Storage<Self> { + &self.storage } - - /// Get all file URLs in the workspace - pub fn files(&self) -> impl Iterator<Item = Url> + use<'_> { - self.files.iter().map(|entry| entry.key().clone()) + + /// Consume the database and return its storage. + /// + /// This is used when you need to take ownership of the storage. + pub fn into_storage(self) -> salsa::Storage<Self> { + self.storage + } } #[salsa::db] impl salsa::Database for Database {} -/// Represents a single file's classification and current content. +#[salsa::db] +impl Db for Database { + fn fs(&self) -> Option<Arc<dyn FileSystem>> { + self.fs.clone() + } + + fn read_file_content(&self, path: &Path) -> std::io::Result<String> { + match &self.fs { + Some(fs) => fs.read_to_string(path), + None => std::fs::read_to_string(path), // Fallback to direct disk access + } + } +} + +/// Represents a single file without storing its content. /// -/// [`SourceFile`] is a Salsa input entity that tracks both the file's type (for routing -/// to appropriate analyzers) and its current text content. The text is stored as -/// `Arc<str>` for efficient sharing across the incremental computation graph. +/// [`SourceFile`] is a Salsa input entity that tracks a file's path, revision, and +/// classification for analysis routing. Following Ruff's pattern, content is NOT +/// stored here but read on-demand through the `source_text` tracked function. 
#[salsa::input] pub struct SourceFile { /// The file's classification for analysis routing pub kind: FileKind, - /// The current text content of the file + /// The file path #[returns(ref)] - pub text: Arc, + pub path: Arc, + /// The revision number for invalidation tracking + pub revision: u64, +} + +/// Read file content through the FileSystem, creating proper Salsa dependencies. +/// +/// This is the CRITICAL function that implements Ruff's two-layer architecture. +/// The call to `file.revision(db)` creates a Salsa dependency, ensuring that +/// when the revision changes, this function (and all dependent queries) are +/// invalidated and re-executed. +#[salsa::tracked] +pub fn source_text(db: &dyn Db, file: SourceFile) -> Arc { + // This line creates the Salsa dependency on revision! Without this call, + // revision changes won't trigger invalidation + let _ = file.revision(db); + + let path = Path::new(file.path(db).as_ref()); + match db.read_file_content(path) { + Ok(content) => Arc::from(content), + Err(_) => { + Arc::from("") // Return empty string for missing files + } + } } /// Global input configuring ordered template loader roots. @@ -167,6 +275,18 @@ pub struct TemplateLoaderOrder { pub roots: Arc<[String]>, } +/// Represents a file path for Salsa tracking. +/// +/// [`FilePath`] is a Salsa input entity that tracks a file path for use in +/// path-based queries. This allows Salsa to properly track dependencies +/// on files identified by path rather than by SourceFile input. +#[salsa::input] +pub struct FilePath { + /// The file path as a string + #[returns(ref)] + pub path: Arc, +} + /// Container for a parsed Django template AST. /// /// [`TemplateAst`] wraps the parsed AST from djls-templates along with any parsing errors. @@ -183,18 +303,18 @@ pub struct TemplateAst { /// Parse a Django template file into an AST. /// /// This Salsa tracked function parses template files on-demand and caches the results. 
-/// The parse is only re-executed when the file's text content changes, enabling -/// efficient incremental template analysis. +/// The parse is only re-executed when the file's content changes (detected via content changes). /// /// Returns `None` for non-template files. #[salsa::tracked] -pub fn parse_template(db: &dyn salsa::Database, file: SourceFile) -> Option> { +pub fn parse_template(db: &dyn Db, file: SourceFile) -> Option> { // Only parse template files if file.kind(db) != FileKind::Template { return None; } - let text = file.text(db); + let text_arc = source_text(db, file); + let text = text_arc.as_ref(); // Call the pure parsing function from djls-templates match djls_templates::parse_template(text) { @@ -216,6 +336,54 @@ pub fn parse_template(db: &dyn salsa::Database, file: SourceFile) -> Option Option> { + // Read file content through the FileSystem (checks overlays first) + let path = Path::new(file_path.path(db).as_ref()); + let Ok(text) = db.read_file_content(path) else { + return None; + }; + + // Call the pure parsing function from djls-templates + match djls_templates::parse_template(&text) { + Ok((ast, errors)) => { + // Convert errors to strings + let error_strings = errors.into_iter().map(|e| e.to_string()).collect(); + Some(Arc::new(TemplateAst { + ast, + errors: error_strings, + })) + } + Err(err) => { + // Even on fatal errors, return an empty AST with the error + Some(Arc::new(TemplateAst { + ast: djls_templates::Ast::default(), + errors: vec![err.to_string()], + })) + } + } +} + +/// Get template parsing errors for a file by path. +/// +/// This Salsa tracked function extracts just the errors from the parsed template, +/// useful for diagnostics without needing the full AST. +/// +/// Reads files through the FileSystem for overlay support. +/// +/// Returns an empty vector for non-template files. 
+#[salsa::tracked] +pub fn template_errors_by_path(db: &dyn Db, file_path: FilePath) -> Arc<[String]> { + parse_template_by_path(db, file_path) + .map_or_else(|| Arc::from(vec![]), |ast| Arc::from(ast.errors.clone())) +} + /// Get template parsing errors for a file. /// /// This Salsa tracked function extracts just the errors from the parsed template, @@ -223,91 +391,6 @@ pub fn parse_template(db: &dyn salsa::Database, file: SourceFile) -> Option Arc<[String]> { +pub fn template_errors(db: &dyn Db, file: SourceFile) -> Arc<[String]> { parse_template(db, file).map_or_else(|| Arc::from(vec![]), |ast| Arc::from(ast.errors.clone())) } - -#[cfg(test)] -mod tests { - use salsa::Setter; - - use super::*; - - #[test] - fn test_template_parsing_caches_result() { - let db = Database::default(); - - // Create a template file - let template_content: Arc = Arc::from("{% if user %}Hello {{ user.name }}{% endif %}"); - let file = SourceFile::new(&db, FileKind::Template, template_content.clone()); - - // First parse - should execute the parsing - let ast1 = parse_template(&db, file); - assert!(ast1.is_some()); - - // Second parse - should return cached result (same Arc) - let ast2 = parse_template(&db, file); - assert!(ast2.is_some()); - - // Verify they're the same Arc (cached) - assert!(Arc::ptr_eq(&ast1.unwrap(), &ast2.unwrap())); - } - - #[test] - fn test_template_parsing_invalidates_on_change() { - let mut db = Database::default(); - - // Create a template file - let template_content1: Arc = Arc::from("{% if user %}Hello{% endif %}"); - let file = SourceFile::new(&db, FileKind::Template, template_content1); - - // First parse - let ast1 = parse_template(&db, file); - assert!(ast1.is_some()); - - // Change the content - let template_content2: Arc = - Arc::from("{% for item in items %}{{ item }}{% endfor %}"); - file.set_text(&mut db).to(template_content2); - - // Parse again - should re-execute due to changed content - let ast2 = parse_template(&db, file); - 
assert!(ast2.is_some()); - - // Verify they're different Arcs (re-parsed) - assert!(!Arc::ptr_eq(&ast1.unwrap(), &ast2.unwrap())); - } - - #[test] - fn test_non_template_files_return_none() { - let db = Database::default(); - - // Create a Python file - let python_content: Arc = Arc::from("def hello():\n print('Hello')"); - let file = SourceFile::new(&db, FileKind::Python, python_content); - - // Should return None for non-template files - let ast = parse_template(&db, file); - assert!(ast.is_none()); - - // Errors should be empty for non-template files - let errors = template_errors(&db, file); - assert!(errors.is_empty()); - } - - #[test] - fn test_template_errors_tracked_separately() { - let db = Database::default(); - - // Create a template with an error (unclosed tag) - let template_content: Arc = Arc::from("{% if user %}Hello {{ user.name }"); - let file = SourceFile::new(&db, FileKind::Template, template_content); - - // Get errors - let errors1 = template_errors(&db, file); - let errors2 = template_errors(&db, file); - - // Should be cached (same Arc) - assert!(Arc::ptr_eq(&errors1, &errors2)); - } -} diff --git a/crates/djls-workspace/src/document.rs b/crates/djls-workspace/src/document.rs new file mode 100644 index 0000000..c1447af --- /dev/null +++ b/crates/djls-workspace/src/document.rs @@ -0,0 +1,216 @@ +use crate::language::LanguageId; +use crate::template::ClosingBrace; +use crate::template::TemplateTagContext; +use tower_lsp_server::lsp_types::Position; +use tower_lsp_server::lsp_types::Range; + +#[derive(Clone, Debug)] +pub struct TextDocument { + /// The document's content + content: String, + /// The version number of this document (from LSP) + version: i32, + /// The language identifier (python, htmldjango, etc.) 
+    language_id: LanguageId,
+    /// Line index for efficient position lookups
+    line_index: LineIndex,
+}
+
+impl TextDocument {
+    /// Create a new TextDocument with the given content
+    pub fn new(content: String, version: i32, language_id: LanguageId) -> Self {
+        let line_index = LineIndex::new(&content);
+        Self {
+            content,
+            version,
+            language_id,
+            line_index,
+        }
+    }
+
+    /// Get the document's content
+    pub fn content(&self) -> &str {
+        &self.content
+    }
+
+    /// Get the version number
+    pub fn version(&self) -> i32 {
+        self.version
+    }
+
+    /// Get the language identifier
+    pub fn language_id(&self) -> LanguageId {
+        self.language_id.clone()
+    }
+
+    pub fn line_index(&self) -> &LineIndex {
+        &self.line_index
+    }
+
+    pub fn get_line(&self, line: u32) -> Option<String> {
+        let line_start = *self.line_index.line_starts.get(line as usize)?;
+        let line_end = self
+            .line_index
+            .line_starts
+            .get(line as usize + 1)
+            .copied()
+            .unwrap_or(self.line_index.length);
+
+        Some(self.content[line_start as usize..line_end as usize].to_string())
+    }
+
+    pub fn get_text_range(&self, range: Range) -> Option<String> {
+        let start_offset = self.line_index.offset(range.start)? as usize;
+        let end_offset = self.line_index.offset(range.end)? as usize;
+
+        Some(self.content[start_offset..end_offset].to_string())
+    }
+
+    /// Update the document content with LSP text changes
+    pub fn update(
+        &mut self,
+        changes: Vec<tower_lsp_server::lsp_types::TextDocumentContentChangeEvent>,
+        version: i32,
+    ) {
+        // For now, we'll just handle full document updates
+        // TODO: Handle incremental updates
+        for change in changes {
+            // TextDocumentContentChangeEvent has a `text` field that's a String, not Option<String>
+            self.content = change.text;
+            self.line_index = LineIndex::new(&self.content);
+        }
+        self.version = version;
+    }
+
+    pub fn get_template_tag_context(&self, position: Position) -> Option<TemplateTagContext> {
+        let start = self.line_index.line_starts.get(position.line as usize)?;
+        let end = self
+            .line_index
+            .line_starts
+            .get(position.line as usize + 1)
+            .copied()
+            .unwrap_or(self.line_index.length);
+
+        let line = &self.content[*start as usize..end as usize];
+        let char_pos: usize = position.character.try_into().ok()?;
+        let prefix = &line[..char_pos];
+        let rest_of_line = &line[char_pos..];
+        let rest_trimmed = rest_of_line.trim_start();
+
+        prefix.rfind("{%").map(|tag_start| {
+            // Check if we're immediately after {% with no space
+            let needs_leading_space = prefix.ends_with("{%");
+
+            let closing_brace = if rest_trimmed.starts_with("%}") {
+                ClosingBrace::FullClose
+            } else if rest_trimmed.starts_with('}') {
+                ClosingBrace::PartialClose
+            } else {
+                ClosingBrace::None
+            };
+
+            TemplateTagContext {
+                partial_tag: prefix[tag_start + 2..].trim().to_string(),
+                needs_leading_space,
+                closing_brace,
+            }
+        })
+    }
+
+    pub fn position_to_offset(&self, position: Position) -> Option<u32> {
+        self.line_index.offset(position)
+    }
+
+    pub fn offset_to_position(&self, offset: u32) -> Position {
+        self.line_index.position(offset)
+    }
+}
+
+#[derive(Clone, Debug)]
+pub struct LineIndex {
+    pub line_starts: Vec<u32>,
+    pub line_starts_utf16: Vec<u32>,
+    pub length: u32,
+    pub length_utf16: u32,
+}
+
+impl LineIndex {
+    #[must_use]
+    pub fn new(text: &str) -> Self {
+        let mut line_starts = vec![0];
+        let mut line_starts_utf16 = vec![0];
+        let mut pos_utf8 = 0;
+        let mut pos_utf16 = 0;
+
+        for c in text.chars() {
+            pos_utf8 += u32::try_from(c.len_utf8()).unwrap_or(0);
+            pos_utf16 += u32::try_from(c.len_utf16()).unwrap_or(0);
+            if c == '\n' {
+                line_starts.push(pos_utf8);
+                line_starts_utf16.push(pos_utf16);
+            }
+        }
+
+        Self {
+            line_starts,
+            line_starts_utf16,
+            length: pos_utf8,
+            length_utf16: pos_utf16,
+        }
+    }
+
+    #[must_use]
+    pub fn offset(&self, position: Position) -> Option<u32> {
+        let line_start = self.line_starts.get(position.line as usize)?;
+
+        Some(line_start + position.character)
+    }
+
+    /// Convert UTF-16 LSP position to UTF-8 byte offset
+    pub fn offset_utf16(&self, position: Position, text: &str) -> Option<u32> {
+        let line_start_utf8 = self.line_starts.get(position.line as usize)?;
+        let _line_start_utf16 = self.line_starts_utf16.get(position.line as usize)?;
+
+        // If position is at start of line, return UTF-8 line start
+        if position.character == 0 {
+            return Some(*line_start_utf8);
+        }
+
+        // Find the line text
+        let next_line_start = self
+            .line_starts
+            .get(position.line as usize + 1)
+            .copied()
+            .unwrap_or(self.length);
+
+        let line_text = text.get(*line_start_utf8 as usize..next_line_start as usize)?;
+
+        // Convert UTF-16 character offset to UTF-8 byte offset within the line
+        let mut utf16_pos = 0;
+        let mut utf8_pos = 0;
+
+        for c in line_text.chars() {
+            if utf16_pos >= position.character {
+                break;
+            }
+            utf16_pos += u32::try_from(c.len_utf16()).unwrap_or(0);
+            utf8_pos += u32::try_from(c.len_utf8()).unwrap_or(0);
+        }
+
+        Some(line_start_utf8 + utf8_pos)
+    }
+
+    #[allow(dead_code)]
+    #[must_use]
+    pub fn position(&self, offset: u32) -> Position {
+        let line = match self.line_starts.binary_search(&offset) {
+            Ok(line) => line,
+            Err(line) => line - 1,
+        };
+
+        let line_start = self.line_starts[line];
+        let character = offset - line_start;
+
+        Position::new(u32::try_from(line).unwrap_or(0), character)
+    }
+}
diff --git a/crates/djls-workspace/src/document/line_index.rs b/crates/djls-workspace/src/document/line_index.rs
deleted file mode 100644
index cbbf645..0000000
--- a/crates/djls-workspace/src/document/line_index.rs
+++ /dev/null
@@ -1,90 +0,0 @@
-use tower_lsp_server::lsp_types::Position;
-
-#[derive(Clone, Debug)]
-pub struct LineIndex {
-    pub line_starts: Vec<u32>,
-    pub line_starts_utf16: Vec<u32>,
-    pub length: u32,
-    pub length_utf16: u32,
-}
-
-impl LineIndex {
-    #[must_use]
-    pub fn new(text: &str) -> Self {
-        let mut line_starts = vec![0];
-        let mut line_starts_utf16 = vec![0];
-        let mut pos_utf8 = 0;
-        let mut pos_utf16 = 0;
-
-        for c in text.chars() {
-            pos_utf8 += u32::try_from(c.len_utf8()).unwrap_or(0);
-            pos_utf16 += u32::try_from(c.len_utf16()).unwrap_or(0);
-            if c == '\n' {
-                line_starts.push(pos_utf8);
-                line_starts_utf16.push(pos_utf16);
-            }
-        }
-
-        Self {
-            line_starts,
-            line_starts_utf16,
-            length: pos_utf8,
-            length_utf16: pos_utf16,
-        }
-    }
-
-    #[must_use]
-    pub fn offset(&self, position: Position) -> Option<u32> {
-        let line_start = self.line_starts.get(position.line as usize)?;
-
-        Some(line_start + position.character)
-    }
-
-    /// Convert UTF-16 LSP position to UTF-8 byte offset
-    pub fn offset_utf16(&self, position: Position, text: &str) -> Option<u32> {
-        let line_start_utf8 = self.line_starts.get(position.line as usize)?;
-        let _line_start_utf16 = self.line_starts_utf16.get(position.line as usize)?;
-
-        // If position is at start of line, return UTF-8 line start
-        if position.character == 0 {
-            return Some(*line_start_utf8);
-        }
-
-        // Find the line text
-        let next_line_start = self
-            .line_starts
-            .get(position.line as usize + 1)
-            .copied()
-            .unwrap_or(self.length);
-
-        let line_text = text.get(*line_start_utf8 as usize..next_line_start as usize)?;
-
-        // Convert UTF-16 character offset to UTF-8 byte offset within the line
-        let mut utf16_pos = 0;
-        let mut utf8_pos = 0;
-
-        for c in line_text.chars() {
-            if utf16_pos >= position.character {
-                break;
-            }
-            utf16_pos += u32::try_from(c.len_utf16()).unwrap_or(0);
-            utf8_pos += u32::try_from(c.len_utf8()).unwrap_or(0);
-        }
-
-        Some(line_start_utf8 + utf8_pos)
-    }
-
-    #[allow(dead_code)]
-    #[must_use]
-    pub fn position(&self, offset: u32) -> Position {
-        let line = match self.line_starts.binary_search(&offset) {
-            Ok(line) => line,
-            Err(line) => line - 1,
-        };
-
-        let line_start = self.line_starts[line];
-        let character = offset - line_start;
-
-        Position::new(u32::try_from(line).unwrap_or(0), character)
-    }
-}
diff --git a/crates/djls-workspace/src/document/mod.rs b/crates/djls-workspace/src/document/mod.rs
deleted file mode 100644
index 93d443f..0000000
--- a/crates/djls-workspace/src/document/mod.rs
+++ /dev/null
@@ -1,130 +0,0 @@
-mod language;
-mod line_index;
-mod template;
-
-pub use language::LanguageId;
-pub use line_index::LineIndex;
-pub use template::ClosingBrace;
-pub use template::TemplateTagContext;
-use tower_lsp_server::lsp_types::Position;
-use tower_lsp_server::lsp_types::Range;
-
-use crate::FileId;
-
-#[derive(Clone, Debug)]
-pub struct TextDocument {
-    pub uri: String,
-    pub version: i32,
-    pub language_id: LanguageId,
-    pub(crate) file_id: FileId,
-    line_index: LineIndex,
-}
-
-impl TextDocument {
-    pub(crate) fn new(
-        uri: String,
-        version: i32,
-        language_id: LanguageId,
-        file_id: FileId,
-        content: &str,
-    ) -> Self {
-        let line_index = LineIndex::new(content);
-        Self {
-            uri,
-            version,
-            language_id,
-            file_id,
-            line_index,
-        }
-    }
-
-    pub(crate) fn file_id(&self) -> FileId {
-        self.file_id
-    }
-
-    pub fn line_index(&self) -> &LineIndex {
-        &self.line_index
-    }
-
-    pub fn get_content<'a>(&self, content: &'a str) -> &'a str {
-        content
-    }
-
-    pub fn get_line(&self, content: &str, line: u32) -> Option<String> {
-        let line_start = *self.line_index.line_starts.get(line as usize)?;
-        let line_end = self
-            .line_index
-            .line_starts
-            .get(line as usize + 1)
-            .copied()
-            .unwrap_or(self.line_index.length);
-
-        Some(content[line_start as usize..line_end as usize].to_string())
-    }
-
-    pub fn get_text_range(&self, content: &str, range: Range) -> Option<String> {
-        let start_offset = self.line_index.offset(range.start)? as usize;
-        let end_offset = self.line_index.offset(range.end)? as usize;
-
-        Some(content[start_offset..end_offset].to_string())
-    }
-
-    pub fn get_template_tag_context(
-        &self,
-        content: &str,
-        position: Position,
-    ) -> Option<TemplateTagContext> {
-        let start = self.line_index.line_starts.get(position.line as usize)?;
-        let end = self
-            .line_index
-            .line_starts
-            .get(position.line as usize + 1)
-            .copied()
-            .unwrap_or(self.line_index.length);
-
-        let line = &content[*start as usize..end as usize];
-        let char_pos: usize = position.character.try_into().ok()?;
-        let prefix = &line[..char_pos];
-        let rest_of_line = &line[char_pos..];
-        let rest_trimmed = rest_of_line.trim_start();
-
-        prefix.rfind("{%").map(|tag_start| {
-            // Check if we're immediately after {% with no space
-            let needs_leading_space = prefix.ends_with("{%");
-
-            let closing_brace = if rest_trimmed.starts_with("%}") {
-                ClosingBrace::FullClose
-            } else if rest_trimmed.starts_with('}') {
-                ClosingBrace::PartialClose
-            } else {
-                ClosingBrace::None
-            };
-
-            TemplateTagContext {
-                partial_tag: prefix[tag_start + 2..].trim().to_string(),
-                closing_brace,
-                needs_leading_space,
-            }
-        })
-    }
-
-    pub fn position_to_offset(&self, position: Position) -> Option<u32> {
-        self.line_index.offset(position)
-    }
-
-    pub fn offset_to_position(&self, offset: u32) -> Position {
-        self.line_index.position(offset)
-    }
-
-    pub fn update_content(&mut self, content: &str) {
-        self.line_index = LineIndex::new(content);
-    }
-
-    pub fn version(&self) -> i32 {
-        self.version
-    }
-
-    pub fn language_id(&self) -> LanguageId {
-        self.language_id.clone()
-    }
-}
diff --git a/crates/djls-workspace/src/fs.rs b/crates/djls-workspace/src/fs.rs
new file mode 100644
index 0000000..26603d4
--- /dev/null
+++ b/crates/djls-workspace/src/fs.rs
@@ -0,0 +1,269 @@
+//! File system abstraction following Ruff's pattern
+//!
+//! This module provides the `FileSystem` trait that abstracts file I/O operations.
+//! This allows the LSP to work with both real files and in-memory overlays.
+
+use dashmap::DashMap;
+use std::io;
+use std::path::Path;
+use std::sync::Arc;
+use url::Url;
+
+use crate::document::TextDocument;
+
+/// Trait for file system operations
+///
+/// This follows Ruff's pattern of abstracting file system operations behind a trait,
+/// allowing different implementations for testing, in-memory operation, and real file access.
+pub trait FileSystem: Send + Sync {
+    /// Read the entire contents of a file
+    fn read_to_string(&self, path: &Path) -> io::Result<String>;
+
+    /// Check if a path exists
+    fn exists(&self, path: &Path) -> bool;
+
+    /// Check if a path is a file
+    fn is_file(&self, path: &Path) -> bool;
+
+    /// Check if a path is a directory
+    fn is_directory(&self, path: &Path) -> bool;
+
+    /// List directory contents
+    fn read_directory(&self, path: &Path) -> io::Result<Vec<std::path::PathBuf>>;
+
+    /// Get file metadata (size, modified time, etc.)
+    fn metadata(&self, path: &Path) -> io::Result<std::fs::Metadata>;
+}
+
+/// Standard file system implementation that uses `std::fs`
+pub struct OsFileSystem;
+
+impl FileSystem for OsFileSystem {
+    fn read_to_string(&self, path: &Path) -> io::Result<String> {
+        std::fs::read_to_string(path)
+    }
+
+    fn exists(&self, path: &Path) -> bool {
+        path.exists()
+    }
+
+    fn is_file(&self, path: &Path) -> bool {
+        path.is_file()
+    }
+
+    fn is_directory(&self, path: &Path) -> bool {
+        path.is_dir()
+    }
+
+    fn read_directory(&self, path: &Path) -> io::Result<Vec<std::path::PathBuf>> {
+        std::fs::read_dir(path)?
+            .map(|entry| entry.map(|e| e.path()))
+            .collect()
+    }
+
+    fn metadata(&self, path: &Path) -> io::Result<std::fs::Metadata> {
+        std::fs::metadata(path)
+    }
+}
+
+/// LSP file system that intercepts reads for overlay files
+///
+/// This implements Ruff's two-layer architecture where Layer 1 (LSP overlays)
+/// takes precedence over Layer 2 (Salsa database). When a file is read,
+/// this system first checks for an overlay (in-memory changes) and returns
+/// that content. If no overlay exists, it falls back to reading from disk.
+pub struct WorkspaceFileSystem {
+    /// In-memory overlays that take precedence over disk files
+    /// Maps URL to `TextDocument` containing current content
+    buffers: Arc<DashMap<Url, TextDocument>>,
+    /// Fallback file system for disk operations
+    disk: Arc<dyn FileSystem>,
+}
+
+impl WorkspaceFileSystem {
+    /// Create a new [`WorkspaceFileSystem`] with the given overlay storage and fallback
+    pub fn new(buffers: Arc<DashMap<Url, TextDocument>>, disk: Arc<dyn FileSystem>) -> Self {
+        Self { buffers, disk }
+    }
+}
+
+impl FileSystem for WorkspaceFileSystem {
+    fn read_to_string(&self, path: &Path) -> io::Result<String> {
+        if let Some(document) = path_to_url(path).and_then(|url| self.buffers.get(&url)) {
+            Ok(document.content().to_string())
+        } else {
+            self.disk.read_to_string(path)
+        }
+    }
+
+    fn exists(&self, path: &Path) -> bool {
+        path_to_url(path).is_some_and(|url| self.buffers.contains_key(&url))
+            || self.disk.exists(path)
+    }
+
+    fn is_file(&self, path: &Path) -> bool {
+        path_to_url(path).is_some_and(|url| self.buffers.contains_key(&url))
+            || self.disk.is_file(path)
+    }
+
+    fn is_directory(&self, path: &Path) -> bool {
+        // Overlays are never directories, so just delegate
+        self.disk.is_directory(path)
+    }
+
+    fn read_directory(&self, path: &Path) -> io::Result<Vec<std::path::PathBuf>> {
+        // Overlays are never directories, so just delegate
+        self.disk.read_directory(path)
+    }
+
+    fn metadata(&self, path: &Path) -> io::Result<std::fs::Metadata> {
+        // For overlays, we could synthesize metadata, but for simplicity,
+        // fall back to disk. This might need refinement for edge cases.
+        self.disk.metadata(path)
+    }
+}
+
+/// Convert a file path to URL for overlay lookup
+///
+/// This is a simplified conversion - in a full implementation,
+/// you might want more robust path-to-URL conversion
+fn path_to_url(path: &Path) -> Option<Url> {
+    if let Ok(absolute_path) = std::fs::canonicalize(path) {
+        return Url::from_file_path(absolute_path).ok();
+    }
+
+    // For test scenarios where the file doesn't exist on disk,
+    // try to create a URL from the path directly if it's absolute
+    if path.is_absolute() {
+        return Url::from_file_path(path).ok();
+    }
+
+    None
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use crate::document::TextDocument;
+    use crate::language::LanguageId;
+
+    /// In-memory file system for testing
+    pub struct InMemoryFileSystem {
+        files: std::collections::HashMap<std::path::PathBuf, String>,
+    }
+
+    impl InMemoryFileSystem {
+        pub fn new() -> Self {
+            Self {
+                files: std::collections::HashMap::new(),
+            }
+        }
+
+        pub fn add_file(&mut self, path: std::path::PathBuf, content: String) {
+            self.files.insert(path, content);
+        }
+    }
+
+    impl FileSystem for InMemoryFileSystem {
+        fn read_to_string(&self, path: &Path) -> io::Result<String> {
+            self.files
+                .get(path)
+                .cloned()
+                .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "File not found"))
+        }
+
+        fn exists(&self, path: &Path) -> bool {
+            self.files.contains_key(path)
+        }
+
+        fn is_file(&self, path: &Path) -> bool {
+            self.files.contains_key(path)
+        }
+
+        fn is_directory(&self, _path: &Path) -> bool {
+            // Simplified for testing - no directories in memory filesystem
+            false
+        }
+
+        fn read_directory(&self, _path: &Path) -> io::Result<Vec<std::path::PathBuf>> {
+            // Simplified for testing
+            Ok(Vec::new())
+        }
+
+        fn metadata(&self, _path: &Path) -> io::Result<std::fs::Metadata> {
+            Err(io::Error::new(
+                io::ErrorKind::Unsupported,
+                "Metadata not supported in memory filesystem",
+            ))
+        }
+    }
+
+    #[test]
+    fn test_lsp_filesystem_overlay_precedence() {
+        // Create a memory filesystem with some content
+        let mut memory_fs = InMemoryFileSystem::new();
+        memory_fs.add_file(
+            std::path::PathBuf::from("/test/file.py"),
+            "original content".to_string(),
+        );
+
+        // Create overlay storage
+        let overlays = Arc::new(DashMap::new());
+
+        // Create WorkspaceFileSystem with memory fallback
+        let lsp_fs = WorkspaceFileSystem::new(overlays.clone(), Arc::new(memory_fs));
+
+        // Before adding overlay, should read from fallback
+        let path = std::path::Path::new("/test/file.py");
+        assert_eq!(lsp_fs.read_to_string(path).unwrap(), "original content");
+
+        // Add overlay - this simulates having an open document with changes
+        let url = Url::from_file_path("/test/file.py").unwrap();
+        let document = TextDocument::new("overlay content".to_string(), 1, LanguageId::Python);
+        overlays.insert(url, document);
+
+        // Now should read from overlay
+        assert_eq!(lsp_fs.read_to_string(path).unwrap(), "overlay content");
+    }
+
+    #[test]
+    fn test_lsp_filesystem_fallback_when_no_overlay() {
+        // Create memory filesystem
+        let mut memory_fs = InMemoryFileSystem::new();
+        memory_fs.add_file(
+            std::path::PathBuf::from("/test/file.py"),
+            "disk content".to_string(),
+        );
+
+        // Create empty overlay storage
+        let overlays = Arc::new(DashMap::new());
+
+        // Create WorkspaceFileSystem
+        let lsp_fs = WorkspaceFileSystem::new(overlays, Arc::new(memory_fs));
+
+        // Should fall back to disk when no overlay exists
+        let path = std::path::Path::new("/test/file.py");
+        assert_eq!(lsp_fs.read_to_string(path).unwrap(), "disk content");
+    }
+
+    #[test]
+    fn test_lsp_filesystem_other_operations_delegate() {
+        // Create memory filesystem
+        let mut memory_fs = InMemoryFileSystem::new();
+        memory_fs.add_file(
+            std::path::PathBuf::from("/test/file.py"),
+            "content".to_string(),
+        );
+
+        // Create WorkspaceFileSystem
+        let overlays = Arc::new(DashMap::new());
+        let lsp_fs = WorkspaceFileSystem::new(overlays, Arc::new(memory_fs));
+
+        let path = std::path::Path::new("/test/file.py");
+
+        // These should delegate to the fallback filesystem
+        assert!(lsp_fs.exists(path));
+        assert!(lsp_fs.is_file(path));
+        assert!(!lsp_fs.is_directory(path));
+    }
+}
diff --git a/crates/djls-workspace/src/document/language.rs b/crates/djls-workspace/src/language.rs
similarity index 79%
rename from crates/djls-workspace/src/document/language.rs
rename to crates/djls-workspace/src/language.rs
index 65c322a..8db778f 100644
--- a/crates/djls-workspace/src/document/language.rs
+++ b/crates/djls-workspace/src/language.rs
@@ -2,8 +2,10 @@ use crate::FileKind;
 
 #[derive(Clone, Debug, PartialEq)]
 pub enum LanguageId {
+    Html,
     HtmlDjango,
     Other,
+    PlainText,
     Python,
 }
 
@@ -11,6 +13,8 @@ impl From<&str> for LanguageId {
     fn from(language_id: &str) -> Self {
         match language_id {
             "django-html" | "htmldjango" => Self::HtmlDjango,
+            "html" => Self::Html,
+            "plaintext" => Self::PlainText,
             "python" => Self::Python,
             _ => Self::Other,
         }
@@ -28,7 +32,7 @@ impl From<LanguageId> for FileKind {
         match language_id {
             LanguageId::Python => Self::Python,
             LanguageId::HtmlDjango => Self::Template,
-            LanguageId::Other => Self::Other,
+            LanguageId::Html | LanguageId::PlainText | LanguageId::Other => Self::Other,
         }
     }
 }
diff --git a/crates/djls-workspace/src/lib.rs b/crates/djls-workspace/src/lib.rs
index 9fbb34f..22e0faf 100644
--- a/crates/djls-workspace/src/lib.rs
+++ b/crates/djls-workspace/src/lib.rs
@@ -1,25 +1,13 @@
-mod bridge;
 pub mod db;
 mod document;
-mod lsp_system;
-mod system;
+mod fs;
+mod language;
+mod template;
 
 pub use db::Database;
-pub use document::{TextDocument, LanguageId};
-pub use system::{FileSystem, StdFileSystem};
-
-/// File classification for routing to analyzers.
-///
-/// [`FileKind`] determines how a file should be processed by downstream analyzers.
-#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug)]
-pub enum FileKind {
-    /// Python source file
-    Python,
-    /// Django template file
-    Template,
-    /// Other file type
-    Other,
-}
+pub use document::TextDocument;
+pub use fs::{FileSystem, OsFileSystem, WorkspaceFileSystem};
+pub use language::LanguageId;
 
 /// Stable, compact identifier for files across the subsystem.
 ///
@@ -43,3 +31,28 @@ impl FileId {
         self.0
     }
 }
+
+/// File classification for routing to analyzers.
+///
+/// [`FileKind`] determines how a file should be processed by downstream analyzers.
+#[derive(Copy, Clone, Eq, PartialEq, Hash, Debug)]
+pub enum FileKind {
+    /// Python source file
+    Python,
+    /// Django template file
+    Template,
+    /// Other file type
+    Other,
+}
+
+impl FileKind {
+    /// Determine `FileKind` from a file path extension.
+    #[must_use]
+    pub fn from_path(path: &std::path::Path) -> Self {
+        match path.extension().and_then(|s| s.to_str()) {
+            Some("py") => FileKind::Python,
+            Some("html" | "htm") => FileKind::Template,
+            _ => FileKind::Other,
+        }
+    }
+}
diff --git a/crates/djls-workspace/src/lsp_system.rs b/crates/djls-workspace/src/lsp_system.rs
deleted file mode 100644
index b03c8e8..0000000
--- a/crates/djls-workspace/src/lsp_system.rs
+++ /dev/null
@@ -1,154 +0,0 @@
-//! LSP-aware file system wrapper that handles overlays
-//!
-//! This is the KEY pattern from Ruff - the LspSystem wraps a FileSystem
-//! and intercepts reads to check for overlays first. This allows unsaved
-//! changes to be used without going through Salsa.
-
-use std::collections::HashMap;
-use std::io;
-use std::path::{Path, PathBuf};
-use std::sync::Arc;
-
-use url::Url;
-
-use crate::system::FileSystem;
-
-/// LSP-aware file system that checks overlays before disk
-///
-/// This is the critical piece that makes overlays work efficiently in Ruff's
-/// architecture. Instead of updating Salsa for every keystroke, we intercept
-/// file reads here and return overlay content when available.
-pub struct LspSystem {
-    /// The underlying file system (usually StdFileSystem)
-    inner: Arc<dyn FileSystem>,
-
-    /// Map of open document URLs to their overlay content
-    overlays: HashMap<Url, String>,
-}
-
-impl LspSystem {
-    /// Create a new LspSystem wrapping the given file system
-    pub fn new(file_system: Arc<dyn FileSystem>) -> Self {
-        Self {
-            inner: file_system,
-            overlays: HashMap::new(),
-        }
-    }
-
-    /// Set overlay content for a document
-    pub fn set_overlay(&mut self, url: Url, content: String) {
-        self.overlays.insert(url, content);
-    }
-
-    /// Remove overlay content for a document
-    pub fn remove_overlay(&mut self, url: &Url) {
-        self.overlays.remove(url);
-    }
-
-    /// Check if a document has an overlay
-    pub fn has_overlay(&self, url: &Url) -> bool {
-        self.overlays.contains_key(url)
-    }
-
-    /// Get overlay content if it exists
-    pub fn get_overlay(&self, url: &Url) -> Option<&String> {
-        self.overlays.get(url)
-    }
-
-    /// Convert a URL to a file path
-    fn url_to_path(url: &Url) -> Option<PathBuf> {
-        if url.scheme() == "file" {
-            url.to_file_path().ok().or_else(|| {
-                // Fallback for simple conversion
-                Some(PathBuf::from(url.path()))
-            })
-        } else {
-            None
-        }
-    }
-}
-
-impl FileSystem for LspSystem {
-    fn read_to_string(&self, path: &Path) -> io::Result<String> {
-        // First check if we have an overlay for this path
-        // Convert path to URL for lookup
-        let url = Url::from_file_path(path)
-            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "Invalid path"))?;
-
-        if let Some(content) = self.overlays.get(&url) {
-            // Return overlay content instead of reading from disk
-            return Ok(content.clone());
-        }
-
-        // No overlay, read from underlying file system
-        self.inner.read_to_string(path)
-    }
-
-    fn exists(&self, path: &Path) -> bool {
-        // Check overlays first
-        if let Ok(url) = Url::from_file_path(path) {
-            if self.overlays.contains_key(&url) {
-                return true;
-            }
-        }
-
-        self.inner.exists(path)
-    }
-
-    fn is_file(&self, path: &Path) -> bool {
-        // Overlays are always files
-        if let Ok(url) = Url::from_file_path(path) {
-            if self.overlays.contains_key(&url) {
-                return true;
-            }
-        }
-
-        self.inner.is_file(path)
-    }
-
-    fn is_directory(&self, path: &Path) -> bool {
-        // Overlays are never directories
-        if let Ok(url) = Url::from_file_path(path) {
-            if self.overlays.contains_key(&url) {
-                return false;
-            }
-        }
-
-        self.inner.is_directory(path)
-    }
-
-    fn read_directory(&self, path: &Path) -> io::Result<Vec<PathBuf>> {
-        // Overlays don't affect directory listings
-        self.inner.read_directory(path)
-    }
-
-    fn metadata(&self, path: &Path) -> io::Result<std::fs::Metadata> {
-        // Can't provide metadata for overlays
-        self.inner.metadata(path)
-    }
-}
-
-/// Extension trait for working with URL-based overlays
-pub trait LspSystemExt {
-    /// Read file content by URL, checking overlays first
-    fn read_url(&self, url: &Url) -> io::Result<String>;
-}
-
-impl LspSystemExt for LspSystem {
-    fn read_url(&self, url: &Url) -> io::Result<String> {
-        // Check overlays first
-        if let Some(content) = self.overlays.get(url) {
-            return Ok(content.clone());
-        }
-
-        // Convert URL to path and read from file system
-        if let Some(path_buf) = Self::url_to_path(url) {
-            self.inner.read_to_string(&path_buf)
-        } else {
-            Err(io::Error::new(
-                io::ErrorKind::InvalidInput,
-                format!("Cannot convert URL to path: {}", url),
-            ))
-        }
-    }
-}
\ No newline at end of file
diff --git a/crates/djls-workspace/src/system.rs b/crates/djls-workspace/src/system.rs
deleted file mode 100644
index 04a1b8a..0000000
--- a/crates/djls-workspace/src/system.rs
+++ /dev/null
@@ -1,118 +0,0 @@
-//! File system abstraction following Ruff's pattern
-//!
-//! This module provides the FileSystem trait that abstracts file I/O operations.
-//! This allows the LSP to work with both real files and in-memory overlays.
-
-use std::io;
-use std::path::Path;
-
-/// Trait for file system operations
-///
-/// This follows Ruff's pattern of abstracting file system operations behind a trait,
-/// allowing different implementations for testing, in-memory operation, and real file access.
-pub trait FileSystem: Send + Sync {
-    /// Read the entire contents of a file
-    fn read_to_string(&self, path: &Path) -> io::Result<String>;
-
-    /// Check if a path exists
-    fn exists(&self, path: &Path) -> bool;
-
-    /// Check if a path is a file
-    fn is_file(&self, path: &Path) -> bool;
-
-    /// Check if a path is a directory
-    fn is_directory(&self, path: &Path) -> bool;
-
-    /// List directory contents
-    fn read_directory(&self, path: &Path) -> io::Result<Vec<PathBuf>>;
-
-    /// Get file metadata (size, modified time, etc.)
-    fn metadata(&self, path: &Path) -> io::Result<std::fs::Metadata>;
-}
-
-/// Standard file system implementation that uses std::fs
-pub struct StdFileSystem;
-
-impl FileSystem for StdFileSystem {
-    fn read_to_string(&self, path: &Path) -> io::Result<String> {
-        std::fs::read_to_string(path)
-    }
-
-    fn exists(&self, path: &Path) -> bool {
-        path.exists()
-    }
-
-    fn is_file(&self, path: &Path) -> bool {
-        path.is_file()
-    }
-
-    fn is_directory(&self, path: &Path) -> bool {
-        path.is_dir()
-    }
-
-    fn read_directory(&self, path: &Path) -> io::Result<Vec<PathBuf>> {
-        let mut entries = Vec::new();
-        for entry in std::fs::read_dir(path)? {
-            entries.push(entry?.path());
-        }
-        Ok(entries)
-    }
-
-    fn metadata(&self, path: &Path) -> io::Result<std::fs::Metadata> {
-        std::fs::metadata(path)
-    }
-}
-
-/// In-memory file system for testing
-#[cfg(test)]
-pub struct MemoryFileSystem {
-    files: std::collections::HashMap<std::path::PathBuf, String>,
-}
-
-#[cfg(test)]
-impl MemoryFileSystem {
-    pub fn new() -> Self {
-        Self {
-            files: std::collections::HashMap::new(),
-        }
-    }
-
-    pub fn add_file(&mut self, path: std::path::PathBuf, content: String) {
-        self.files.insert(path, content);
-    }
-}
-
-#[cfg(test)]
-impl FileSystem for MemoryFileSystem {
-    fn read_to_string(&self, path: &Path) -> io::Result<String> {
-        self.files
-            .get(path)
-            .cloned()
-            .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "File not found"))
-    }
-
-    fn exists(&self, path: &Path) -> bool {
-        self.files.contains_key(path)
-    }
-
-    fn is_file(&self, path: &Path) -> bool {
-        self.files.contains_key(path)
-    }
-
-    fn is_directory(&self, _path: &Path) -> bool {
-        // Simplified for testing - no directories in memory filesystem
-        false
-    }
-
-    fn read_directory(&self, _path: &Path) -> io::Result<Vec<PathBuf>> {
-        // Simplified for testing
-        Ok(Vec::new())
-    }
-
-    fn metadata(&self, _path: &Path) -> io::Result<std::fs::Metadata> {
-        Err(io::Error::new(
-            io::ErrorKind::Unsupported,
-            "Metadata not supported in memory filesystem",
-        ))
-    }
-}
\ No newline at end of file
diff --git a/crates/djls-workspace/src/document/template.rs b/crates/djls-workspace/src/template.rs
similarity index 100%
rename from crates/djls-workspace/src/document/template.rs
rename to crates/djls-workspace/src/template.rs
diff --git a/crates/djls-workspace/src/test_db.rs b/crates/djls-workspace/src/test_db.rs
deleted file mode 100644
index 92683d5..0000000
--- a/crates/djls-workspace/src/test_db.rs
+++ /dev/null
@@ -1,25 +0,0 @@
-//! Test module to explore Salsa thread safety
-
-#[cfg(test)]
-mod tests {
-    use crate::db::Database;
-    use std::thread;
-
-    #[test]
-    fn test_database_clone() {
-        let db = Database::new();
-        let _db2 = db.clone();
-        println!("✅ Database can be cloned");
-    }
-
-    #[test]
-    #[ignore] // This will fail
-    fn test_database_send() {
-        let db = Database::new();
-        let db2 = db.clone();
-
-        thread::spawn(move || {
-            let _ = db2;
-        }).join().unwrap();
-    }
-}
diff --git a/task_order.md b/task_order.md
new file mode 100644
index 0000000..137a402
--- /dev/null
+++ b/task_order.md
@@ -0,0 +1,61 @@
+# Revised Task Order for Ruff Pattern Implementation
+
+## The Correct Architecture Understanding
+
+Based on Ruff expert clarification:
+- **SourceFile should NOT store text content** (our current implementation is wrong)
+- **File content is read on-demand** through a `source_text` tracked function
+- **Overlays are never Salsa inputs**, they're read through FileSystem
+- **File revision triggers invalidation**, not content changes
+
+## Implementation Order
+
+### Phase 1: Database Foundation
+1. **task-129** - Complete Database FileSystem integration
+   - Database needs access to LspFileSystem to read files
+   - This enables tracked functions to read through FileSystem
+
+### Phase 2: Salsa Input Restructuring
+2. **task-126** - Bridge Salsa queries to LspFileSystem
+   - Remove `text` field from SourceFile
+   - Add `path` and `revision` fields
+   - Create `source_text` tracked function
+
+### Phase 3: Query Updates
+3. **task-95** - Update template parsing to use source_text query
+   - Update all queries to use `source_text(db, file)`
+   - Remove direct text access from SourceFile
+
+### Phase 4: LSP Integration
+4. **task-112** - Add file revision tracking
+   - Bump file revision when overlays change
+   - This triggers Salsa invalidation
+
+### Phase 5: Testing
+5. **task-127** - Test overlay behavior and Salsa integration
+   - Verify overlays work correctly
+   - Test invalidation behavior
+
+## Key Changes from Current Implementation
+
+Current (WRONG):
+```rust
+#[salsa::input]
+pub struct SourceFile {
+    pub text: Arc<str>,  // ❌ Storing content in Salsa
+}
+```
+
+Target (RIGHT):
+```rust
+#[salsa::input]
+pub struct SourceFile {
+    pub path: PathBuf,
+    pub revision: u32,  // ✅ Only track changes
+}
+
+#[salsa::tracked]
+pub fn source_text(db: &dyn Db, file: SourceFile) -> Arc<str> {
+    // Read through FileSystem (checks overlays first)
+}
+```