Commit graph

781 commits

Author SHA1 Message Date
pedrocarlo
da33f421b3 add readonly checks to ensure we do not change the header 2025-12-18 10:54:58 -03:00
Jussi Saurio
eb59bc6ae6 Merge 'Enable MVCC with PRAGMA journal_mode' from Pedro Muniz
Closes #3536
# Description
This PR implements **dynamic journal mode switching** via `PRAGMA
journal_mode`, allowing users to switch between WAL and MVCC modes at
runtime.
### Key Changes
**Core Feature: Journal Mode Switching**
- Added new `JournalMode` module (`core/storage/journal_mode.rs`) to
parse and handle journal mode transitions
- Modified `op_journal_mode` to correctly parse journal modes and update
the database header
- Emit checkpoint when setting a new journal mode to ensure data
consistency
- Added MVCC checkpoint support to `Connection::checkpoint`
**Database Initialization Improvements**
- Read DB header on `Database::open` and simplified `init_pager`
- Made `Version` an enum for better comparison semantics
- Automatically convert legacy SQLite databases to WAL mode
- Ensure DB header is flushed to disk when header changes during open
- Clear page cache after header validation
**Bug Fixes**
- Fixed dirty page invalidation in pager when clearing dirty pages in
page cache
- Fixed `is_none_or` check for row version existence in DB file (handles
MvStore initialization and empty database cases)
- Added `btree_resident` field in `RowVersion` to track if
insert/deletion originated from a btree
**Testing**
- Added fuzz tests for `journal_mode` transitions with Database
operations in between
- Added integration tests that switch between the different modes and
check that the header version is correct
- Added some specific regression tests for delete operations lost on
mode switch
- Fixed `index_scan_compound_key_fuzz` to use separate databases for
Turso and SQLite in MVCC mode. Also had to decrease number of rows for
MVCC test, as insert was very slow.
# TODOs
- Remove sync hacks from `op_journal_mode`
- Expand fuzzer with different queries
- Add to Simulator
- Add special handling for read-only databases and disallow any header
changes
# Motivation and context
Make it easier for our users to test MVCC and to switch into and out of it.
# AI Disclosure
Used AI to catch and fix bugs in MVCC, further my understanding of
MVCC, write tests in the `tests` folder, write most of the PR summary,
and write the docs in the `docs/manual.md` file

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4074
2025-12-17 21:03:56 +02:00
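The mode switch described in the PR above (parse the requested mode, checkpoint, then update the header) can be sketched roughly as follows. This is a minimal illustration, not the code in `core/storage/journal_mode.rs`: the `JournalMode` variants, the accepted strings `"wal"` and `"mvcc"`, the `UnsupportedJournalMode` error, `SwitchStep`, and `plan_switch` are all hypothetical names chosen for the example.

```rust
use std::str::FromStr;

/// Hypothetical journal mode enum (the real type may have more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JournalMode {
    Wal,
    Mvcc,
}

/// Hypothetical error for a mode string this sketch does not recognize.
#[derive(Debug, PartialEq, Eq)]
pub struct UnsupportedJournalMode(pub String);

impl FromStr for JournalMode {
    type Err = UnsupportedJournalMode;

    /// Parses case-insensitively; the accepted spellings are assumptions.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "wal" => Ok(JournalMode::Wal),
            "mvcc" => Ok(JournalMode::Mvcc),
            other => Err(UnsupportedJournalMode(other.to_string())),
        }
    }
}

/// The ordered steps a mode switch performs in this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum SwitchStep {
    /// Flush pending changes to the database file first.
    Checkpoint,
    /// Then record the new mode in the database header.
    WriteHeader(JournalMode),
}

/// Returns no steps when the mode is unchanged; otherwise checkpoints
/// before touching the header, mirroring the order the PR describes.
pub fn plan_switch(current: JournalMode, requested: JournalMode) -> Vec<SwitchStep> {
    if current == requested {
        return Vec::new();
    }
    vec![SwitchStep::Checkpoint, SwitchStep::WriteHeader(requested)]
}
```

Checkpointing before the header write means that data written under the old mode has reached the database file by the time the new mode takes effect.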
PThorpe92
db60db7d9b impl cache spilling in pager 2025-12-17 11:57:34 -05:00
pedrocarlo
5242c4017c print warning message correctly 2025-12-17 10:59:04 -03:00
pedrocarlo
18b810bb04 adjust cli to Print warnings + print a warning about not converting to MVCC mode with WAL file 2025-12-17 10:55:25 -03:00
pedrocarlo
257dc5ad09 do not initiate a write transaction for journal mode + checkpoint before changing mode 2025-12-17 10:55:24 -03:00
pedrocarlo
cb2a983a7d add mvcc checkpoint to Connection::checkpoint 2025-12-17 10:55:24 -03:00
pedrocarlo
323f1152d8 emit Checkpoint when setting new journal mode + adjust init code to correctly open the mv store 2025-12-17 10:55:24 -03:00
pedrocarlo
0cbe904cef ensure DB header is flushed to DB file if header changes during DB open 2025-12-17 10:55:24 -03:00
pedrocarlo
d353bcf844 clippy 2025-12-17 10:55:24 -03:00
pedrocarlo
c907a6d57f automatically convert Legacy sqlite db to WAL mode 2025-12-17 10:55:24 -03:00
pedrocarlo
3e71bc9b10 clear page cache after header validation 2025-12-17 10:55:24 -03:00
pedrocarlo
9c0f40a28a read DB header on Database open + simplify init_pager 2025-12-17 10:55:24 -03:00
pedrocarlo
00cbae8412 modify op_journal_mode to parse the correct journal mode and modify the header 2025-12-17 10:49:25 -03:00
Jussi Saurio
f613b24ebe refactor(pager): remove two-phase checkpoint API, add blocking_checkpoint
- Remove wal_checkpoint_start() function
- Remove wal_checkpoint_finish() function
- Remove wal_checkpoint() function
- Add blocking_checkpoint() as a convenience wrapper around checkpoint()
- Update checkpoint_shutdown() to take sync_mode parameter and use blocking_checkpoint()

The checkpoint() function now handles all the checkpoint state machine logic
internally, so the two-phase API is no longer needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 12:30:40 +02:00
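The idea behind the refactor above, a single `checkpoint()` that owns its state machine plus a blocking convenience wrapper, can be sketched like this. `ToyPager`, `Step`, and the frame counters are hypothetical stand-ins for illustration; the real pager's types and signatures are not shown in this log.

```rust
/// Result of advancing a step-wise operation (a hypothetical stand-in
/// for the pager's own I/O result type).
#[derive(Debug, PartialEq, Eq)]
pub enum Step<T> {
    /// More work remains; call again.
    Pending,
    /// Finished, with the final result.
    Done(T),
}

/// Toy pager whose checkpoint copies one frame per call.
pub struct ToyPager {
    frames_left: u32,
    frames_moved: u32,
}

impl ToyPager {
    pub fn new(frames: u32) -> Self {
        ToyPager { frames_left: frames, frames_moved: 0 }
    }

    /// Advances the checkpoint by one frame. All progress is tracked
    /// inside the pager, so callers need no start/finish pair.
    pub fn checkpoint(&mut self) -> Step<u32> {
        if self.frames_left == 0 {
            let moved = self.frames_moved;
            self.frames_moved = 0; // reset for the next checkpoint
            Step::Done(moved)
        } else {
            self.frames_left -= 1;
            self.frames_moved += 1;
            Step::Pending
        }
    }

    /// Convenience wrapper: drives `checkpoint()` until it completes.
    /// A real implementation would also run pending I/O between steps.
    pub fn blocking_checkpoint(&mut self) -> u32 {
        loop {
            if let Step::Done(moved) = self.checkpoint() {
                return moved;
            }
        }
    }
}
```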
Pekka Enberg
4bad0f8d59 core: Make Pager thread-safe 2025-12-16 19:50:03 +02:00
Jussi Saurio
4a57710640 fix/core: decouple autocheckpoint result from transaction durability
After flushing the WAL, the db may attempt an automatic checkpoint. Previously,
a checkpoint failure for any other reason than `Busy` rolled back the transaction;
this is incorrect because the WAL has already been written and synced, so the transaction
is actually committed and durable.

For this reason, do the following decoupling:

- Mark the transaction as committed after `SyncWal`
- Any errors beyond that point are wrapped as `LimboError::CheckpointFailed` which is
  handled specially:
    * Tx rollback is not attempted in `abort()` - only the WAL locks are cleaned up
      and checkpoint state machines are reset
    * In the simulator, the results of the query are shadowed into the sim environment
      so that the sim correctly assumes that the transaction's effects were in fact
      committed.
2025-12-16 10:42:48 -05:00
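A rough sketch of the decoupling described above: once the WAL sync succeeds the transaction counts as committed, and a later checkpoint error must not roll it back. The names `CommitError`, `TxState`, `Tx`, `commit`, and `abort` are hypothetical simplifications; only the `CheckpointFailed` idea comes from the commit message.

```rust
/// Hypothetical error type; `CheckpointFailed` mirrors the wrapped error
/// named in the commit message.
#[derive(Debug, PartialEq, Eq)]
pub enum CommitError {
    WalWriteFailed,
    CheckpointFailed(String),
}

#[derive(Debug, PartialEq, Eq)]
pub enum TxState {
    Active,
    Committed,
    RolledBack,
}

pub struct Tx {
    pub state: TxState,
}

/// Commits a transaction. The two `Result` arguments stand in for the
/// outcomes of the WAL sync and of the automatic checkpoint.
pub fn commit(
    tx: &mut Tx,
    wal_sync: Result<(), ()>,
    checkpoint: Result<(), String>,
) -> Result<(), CommitError> {
    wal_sync.map_err(|_| CommitError::WalWriteFailed)?;
    // The WAL is written and synced: the transaction is durable from here on.
    tx.state = TxState::Committed;
    // Any later failure is reported, but wrapped so callers can tell it apart.
    checkpoint.map_err(CommitError::CheckpointFailed)
}

/// Error handling after a failed commit.
pub fn abort(tx: &mut Tx, err: &CommitError) {
    match err {
        // Already durable: only locks and checkpoint state would be
        // cleaned up here, so the transaction state is left untouched.
        CommitError::CheckpointFailed(_) => {}
        // The WAL never became durable, so rolling back is correct.
        CommitError::WalWriteFailed => tx.state = TxState::RolledBack,
    }
}
```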
Pekka Enberg
98308415b4 core: Don't rollback transaction when schema updated
When the SchemaUpdated error occurs during statement execution, don't
roll back the transaction, but instead re-prepare the statement.

Spotted by Whopper.
2025-12-15 13:49:21 +02:00
Preston Thorpe
53b9a12fd4 Merge 'Remove unused parameter in limbo_exec_rows and add ergonomic ExecRows trait for testing' from Pedro Muniz
`ExecRows` trait should allow us to do something like this when testing:
```rust
let rows: Vec<(String,)> = conn.exec_rows("SELECT val FROM t ORDER BY val");
```
Which just makes your life easier overall so we don't have to constantly
repeat the `.step` loop

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4164
2025-12-10 14:37:36 -05:00
pedrocarlo
ffbbd4c270 add exec rows trait for more ergonomic testing in core_tester 2025-12-10 15:21:03 -03:00
PThorpe92
298bb3caf7 Update stats refresh in core/lib 2025-12-09 14:46:12 -05:00
PThorpe92
b002350a8e work on finishing ANALYZE impl 2025-12-09 14:46:11 -05:00
Jussi Saurio
826ca4d44d chore: remove experimental_indexes feature flags 2025-12-08 13:00:37 +02:00
Jussi Saurio
fc5366ab1d Merge 'do not propagate the MvStore to opcodes' from Pedro Muniz
In my adventure of doing #3536, I had a bug where an opcode was using a
stale MvStore reference when transitioning from MvStore to Wal mode. I
did not want to special case just 1 opcode to ignore the `mv_store`
argument (in my particular case it was `Insn::Checkpoint`), so I just
edited it and forced the opcodes to get `mv_store` from the
`program.connection.mv_store()` which is always up to date.
But, if we don't want to make this change I can always just do the
special case and move on.
**AI Disclosure:**
Asked Claude to do most of refactoring, but all the changes were
manually approved by me.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4126
2025-12-08 06:22:00 +02:00
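The stale-reference problem described above can be illustrated with a small sketch. It uses `RwLock<Option<Arc<_>>>` from the standard library as a stand-in for however the store is actually held (a later commit in this log mentions `ArcSwap`); `MvStore`'s field, `set_mv_store`, and `op_checkpoint` are hypothetical, and only the `mv_store()` accessor name comes from the PR text.

```rust
use std::sync::{Arc, RwLock};

/// Placeholder for the MVCC store; the field is illustrative only.
pub struct MvStore {
    pub generation: u32,
}

/// Toy connection that owns the current (optional) MVCC store.
pub struct Connection {
    mv_store: RwLock<Option<Arc<MvStore>>>,
}

impl Connection {
    pub fn new(store: Option<Arc<MvStore>>) -> Self {
        Connection { mv_store: RwLock::new(store) }
    }

    /// Always returns the store that is current right now.
    pub fn mv_store(&self) -> Option<Arc<MvStore>> {
        self.mv_store.read().unwrap().clone()
    }

    /// Called on a mode switch, e.g. `None` when going back to WAL.
    pub fn set_mv_store(&self, store: Option<Arc<MvStore>>) {
        *self.mv_store.write().unwrap() = store;
    }
}

/// Opcode handler that looks the store up at execution time instead of
/// receiving it as an argument captured earlier.
pub fn op_checkpoint(conn: &Connection) -> &'static str {
    match conn.mv_store() {
        Some(_) => "mvcc checkpoint",
        None => "wal checkpoint",
    }
}
```

A reference captured before the switch still points at the old store, while a lookup made when the opcode runs sees the current one.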
pedrocarlo
008fdab4b8 do not propagate the MvStore to opcodes 2025-12-07 23:02:42 -03:00
pedrocarlo
5df5ba72de remove unnecessary initialized checks 2025-12-05 19:47:05 -03:00
pedrocarlo
f6668ce0e2 do not create a new pager, nor close the sharedwal when setting the page_size 2025-12-05 19:44:28 -03:00
pedrocarlo
e785124e8f remove DBState in favor of using init_page_1 to track initialization 2025-12-05 15:26:44 -03:00
pedrocarlo
4fa305fe82 propagate init_page_1 to the Pager 2025-12-05 15:26:44 -03:00
pedrocarlo
da3da023dd create a default page1 on database init 2025-12-05 15:26:18 -03:00
Jussi Saurio
74296e52bb Merge 'Automatically Propagate Encryption options' from Pedro Muniz
On database open, we store the encryption options and pass them onwards
to the Connection, Pager and Wal. We also gain slightly in ergonomics,
as we don't have to set the pragmas for the `cipher` and `hexkey` on
each new `Connection`.
I needed this logic because I will need to initialize a default header
for empty DBs, and encryption opts not being automatically propagated
was blocking that.
**AI Disclosure**
Claude helped me debug and find issues in my implementation
cc @avinassh

Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #4100
2025-12-05 15:31:17 +02:00
pedrocarlo
ee73bab743 get correct reserved bytes if Cipher is not None 2025-12-05 02:04:06 -03:00
pedrocarlo
a311c966a2 set encryption context for page and wal in init_pager 2025-12-05 02:04:06 -03:00
pedrocarlo
889322f6b5 do not call pragmas related to encryption on connect or open 2025-12-05 02:04:06 -03:00
pedrocarlo
0118a65169 pass encryption opts from the database to the connection on connect 2025-12-05 02:04:06 -03:00
pedrocarlo
85b212056d separate init function for connect 2025-12-05 02:04:06 -03:00
pedrocarlo
1a43de35ce add encryption key and cipher to Database struct 2025-12-05 02:04:06 -03:00
pedrocarlo
faca85de2f pass pager to _connect and share initial conn for bootstrapping mvcc 2025-12-05 02:04:05 -03:00
Nikita Sivukhin
510a61b5eb Merge branch 'main' into sync-sdk-kit 2025-12-03 21:16:15 +04:00
pedrocarlo
e26c663616 do not pass mv store if we are in a bootstrap connection 2025-12-03 10:10:02 -03:00
pedrocarlo
02c2a63d8e use ArcSwap to store MvStore 2025-12-03 10:09:04 -03:00
pedrocarlo
11a40e7e64 do not store MvStore in Statements. Always get them from database 2025-12-03 10:09:04 -03:00
Jussi Saurio
b7d4aa06a5 Merge 'mvcc: implement logical log recovery for indexes + checkpointing of indexes' from Jussi Saurio
## Beef
- Change logical log format to also allow index row records - this is
for simplicity during recovery and may change later.
- Checkpoint indexes and index writes to the DB file
- Fix issues related to deletes - it's possible MV store has no row
versions for a row that exists in the DB file, so we need to add a
tombstone row version in that case, and we must fetch that row's data
from the btree to be able to include the data in the row version
- fix some miscellaneous logic bugs

Closes #4067
2025-12-03 10:08:24 +02:00
Jussi Saurio
265bd74c99 reparse_schema: conditionally read from mv store or not 2025-12-02 11:38:05 +02:00
Nikita Sivukhin
65ec20a562 small renames 2025-12-01 22:53:39 +04:00
Nikita Sivukhin
73a94910d8 Merge branch 'main' into sdk-kit 2025-11-28 02:56:01 +04:00
Nikita Sivukhin
ef3db24a49 rename methods in core a little bit 2025-11-27 14:12:47 +04:00
Jussi Saurio
c433a782b7 mvcc: allow use of indexes
yeah they are broken still, but i don't want to add temporary overrides
2025-11-26 09:05:23 +02:00
Jussi Saurio
610d8cc3ba Merge 'introduce program execution state in order to run stmt to completion in case of finalize or reset' from Nikita Sivukhin
This PR introduces a program execution state so that a statement knows
whether it is in a terminal state (Done, Failed, Interrupted) or not.
The particular problem right now is that statements like `INSERT INTO t
VALUES (1), (2), (3) RETURNING x` execute the inserts one by one and
interleave them with row generation. This means that if the statement's
consumer reads just one row and then finalizes the statement, nothing is
actually committed (because the transaction is aborted).
To quickly mitigate this issue, a program state is introduced that helps
decide what to do in finalize.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4038
2025-11-25 14:47:56 +02:00
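The finalize behaviour described above can be sketched with a toy statement. The terminal states (Done, Failed, Interrupted) come from the PR text; the `Running` variant, `ToyStatement`, and its fields are hypothetical, and the "inserts" are just values moved between two vectors.

```rust
/// Execution state of a program. `Running` is an assumed name for the
/// non-terminal state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProgramState {
    Running,
    Done,
    Failed,
    Interrupted,
}

impl ProgramState {
    pub fn is_terminal(self) -> bool {
        !matches!(self, ProgramState::Running)
    }
}

/// Toy model of `INSERT ... RETURNING`: each step performs one insert
/// and yields the inserted value as a row.
pub struct ToyStatement {
    pending: Vec<i64>,
    inserted: Vec<i64>,
    state: ProgramState,
}

impl ToyStatement {
    pub fn new(values: Vec<i64>) -> Self {
        ToyStatement { pending: values, inserted: Vec::new(), state: ProgramState::Running }
    }

    /// Inserts the next value and returns it, or marks the program Done
    /// when nothing is left.
    pub fn step(&mut self) -> Option<i64> {
        if self.state.is_terminal() {
            return None;
        }
        if self.pending.is_empty() {
            self.state = ProgramState::Done;
            return None;
        }
        let value = self.pending.remove(0);
        self.inserted.push(value);
        Some(value)
    }

    /// If the consumer stopped early, run the remaining steps to
    /// completion so that no inserts are lost, then return what was inserted.
    pub fn finalize(mut self) -> Vec<i64> {
        while !self.state.is_terminal() {
            self.step();
        }
        self.inserted
    }
}
```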
Nikita Sivukhin
e39e60ef18 introduce program execution state in order to run stmt to completion in case of finalize or reset 2025-11-25 11:14:20 +04:00