limbo

mirror of https://github.com/tursodatabase/limbo.git synced 2025-12-23 08:21:09 +00:00

Author	SHA1	Message	Date
Preston Thorpe	2e794934dd	Merge 'Use u64::from instead of .into()' from Elina ## Description In certain edge cases, Into::into type inference fails and causes library to fail to build in downstream crates ## Motivation and context See https://github.com/ranile/turso-compile-fail In this case, patching chrono to work with `ic-cdk` fails compilation of turso. `ic-cdk` adds a dependency on `candid`, which provides some trait implementations that make Into::into type inference break. <details><summary>Error message</summary> ``` error[E0283]: type annotations needed --> /Users/me/.cargo/git/checkouts/turso-455557cd4a2364c7/2d78b53/core/mvcc/database/checkpoint_state_machine.rs:216:71 | 216 | .is_some_and(|txid_max_old| b <= txid_max_old.into()) | -- ^^^^ | | | type must be known at this point | = note: multiple `impl`s satisfying `u64: PartialOrd<_>` found in the following crates: `candid`, `core`: - impl PartialOrd for u64; - impl PartialOrd<candid::types::number::Int> for u64; - impl PartialOrd<candid::types::number::Nat> for u64; help: try using a fully qualified path to specify the expected types | 216 - .is_some_and(|txid_max_old| b <= txid_max_old.into()) 216 + .is_some_and(|txid_max_old| b <= <std::num::NonZero<u64> as Into<T>>::into(txid_max_old)) | error[E0283]: type annotations needed --> /Users/me/.cargo/git/checkouts/turso-455557cd4a2364c7/2d78b53/core/mvcc/database/checkpoint_state_machine.rs:226:67 | 226 | .is_some_and(|txid_max_old| e <= txid_max_old.into()) | -- ^^^^ | | | type must be known at this point | = note: multiple `impl`s satisfying `u64: PartialOrd<_>` found in the following crates: `candid`, `core`: - impl PartialOrd for u64; - impl PartialOrd<candid::types::number::Int> for u64; - impl PartialOrd<candid::types::number::Nat> for u64; help: try using a fully qualified path to specify the expected types | 226 - .is_some_and(|txid_max_old| e <= txid_max_old.into()) 226 + .is_some_and(|txid_max_old| e <= <std::num::NonZero<u64> as Into<T>>::into(txid_max_old)) | error[E0283]: type annotations needed --> /Users/me/.cargo/git/checkouts/turso-455557cd4a2364c7/2d78b53/core/mvcc/database/checkpoint_state_machine.rs:242:90 | 242 | .is_none_or(|txid_max_old| begin_ts.is_some_and(|b| b > txid_max_old.into())); | - ^^^^ | | | type must be known at this point | = note: multiple `impl`s satisfying `u64: PartialOrd<_>` found in the following crates: `candid`, `core`: - impl PartialOrd for u64; - impl PartialOrd<candid::types::number::Int> for u64; - impl PartialOrd<candid::types::number::Nat> for u64; help: try using a fully qualified path to specify the expected types | 242 - .is_none_or(|txid_max_old| begin_ts.is_some_and(|b| b > txid_max_old.into())); 242 + .is_none_or(|txid_max_old| begin_ts.is_some_and(|b| b > <std::num::NonZero<u64> as Into<T>>::into(txid_max_old))); | For more information about this error, try `rustc --explain E0283`. error: could not compile `turso_core` (lib) due to 3 previous errors ``` </details> ## Description of AI Usage All code is hand-written Reviewed-by: Nikita Sivukhin (@sivukhin) Closes #4293	2025-12-19 14:46:33 -05:00
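The inference failure described in this merge can be reproduced in miniature. The sketch below is an illustration only, not the code from `checkpoint_state_machine.rs`: `Nat` is a hypothetical local stand-in for a downstream type such as `candid`'s `Nat`, and `begins_at_or_before` is an invented helper name. Once a second `PartialOrd<_>` impl for `u64` is in scope, the compiler should no longer be able to pick the target type of `.into()`, whereas `u64::from` names it explicitly.

```rust
use std::cmp::Ordering;
use std::num::NonZeroU64;

// Hypothetical stand-in for a downstream type that can be compared with u64.
struct Nat(u64);

impl PartialEq<Nat> for u64 {
    fn eq(&self, other: &Nat) -> bool {
        *self == other.0
    }
}

// With this impl in scope, `u64: PartialOrd<_>` has more than one candidate.
impl PartialOrd<Nat> for u64 {
    fn partial_cmp(&self, other: &Nat) -> Option<Ordering> {
        Some(self.cmp(&other.0))
    }
}

// Invented helper: true when `b` is at or before the checkpointed maximum.
fn begins_at_or_before(b: u64, txid_max_old: Option<NonZeroU64>) -> bool {
    // Writing `b <= t.into()` here is expected to fail with E0283, because
    // the right-hand side could be `u64` or `Nat`. `u64::from` fixes the type.
    txid_max_old.is_some_and(|t| b <= u64::from(t))
}
```

The explicit `u64::from(t)` form does not depend on which other crates happen to implement comparison traits for `u64`, which is why it stays buildable in downstream dependency graphs.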
Pere Diaz Bou	2dec8a6e00	Merge ' core/execute: use same code for generating rowid in mvcc as in btree' from Pere Diaz Bou ## Description `op_new_rowid` already has logic that encodes how to get the last rowid correctly; this PR takes advantage of it in MVCC too, with a few `lock` guards in place so that rowids do not collide ## Motivation and context It is hard to maintain two ways of getting a new rowid, so this tries to fold mvcc with btree ## Description of AI Usage None Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4228	2025-12-19 16:45:27 +01:00
Pere Diaz Bou	95cc293508	core/mvcc: Fix typos and use turso_assert Use turso_assert to check cursor state, clean up lock handling, and rename fields from last_rowid/intialized to max_rowid/initialized	2025-12-19 12:52:01 +01:00
Pekka Enberg	edd45ff7b8	Improve MVCC DX by dropping `--experimental-mvcc` flag The DX is right now pretty terrible: ``` penberg@vonneumann turso % cargo run -- hello.db Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.15s Running `target/debug/tursodb hello.db` Turso v0.4.0-pre.18 Enter ".help" for usage hints. Did you know that Turso supports live materialized views? Type .manual materialized-views to learn more. This software is in BETA, use caution with production data and ensure you have backups. turso> PRAGMA journal_mode = 'experimental_mvcc'; × Invalid argument supplied: MVCC is not enabled. Enable it with `--experimental-mvcc` flag in the CLI or by setting the MVCC option in `DatabaseOpts` turso> ``` To add insult to the injury, many SDKs don't even have a way to enable MVCC via database options. Therefore, let's remove the flag altogether.	2025-12-19 12:59:42 +02:00
Pere Diaz Bou	5b4bcfa508	core/mvcc: fix clippy issues	2025-12-19 11:26:05 +01:00
Pere Diaz Bou	77ab8c9085	core/mvcc: introduce RowidAllocator RowidAllocator is a centralized lock protected rowid allocator that is used to ask for a new rowid. The idea is to have single atomic i64 that we can increment when we get asked to allocate a new rowid.	2025-12-19 10:43:22 +01:00
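The "single atomic i64 that we can increment" idea from this commit can be sketched as follows. This is a minimal illustration under assumptions, not the actual `RowidAllocator`: the type name `RowidAllocatorSketch`, its field, and its methods are hypothetical, and the commit's lock protection is not modelled here.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Hypothetical sketch: one shared counter holding the largest rowid handed out.
struct RowidAllocatorSketch {
    max_rowid: AtomicI64,
}

impl RowidAllocatorSketch {
    // Start from the largest rowid already present in the table.
    fn new(max_rowid: i64) -> Self {
        Self { max_rowid: AtomicI64::new(max_rowid) }
    }

    // Atomically bump the counter and return the new rowid,
    // or None when the counter is already at i64::MAX.
    fn allocate(&self) -> Option<i64> {
        self.max_rowid
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| cur.checked_add(1))
            .ok()
            .map(|prev| prev + 1)
    }
}
```

Using `fetch_update` with `checked_add` keeps the overflow check and the increment in one atomic step, so two callers should never receive the same rowid.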
Pere Diaz Bou	3141c067ed	tcl: exclude partial index for mvcc tcl tests	2025-12-19 10:43:22 +01:00
Pere Diaz Bou	ad88e56e86	core/execute: use same code for generating rowid in mvcc	2025-12-19 10:43:22 +01:00
Elina	694a030ca7	cargo fmt on correct rust version	2025-12-19 17:41:44 +08:00
Elina	e72a26aff9	Use u64::from instead of .into() In certain edge cases, Into::into type inference fails and causes library to fail to build in downstream crates	2025-12-19 17:35:04 +08:00
pedrocarlo	27cd107185	add `btree_resident` field in RowVersion to track if the insert and deletion is originally from a btree	2025-12-17 10:55:25 -03:00
pedrocarlo	87dd5ce455	we should not use `is_none_or` when checking if row version exists in the DB file, as if self.checkpointed_txid_max_old == None it could mean the MvStore recently initialized or we are dealing with an empty database. In both cases, we cannot assert the row version exists in the db file	2025-12-17 10:55:25 -03:00
pedrocarlo	257dc5ad09	do not initiate a write transaction for journal mode + checkpoint before changing mode	2025-12-17 10:55:24 -03:00
pedrocarlo	323f1152d8	emit Checkpoint when setting new journal mode + adjust init code to correctly open the mv store	2025-12-17 10:55:24 -03:00
Pekka Enberg	4bad0f8d59	core: Make Pager thread-safe	2025-12-16 19:50:03 +02:00
Pere Diaz Bou	f69cee9dd9	core/mvcc/cursor: return previous max id	2025-12-15 16:15:36 +01:00
Jussi Saurio	8d6a543987	Merge 'core/mvcc/cursor: add missing reset state in `next`' from Pere Diaz Bou <!-- CURSOR_SUMMARY --> > [!NOTE] > Clears `state` in `MvccLazyCursor` when `next()` reaches `End` and when `prev()` reaches `BeforeFirst`, preventing stale iteration state. > > - MVCC Cursor (`core/mvcc/cursor.rs`): > - `next()`: when reaching `CursorPosition::End`, now resets `self.state` before returning. > - `prev()`: when reaching `CursorPosition::BeforeFirst`, now resets `self.state` before returning. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `dc8ae2c2e4`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> Closes #4161	2025-12-11 18:36:08 +02:00
Jussi Saurio	9e352697b4	Merge 'Get mutable reference to table in Schema so we can modify it with `Arc::make_mut`' from Pedro Muniz ## Description If we use `btree()`, it creates a clone of the value inside the `Table`, which means the reference count is > 1. This means that if you try to use `Arc::make_mut`, it will just clone the Btree table, and never change the actual schema ## Motivation and context My fuzzer was failing in #4074 with negative rootpages ## AI Disclosure None Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4181	2025-12-11 18:35:55 +02:00
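The `Arc::make_mut` behaviour behind this fix can be shown with the standard library alone. In the sketch below, `TableSketch` and both function names are hypothetical; only `Arc::clone` and `Arc::make_mut` are real APIs. When the `Arc` is shared, `make_mut` clones the inner value and mutates the copy, so the original owner never sees the change.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a table entry stored in the schema.
#[derive(Clone)]
struct TableSketch {
    root_page: i64,
}

// Mutating through a cloned handle: the refcount is 2, so make_mut copies.
// Returns (value seen by the schema, value seen by the handle).
fn mutate_through_clone() -> (i64, i64) {
    let schema_entry = Arc::new(TableSketch { root_page: -1 });
    let mut handle = Arc::clone(&schema_entry);
    Arc::make_mut(&mut handle).root_page = 7;
    (schema_entry.root_page, handle.root_page)
}

// Mutating through the schema's own Arc: the refcount is 1, so no copy is made.
fn mutate_in_place() -> i64 {
    let mut schema_entry = Arc::new(TableSketch { root_page: -1 });
    Arc::make_mut(&mut schema_entry).root_page = 7;
    schema_entry.root_page
}
```

In the first function the schema still holds `-1`, which matches the symptom reported in the PR (negative root pages surviving an update).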
pedrocarlo	73d06ade02	also check for None `checkpointed_txid_max_old` when determining if `RowVersion` exists in the Db	2025-12-11 13:20:36 -03:00
pedrocarlo	471231c09f	Get mutable reference to table in Schema so we can modify it with `Arc::make_mut`	2025-12-11 13:19:34 -03:00
Pere Diaz Bou	dc8ae2c2e4	core/mvcc/cursor: add missing reset state in prev	2025-12-11 12:46:05 +01:00
pedrocarlo	f593d74a57	Checkpoint state machine should consider TxId that are >= than checkpointed_txid_max_old when checkpointing	2025-12-10 16:50:38 -03:00
Jussi Saurio	83de40cea3	Merge 'core/mvcc/cursor: implement count' from Pere Diaz Bou Sadly, due to how we use dual cursors, we cannot use optimization under btree cursor to count rows without first checking if the row in btree is valid. So this is a slow count implementation. <!-- CURSOR_SUMMARY --> --- > [!NOTE] > Adds a state-driven `count()` implementation that iterates via dual cursors, validating B-Tree keys and tallying visible rows. > > - Core (MVCC Cursor): > - Counting: > - Implement `count()` using a small state machine (`CountState`) to iterate (`rewind` → `next`) and tally rows, ensuring B-Tree keys are validated via existing dual-cursor logic. > - State Management: > - Add `CountState` enum and `count_state` field to `MvccLazyCursor` to keep count logic isolated from other cursor states. > - Initialize `count_state` in `new()`. > > <sup>Written by [Cursor Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit `356ea0869d`. This will update automatically on new commits. Configure [here](https://cursor.com/dashboard?tab=bugbot).</sup> <!-- /CURSOR_SUMMARY --> Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4160	2025-12-10 19:05:01 +02:00
pedrocarlo	8f28aafa3c	apply claude fix for `OpenDup`	2025-12-10 13:33:48 -03:00
Pere Diaz Bou	ee6c415a4d	core/mvcc/cursor: add missing reset state in `next`	2025-12-10 16:50:15 +01:00
Pere Diaz Bou	356ea0869d	core/mvcc/cursor: implement count	2025-12-10 16:41:43 +01:00
pedrocarlo	ee3d2bc863	maybe write page1 when reading an already initialized header	2025-12-10 11:33:49 -03:00
pedrocarlo	2578723939	initialize global header on bootstrap	2025-12-10 00:48:32 -03:00
Jussi Saurio	2aefb4ee8c	Merge 'fix/btree: disable move_to_rightmost optimization with triggers' from Jussi Saurio ## Closes - Closes #4017 - Addresses #4043; this now fails with `Page cache is full` with 100k pages, which is a separate non-corruption issue. Modifying max page cache size to be 10 million pages makes it not finish at all. We should modify the issue after this is merged to reflect what the new problem is. The queries in the issue (#4043) create a WAL that is at least 1.7 GB in size ## Background We have an optimization in the btree where if: - We want to reach the rightmost leaf page, and - We know the rightmost page and are already on it Then we can skip a seek. ## Problem The problem is this optimization should NEVER be used in cases where we cannot be sure that the btree wasn't modified from under us e.g. by a trigger subprogram. ## Fix Hence, disable it when we are executing a parent program that has triggers which will fire. ## AI Disclosure No AI was used for this PR. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #4135	2025-12-09 10:02:11 +02:00
Jussi Saurio	0156fa55b6	Merge ' tcl,makefile: add tcl test infraestructure for mvcc ' from Pere Diaz Bou The idea is to have a custom `all-mvcc.test` so we can add `.test` files that we expect to work with MVCC. In cases where files are not enough we have `is_turso_mvcc` to check if we want to run a test. For example we skip partial index tests like this: ``` if {![is_turso_mvcc]} { do_execsql_test_on_specific_db {:memory:} autoinc-conflict-on-nothing { CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, k TEXT); CREATE UNIQUE INDEX idx_k_partial ON t(k) WHERE id > 1; INSERT INTO t (k) VALUES ('a'); INSERT INTO t (k) VALUES ('a'); INSERT INTO t (k) VALUES ('a') ON CONFLICT DO NOTHING; INSERT INTO t (k) VALUES ('b'); SELECT * FROM t ORDER BY id; } {1|a 2|a 4|b} } ``` `test-mvcc-compat` is not run under CI for now as we need to fix every test anyways so no point in making every PR fail for now. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4139	2025-12-09 09:40:20 +02:00
Pere Diaz Bou	42c070c428	core/mvcc: fix bounds check new rowid fixes `autoinc-fail-on-max-rowid` for now, but we still do not support random rowids since we need to add locking to the regular `OP NewRowId`	2025-12-08 18:11:07 +01:00
Pere Diaz Bou	7d1e838ab2	core/mvcc: enable test_cursor_with_btree_and_fuzz	2025-12-08 18:05:45 +01:00
Pere Diaz Bou	4bf5099db9	core/mvcc/tests: clippy	2025-12-08 14:41:58 +01:00
Pere Diaz Bou	d2b2488eaa	core/mvcc/cursor: fix get_next_rowid once again	2025-12-08 14:37:00 +01:00
Pere Diaz Bou	3c2727abe8	core/mvcc/cursor: ignore non visible rows on "last"	2025-12-08 13:45:06 +01:00
Jussi Saurio	12cec1cb70	fix/btree: disable move_to_rightmost optimization with triggers We have an optimization in the btree where if: - We want to reach the rightmost leaf page, and - We know the rightmost page and are already on it Then we can skip a seek. The problem is this optimization should NEVER be used in cases where we cannot be sure that the btree wasn't modified from under us e.g. by a trigger subprogram. Hence, disable it when we are executing a parent program that has triggers which will fire. No AI was used for this PR.	2025-12-08 14:30:39 +02:00
Pere Diaz Bou	4657c947eb	core/mvcc/tests: un-ignore seek tests	2025-12-08 13:00:13 +01:00
Jussi Saurio	826ca4d44d	chore: remove experimental_indexes feature flags	2025-12-08 13:00:37 +02:00
Jussi Saurio	eb782ce2d4	fix/mvcc: seek() must seek from both mv store and btree for example, upon opening an existing database, all the rows are in the btree, so if we seek only from MV store, we won't find anything. ergo: we must look from both the mv store and the btree. if we are iterating forwards, the smallest of the two results is where we land, and vice versa for backwards iteration. initially this implementation used blocking IO but was refactored to use state machines after the rest of the Cursor methods in the MVCC cursor module were refactored to do that too. --- this PR was initially almost entirely written using Claude Code + Opus 4.5, but heavily manually cleaned up as the AI made the state machine refactor far too complicated.	2025-12-05 11:53:16 +02:00
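The landing rule described in this commit (seek in both sources, take the smaller result when iterating forwards and the larger when iterating backwards) can be sketched as a pure function. `merge_seek_results` and its parameters are hypothetical names; the real cursor works on row versions and state machines rather than plain integers.

```rust
// Hypothetical sketch: combine the rowid found in the MV store with the one
// found in the btree. `None` means that source found nothing.
fn merge_seek_results(mv_store: Option<i64>, btree: Option<i64>, forward: bool) -> Option<i64> {
    match (mv_store, btree) {
        // Both sources found a row: land on the nearest one in scan direction.
        (Some(a), Some(b)) => Some(if forward { a.min(b) } else { a.max(b) }),
        // Only one source (or neither) found a row: use whatever exists.
        (a, b) => a.or(b),
    }
}
```

The second match arm covers the case called out in the commit message: right after opening an existing database, every row lives in the btree, so the MV store side returns nothing and the btree result is used as-is.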
Pere Diaz Bou	3cca372a97	core/mvcc: clippy and some renaming	2025-12-04 19:32:43 +01:00
Pere Diaz Bou	ba1dff71d7	core/mvcc: get_next_rowid return on last io	2025-12-04 19:32:43 +01:00
Pere Diaz Bou	3b10370a30	core/mvcc: reset state on prev	2025-12-04 19:31:41 +01:00
Pere Diaz Bou	7853a47292	core/mvcc/cursor: prev	2025-12-04 19:31:41 +01:00
Pere Diaz Bou	e3cfbfbcd8	core/mvcc: state machine for `next` and `rewind`	2025-12-04 19:31:41 +01:00
Pere Diaz Bou	6a88bc79aa	core/mvcc/cursor: add necessary structs for different state machines	2025-12-04 19:31:41 +01:00
Pere Diaz Bou	65623b58a8	core/mvcc: test seek after checkpoint	2025-12-04 19:31:41 +01:00
Jussi Saurio	25669c7106	fix/mvcc: always reinitialize index iterator on seek if this is not done, a stale iterator from a previous seek will be used, returning incorrect results.	2025-12-03 16:24:48 +02:00
pedrocarlo	02c2a63d8e	use ArcSwap to store MvStore	2025-12-03 10:09:04 -03:00
Jussi Saurio	6b4c6fc93e	fix/mvcc: use existing schema object in mvcc bootstrap existing bootstrap had a half-baked implementation for getting root pages from the DB file via a Statement, but it's unnecessary because we have already parsed the schema at that point, so we can just use it directly.	2025-12-03 12:56:04 +02:00
Jussi Saurio	830cb7121b	Merge 'Mvcc bugfixes' from Jussi Saurio - Do not assume all cursors are mvcc cursors in `op_new_rowid` - Fix bug in logical log reader where it would ignore already buffered bytes and try to read too much past the end of the file - Add safeguard to the same reader logic where it issues multiple reads if it gets a short read for some reason Both of these issues were found using `cd simulator && cargo run -- --profile simple_mvcc`, using `num_connections=1` Closes #4077	2025-12-03 12:55:58 +02:00
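The short-read safeguard mentioned in this merge follows a common pattern: keep issuing reads until the buffer is full or the source is exhausted. The sketch below uses only `std::io::Read`; `read_until_full` is a hypothetical helper name and is not taken from the logical log reader itself, which uses its own IO layer.

```rust
use std::io::{self, Read};

// Hypothetical helper: fill `buf` as far as possible, tolerating short reads.
// Returns the number of bytes actually read, which is less than `buf.len()`
// only when the end of the input was reached.
fn read_until_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input: do not try to read past it
            Ok(n) => filled += n, // short read: loop and ask for the rest
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

Returning the byte count instead of failing at end of input lets the caller decide whether a partial record at the end of the file is an error.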


510 commits