Commit graph

510 commits

Author SHA1 Message Date
Preston Thorpe
2e794934dd
Merge 'Use u64::from instead of .into()' from Elina
## Description
In certain edge cases, `Into::into` type inference fails and causes the
library to fail to build in downstream crates.
## Motivation and context
See https://github.com/ranile/turso-compile-fail
In this case, patching chrono to work with `ic-cdk` fails compilation of
turso. `ic-cdk` adds a dependency on `candid`, which provides some trait
implementations that make Into::into type inference break.
<details><summary>Error message</summary>
```
error[E0283]: type annotations needed
   --> /Users/me/.cargo/git/checkouts/turso-455557cd4a2364c7/2d78b53/core/mvcc/database/checkpoint_state_machine.rs:216:71
    |
216 |                         .is_some_and(|txid_max_old| b <= txid_max_old.into())
    |                                                       --              ^^^^
    |                                                       |
    |                                                       type must be known at this point
    |
    = note: multiple `impl`s satisfying `u64: PartialOrd<_>` found in the following crates: `candid`, `core`:
            - impl PartialOrd for u64;
            - impl PartialOrd<candid::types::number::Int> for u64;
            - impl PartialOrd<candid::types::number::Nat> for u64;
help: try using a fully qualified path to specify the expected types
    |
216 -                         .is_some_and(|txid_max_old| b <= txid_max_old.into())
216 +                         .is_some_and(|txid_max_old| b <= <std::num::NonZero<u64> as Into<T>>::into(txid_max_old))
    |

error[E0283]: type annotations needed
   --> /Users/me/.cargo/git/checkouts/turso-455557cd4a2364c7/2d78b53/core/mvcc/database/checkpoint_state_machine.rs:226:67
    |
226 |                     .is_some_and(|txid_max_old| e <= txid_max_old.into())
    |                                                   --              ^^^^
    |                                                   |
    |                                                   type must be known at this point
    |
    = note: multiple `impl`s satisfying `u64: PartialOrd<_>` found in the following crates: `candid`, `core`:
            - impl PartialOrd for u64;
            - impl PartialOrd<candid::types::number::Int> for u64;
            - impl PartialOrd<candid::types::number::Nat> for u64;
help: try using a fully qualified path to specify the expected types
    |
226 -                     .is_some_and(|txid_max_old| e <= txid_max_old.into())
226 +                     .is_some_and(|txid_max_old| e <= <std::num::NonZero<u64> as Into<T>>::into(txid_max_old))
    |

error[E0283]: type annotations needed
   --> /Users/me/.cargo/git/checkouts/turso-455557cd4a2364c7/2d78b53/core/mvcc/database/checkpoint_state_machine.rs:242:90
    |
242 |                     .is_none_or(|txid_max_old| begin_ts.is_some_and(|b| b > txid_max_old.into()));
    |                                                                           -              ^^^^
    |                                                                           |
    |                                                                           type must be known at this point
    |
    = note: multiple `impl`s satisfying `u64: PartialOrd<_>` found in the following crates: `candid`, `core`:
            - impl PartialOrd for u64;
            - impl PartialOrd<candid::types::number::Int> for u64;
            - impl PartialOrd<candid::types::number::Nat> for u64;
help: try using a fully qualified path to specify the expected types
    |
242 -                     .is_none_or(|txid_max_old| begin_ts.is_some_and(|b| b > txid_max_old.into()));
242 +                     .is_none_or(|txid_max_old| begin_ts.is_some_and(|b| b > <std::num::NonZero<u64> as Into<T>>::into(txid_max_old)));
    |

For more information about this error, try `rustc --explain E0283`.
error: could not compile `turso_core` (lib) due to 3 previous errors
```
</details>
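
As a minimal sketch of the fix (the function name and signature are hypothetical, not taken from `checkpoint_state_machine.rs`), naming the target type with `u64::from` leaves nothing for inference to resolve on the right-hand side of the comparison, so extra `PartialOrd<_>` impls contributed by another crate cannot make it ambiguous. The failure itself is not reproduced here, since that would require a crate such as `candid` in the dependency graph.

```rust
use std::num::NonZeroU64;

// Hypothetical helper mirroring the shape of the failing closures.
// `b <= t.into()` asks the compiler to infer the target of `into()` from
// `u64: PartialOrd<_>`; that is ambiguous once a second impl is visible.
// `u64::from(t)` fixes the target type explicitly.
fn at_or_before_checkpoint(b: u64, txid_max_old: Option<NonZeroU64>) -> bool {
    txid_max_old.is_some_and(|t| b <= u64::from(t))
}
```

With this form the comparison is always `u64 <= u64`, regardless of which other crates are compiled alongside.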
## Description of AI Usage
All code is hand-written

Reviewed-by: Nikita Sivukhin (@sivukhin)

Closes #4293
2025-12-19 14:46:33 -05:00
Pere Diaz Bou
2dec8a6e00
Merge ' core/execute: use same code for generating rowid in mvcc as in btree' from Pere Diaz Bou
## Description
`op_new_rowid` already contains the logic for obtaining the last rowid
correctly. This PR takes advantage of it in MVCC too, with a few `lock`
guards in place so that rowids do not collide.
## Motivation and context
It is hard to maintain two ways of getting a new rowid so this tries to
fold mvcc with btree
## Description of AI Usage
None

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4228
2025-12-19 16:45:27 +01:00
Pere Diaz Bou
95cc293508 core/mvcc: Fix typos and use turso_assert
Use turso_assert to check cursor state, clean up lock handling,
and rename fields from last_rowid/intialized to max_rowid/initialized
2025-12-19 12:52:01 +01:00
Pekka Enberg
edd45ff7b8 Improve MVCC DX by dropping --experimental-mvcc flag
The DX is right now pretty terrible:

```
penberg@vonneumann turso % cargo run -- hello.db
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.15s
     Running `target/debug/tursodb hello.db`
Turso v0.4.0-pre.18
Enter ".help" for usage hints.
Did you know that Turso supports live materialized views? Type .manual materialized-views to learn more.
This software is in BETA, use caution with production data and ensure you have backups.
turso> PRAGMA journal_mode = 'experimental_mvcc';
  × Invalid argument supplied: MVCC is not enabled. Enable it with `--experimental-mvcc` flag in the CLI or by setting the MVCC option in `DatabaseOpts`

turso>
```

To add insult to injury, many SDKs don't even have a way to enable
MVCC via database options. Therefore, let's remove the flag altogether.
2025-12-19 12:59:42 +02:00
Pere Diaz Bou
5b4bcfa508 core/mvcc: fix clippy issues 2025-12-19 11:26:05 +01:00
Pere Diaz Bou
77ab8c9085 core/mvcc: introduce RowidAllocator
RowidAllocator is a centralized, lock-protected rowid allocator that is
used to ask for a new rowid. The idea is to have a single atomic i64 that
we can increment when we get asked to allocate a new rowid.
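
The idea of a single atomic counter can be sketched roughly as follows. This is an illustration only: the type name `RowidAllocatorSketch`, its field, and its methods are hypothetical and do not reflect the actual `RowidAllocator` in `core/mvcc`, which also involves lock handling not shown here.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Hypothetical stand-in for a centralized rowid allocator.
pub struct RowidAllocatorSketch {
    // Largest rowid handed out so far.
    max_rowid: AtomicI64,
}

impl RowidAllocatorSketch {
    pub fn new(current_max: i64) -> Self {
        Self { max_rowid: AtomicI64::new(current_max) }
    }

    // Atomically bumps the counter and returns the new rowid.
    // Returns None instead of overflowing once i64::MAX has been reached.
    pub fn allocate(&self) -> Option<i64> {
        self.max_rowid
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| cur.checked_add(1))
            .ok()
            .map(|prev| prev + 1)
    }
}
```

Because the increment is a single atomic update, two callers can never receive the same rowid from this sketch.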
2025-12-19 10:43:22 +01:00
Pere Diaz Bou
3141c067ed tcl: exclude partial index for mvcc tcl tests 2025-12-19 10:43:22 +01:00
Pere Diaz Bou
ad88e56e86 core/execute: use same code for generating rowid in mvcc 2025-12-19 10:43:22 +01:00
Elina
694a030ca7
cargo fmt on correct rust version 2025-12-19 17:41:44 +08:00
Elina
e72a26aff9
Use u64::from instead of .into()
In certain edge cases, `Into::into` type inference fails and causes the library to fail to build in downstream crates
2025-12-19 17:35:04 +08:00
pedrocarlo
27cd107185 add btree_resident field in RowVersion to track if the insert and deletion is originally from a btree 2025-12-17 10:55:25 -03:00
pedrocarlo
87dd5ce455 we should not use is_none_or when checking whether a row version exists in
the DB file: if self.checkpointed_txid_max_old == None, it could mean that
the MvStore was recently initialized or that we are dealing with an empty
database. In both cases, we cannot assert that the row version exists in the
db file
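
The difference between the two `Option` combinators can be sketched as below. The function names, the plain `u64` types, and the direction of the comparison are assumptions made for illustration; only the `None` handling is the point.

```rust
// Hypothetical check: treats a missing checkpoint watermark as "exists".
fn exists_in_db_file_permissive(begin_ts: u64, checkpointed_txid_max: Option<u64>) -> bool {
    // is_none_or returns true when the Option is None.
    checkpointed_txid_max.is_none_or(|max| begin_ts <= max)
}

// Hypothetical check: a missing watermark means nothing can be asserted.
fn exists_in_db_file_strict(begin_ts: u64, checkpointed_txid_max: Option<u64>) -> bool {
    // is_some_and returns false when the Option is None.
    checkpointed_txid_max.is_some_and(|max| begin_ts <= max)
}
```

With a freshly initialized store or an empty database (`None`), the permissive variant claims the row version is in the file while the strict variant does not.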
2025-12-17 10:55:25 -03:00
pedrocarlo
257dc5ad09 do not initiate a write transaction for journal mode + checkpoint before changing mode 2025-12-17 10:55:24 -03:00
pedrocarlo
323f1152d8 emit Checkpoint when setting new journal mode + adjust init code to correctly open the mv store 2025-12-17 10:55:24 -03:00
Pekka Enberg
4bad0f8d59 core: Make Pager thread-safe 2025-12-16 19:50:03 +02:00
Pere Diaz Bou
f69cee9dd9 core/mvcc/cursor: return previous max id 2025-12-15 16:15:36 +01:00
Jussi Saurio
8d6a543987
Merge 'core/mvcc/cursor: add missing reset state in next' from Pere Diaz Bou
<!-- CURSOR_SUMMARY -->
> [!NOTE]
> Clears `state` in `MvccLazyCursor` when `next()` reaches `End` and
when `prev()` reaches `BeforeFirst`, preventing stale iteration state.
>
> - **MVCC Cursor (`core/mvcc/cursor.rs`)**:
>   - `next()`: when reaching `CursorPosition::End`, now resets
`self.state` before returning.
>   - `prev()`: when reaching `CursorPosition::BeforeFirst`, now resets
`self.state` before returning.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
dc8ae2c2e4. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Closes #4161
2025-12-11 18:36:08 +02:00
Jussi Saurio
9e352697b4
Merge 'Get mutable reference to table in Schema so we can modify it with Arc::make_mut' from Pedro Muniz
## Description
If we use `btree()`, it creates a clone of the value inside the `Table`,
which means the reference count is > 1. As a result, if you try
to use `Arc::make_mut`, it will just clone the Btree table and never
change the actual schema.
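
The standard-library behaviour behind this can be shown in isolation. `TableSketch` and `demo` are hypothetical names for illustration and are not the schema types used in the codebase.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a table definition stored in the schema.
#[derive(Clone, Debug, PartialEq)]
struct TableSketch {
    root_page: i64,
}

// Returns (root_page seen by the schema after mutating a shared handle,
//          root_page seen by the schema after mutating it while unique).
fn demo() -> (i64, i64) {
    let mut schema_entry = Arc::new(TableSketch { root_page: -1 });

    // A second handle raises the strong count to 2.
    let mut handle = Arc::clone(&schema_entry);
    // make_mut clones the inner value because the Arc is shared,
    // so only `handle` observes the change.
    Arc::make_mut(&mut handle).root_page = 42;
    let after_shared_write = schema_entry.root_page;

    // Once the Arc is unique again, make_mut mutates in place.
    drop(handle);
    Arc::make_mut(&mut schema_entry).root_page = 42;

    (after_shared_write, schema_entry.root_page)
}
```

The first write is lost from the schema's point of view, which matches the symptom described above.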
## Motivation and context
My fuzzer was failing in #4074 with negative rootpages
## AI Disclosure
None

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4181
2025-12-11 18:35:55 +02:00
pedrocarlo
73d06ade02 also check for None checkpointed_txid_max_old when determining if RowVersion exists in the Db 2025-12-11 13:20:36 -03:00
pedrocarlo
471231c09f Get mutable reference to table in Schema so we can modify it with
`Arc::make_mut`
2025-12-11 13:19:34 -03:00
Pere Diaz Bou
dc8ae2c2e4 core/mvcc/cursor: add missing reset state in prev 2025-12-11 12:46:05 +01:00
pedrocarlo
f593d74a57 Checkpoint state machine should consider TxIds that are >=
checkpointed_txid_max_old when checkpointing
2025-12-10 16:50:38 -03:00
Jussi Saurio
83de40cea3
Merge 'core/mvcc/cursor: implement count' from Pere Diaz Bou
Sadly, because of how we use dual cursors, we cannot use the btree
cursor's count optimization without first checking whether each row in the
btree is valid. So this is a slow count implementation.
<!-- CURSOR_SUMMARY -->
---
> [!NOTE]
> Adds a state-driven `count()` implementation that iterates via dual
cursors, validating B-Tree keys and tallying visible rows.
>
> - **Core (MVCC Cursor)**:
>   - **Counting**:
>     - Implement `count()` using a small state machine (`CountState`)
to iterate (`rewind` → `next`) and tally rows, ensuring B-Tree keys are
validated via existing dual-cursor logic.
>   - **State Management**:
>     - Add `CountState` enum and `count_state` field to
`MvccLazyCursor` to keep count logic isolated from other cursor states.
>     - Initialize `count_state` in `new()`.
>
> <sup>Written by [Cursor
Bugbot](https://cursor.com/dashboard?tab=bugbot) for commit
356ea0869d. This will update automatically
on new commits. Configure
[here](https://cursor.com/dashboard?tab=bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4160
2025-12-10 19:05:01 +02:00
pedrocarlo
8f28aafa3c apply claude fix for OpenDup 2025-12-10 13:33:48 -03:00
Pere Diaz Bou
ee6c415a4d core/mvcc/cursor: add missing reset state in next 2025-12-10 16:50:15 +01:00
Pere Diaz Bou
356ea0869d core/mvcc/cursor: implement count 2025-12-10 16:41:43 +01:00
pedrocarlo
ee3d2bc863 maybe write page1 when reading an already initialized header 2025-12-10 11:33:49 -03:00
pedrocarlo
2578723939 initialize global header on bootstrap 2025-12-10 00:48:32 -03:00
Jussi Saurio
2aefb4ee8c
Merge 'fix/btree: disable move_to_rightmost optimization with triggers' from Jussi Saurio
## Closes
- Closes #4017
- Addresses #4043; this now fails with `Page cache is full` with 100k
pages, which is a separate non-corruption issue. Modifying max page
cache size to be 10 million pages makes it not finish at all. We should
modify the issue after this is merged to reflect what the new problem
is. The queries in the issue (#4043) create a WAL that is at least 1.7
GB in size
## Background
We have an optimization in the btree where if:
- We want to reach the rightmost leaf page, and
- We know the rightmost page and are already on it
Then we can skip a seek.
## Problem
The problem is this optimization should NEVER be used in cases where we
cannot be sure that the btree wasn't modified from under us e.g. by a
trigger subprogram.
## Fix
Hence, disable it when we are executing a parent program that has
triggers which will fire.
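
The resulting condition can be sketched as a simple predicate. The struct and field names are hypothetical and only restate the three conditions described above; the real btree code is not shown in this log.

```rust
// Hypothetical summary of the state the btree cursor would consult.
#[derive(Clone, Copy)]
struct SeekContextSketch {
    want_rightmost_leaf: bool,
    already_on_rightmost_leaf: bool,
    // True when the parent program has triggers that will fire,
    // i.e. the btree may be modified from under the cursor.
    parent_has_firing_triggers: bool,
}

// The seek may be skipped only when both original conditions hold
// and no trigger subprogram could have modified the btree.
fn can_skip_seek(ctx: SeekContextSketch) -> bool {
    ctx.want_rightmost_leaf
        && ctx.already_on_rightmost_leaf
        && !ctx.parent_has_firing_triggers
}
```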
## AI Disclosure
No AI was used for this PR.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4135
2025-12-09 10:02:11 +02:00
Jussi Saurio
0156fa55b6
Merge ' tcl,makefile: add tcl test infraestructure for mvcc ' from Pere Diaz Bou
The idea is to have a custom `all-mvcc.test` so we can add `.test` files
that we expect to work with MVCC. In cases where excluding whole files is
not enough, we have `is_turso_mvcc` to check whether we want to run a test.
For example, we skip partial index tests like this:
```
if {![is_turso_mvcc]} {
    do_execsql_test_on_specific_db {:memory:} autoinc-conflict-on-nothing {
        CREATE TABLE t (id INTEGER PRIMARY KEY AUTOINCREMENT, k TEXT);
        CREATE UNIQUE INDEX idx_k_partial ON t(k) WHERE id > 1;
        INSERT INTO t (k) VALUES ('a');
        INSERT INTO t (k) VALUES ('a');
        INSERT INTO t (k) VALUES ('a') ON CONFLICT DO NOTHING;
        INSERT INTO t (k) VALUES ('b');
        SELECT * FROM t ORDER BY id;
    } {1|a 2|a 4|b}
}
```
`test-mvcc-compat` is not run under CI for now: we still need to fix every
test, so there is no point in making every PR fail in the meantime.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4139
2025-12-09 09:40:20 +02:00
Pere Diaz Bou
42c070c428 core/mvcc: fix bounds check new rowid
fixes `autoinc-fail-on-max-rowid` for now, but we still do not support
random rowids since we need to add locking to the regular `OP NewRowId`
2025-12-08 18:11:07 +01:00
Pere Diaz Bou
7d1e838ab2 core/mvcc: enable test_cursor_with_btree_and_fuzz 2025-12-08 18:05:45 +01:00
Pere Diaz Bou
4bf5099db9 core/mvcc/tests: clippy 2025-12-08 14:41:58 +01:00
Pere Diaz Bou
d2b2488eaa core/mvcc/cursor: fix get_next_rowid once again 2025-12-08 14:37:00 +01:00
Pere Diaz Bou
3c2727abe8 core/mvcc/cursor: ignore non visible rows on "last" 2025-12-08 13:45:06 +01:00
Jussi Saurio
12cec1cb70 fix/btree: disable move_to_rightmost optimization with triggers
We have an optimization in the btree where if:

- We want to reach the rightmost leaf page, and
- We know the rightmost page and are already on it

Then we can skip a seek.

The problem is this optimization should NEVER be used in cases where we cannot be sure that the btree wasn't modified from under us e.g. by a trigger subprogram.

Hence, disable it when we are executing a parent program that has triggers which
will fire.

No AI was used for this PR.
2025-12-08 14:30:39 +02:00
Pere Diaz Bou
4657c947eb core/mvcc/tests: un-ignore seek tests 2025-12-08 13:00:13 +01:00
Jussi Saurio
826ca4d44d chore: remove experimental_indexes feature flags 2025-12-08 13:00:37 +02:00
Jussi Saurio
eb782ce2d4 fix/mvcc: seek() must seek from both mv store and btree
for example, upon opening an existing database, all the rows are in
the btree, so if we seek only from MV store, we won't find anything.
ergo: we must look from both the mv store and the btree. if we are
iterating forwards, the smallest of the two results is where we land,
and vice versa for backwards iteration.
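
a rough sketch of how the two seek results could be combined (the function name and the use of plain `i64` rowids are assumptions; the real cursor works through state machines and is not reproduced here):

```rust
// Hypothetical merge of the positions found by the MV store seek
// and the btree seek. None means that source found no matching row.
fn merge_seek_results(mv_store: Option<i64>, btree: Option<i64>, forward: bool) -> Option<i64> {
    match (mv_store, btree) {
        // Both found a row: forward iteration lands on the smaller rowid,
        // backward iteration on the larger one.
        (Some(a), Some(b)) => Some(if forward { a.min(b) } else { a.max(b) }),
        // Otherwise use whichever source found something, if any.
        (a, b) => a.or(b),
    }
}
```

right after opening an existing database the MV store side is `None`, so the btree result is used as is.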

initially this implementation used blocking IO but was refactored to
use state machines after the rest of the Cursor methods in the MVCC cursor
module were refactored to do that too.

---

this PR was initially almost entirely written using Claude Code + Opus 4.5,
but heavily manually cleaned up as the AI made the state machine refactor
far too complicated.
2025-12-05 11:53:16 +02:00
Pere Diaz Bou
3cca372a97 core/mvcc: clippy and some renaming 2025-12-04 19:32:43 +01:00
Pere Diaz Bou
ba1dff71d7 core/mvcc: get_next_rowid return on last io 2025-12-04 19:32:43 +01:00
Pere Diaz Bou
3b10370a30 core/mvcc: reset state on prev 2025-12-04 19:31:41 +01:00
Pere Diaz Bou
7853a47292 core/mvcc/cursor: prev 2025-12-04 19:31:41 +01:00
Pere Diaz Bou
e3cfbfbcd8 core/mvcc: state machine for next and rewind 2025-12-04 19:31:41 +01:00
Pere Diaz Bou
6a88bc79aa core/mvcc/cursor: add necessary structs for different state machines 2025-12-04 19:31:41 +01:00
Pere Diaz Bou
65623b58a8 core/mvcc: test seek after checkpoint 2025-12-04 19:31:41 +01:00
Jussi Saurio
25669c7106 fix/mvcc: always reinitialize index iterator on seek
if this is not done, a stale iterator from a previous seek
will be used, returning incorrect results.
2025-12-03 16:24:48 +02:00
pedrocarlo
02c2a63d8e use ArcSwap to store MvStore 2025-12-03 10:09:04 -03:00
Jussi Saurio
6b4c6fc93e fix/mvcc: use existing schema object in mvcc bootstrap
existing bootstrap had a half-baked implementation for getting
root pages from the DB file via a Statement, but it's unnecessary
because we have already parsed the schema at that point, so we can
just use it directly.
2025-12-03 12:56:04 +02:00
Jussi Saurio
830cb7121b
Merge 'Mvcc bugfixes' from Jussi Saurio
- Do not assume all cursors are mvcc cursors in `op_new_rowid`
- Fix bug in logical log reader where it would ignore already buffered
bytes and try to read too much past the end of the file
- Add safeguard to the same reader logic where it issues multiple reads
if it gets a short read for some reason
Both of these issues were found using `cd simulator && cargo run --
--profile simple_mvcc`, using `num_connections=1`
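
The short-read safeguard can be sketched with the standard `Read` trait. This is a generic illustration, not the logical log reader's code; `read_until_full` is a hypothetical helper name.

```rust
use std::io::{self, Read};

// Keeps issuing reads until `buf` is full or the source reports end of file.
// A single `read` call may return fewer bytes than requested (a short read),
// so one call is not enough to assume the buffer has been filled.
fn read_until_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of file: return what we have
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

The returned count tells the caller how many bytes are actually valid, which also avoids reading past the end of the file.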

Closes #4077
2025-12-03 12:55:58 +02:00