## Description
BTreeCursor sets the null flag to false once `seek` is called. This PR does
the same for MVCC.
## Motivation and context
Some cases in join.test failed because of this bug.
## Description of AI Usage
I asked AI to find the issue, but I ended up showing the agent what it did
wrong and that it should be ashamed.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4296
BusySnapshot indicates that the transaction's snapshot is permanently stale
and must be rolled back. Unlike Busy, retrying with busy_timeout will
never help: the caller must roll back and restart the transaction.
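The distinction above can be sketched as a small decision function. This is only an illustration: `LockError`, `NextStep`, and `next_step` are hypothetical names made up for the sketch, not the actual Turso types.

```rust
/// Hypothetical error kinds, named after the two cases described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LockError {
    /// Transient contention: waiting (e.g. via busy_timeout) may help.
    Busy,
    /// The snapshot is permanently stale: waiting never helps.
    BusySnapshot,
}

/// What a caller should do after seeing one of the errors above.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    RetryStatement,
    RollbackAndRestart,
}

/// Map each error to the only recovery strategy that can succeed.
pub fn next_step(err: LockError) -> NextStep {
    match err {
        LockError::Busy => NextStep::RetryStatement,
        LockError::BusySnapshot => NextStep::RollbackAndRestart,
    }
}
```

Keeping the two cases as separate variants lets callers avoid spinning on a retry loop that can never make progress.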
This PR fixes incorrect conversion from TEXT to INTEGER when text is a
number followed by a trailing non-breaking space.
This happens because `str::trim()` trims non-breaking spaces and other Unicode
whitespace, while SQLite trims only ASCII whitespace.
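A minimal sketch of the difference, assuming the ASCII whitespace set is space, tab, LF, VT, FF, and CR. The helper names (`is_sqlite_space`, `trim_ascii_whitespace`, `text_to_integer_strict`) are hypothetical and are not the functions touched by this PR.

```rust
/// ASCII-only whitespace check (assumed set: space, \t, \n, VT, FF, \r).
fn is_sqlite_space(c: char) -> bool {
    matches!(c, ' ' | '\t' | '\n' | '\u{0B}' | '\u{0C}' | '\r')
}

/// Trim only ASCII whitespace, leaving Unicode whitespace such as U+00A0 in place.
fn trim_ascii_whitespace(s: &str) -> &str {
    s.trim_matches(is_sqlite_space)
}

/// Hypothetical strict conversion: succeeds only if the whole trimmed text is an integer.
fn text_to_integer_strict(s: &str) -> Option<i64> {
    trim_ascii_whitespace(s).parse::<i64>().ok()
}
```

With `str::trim()`, `"42\u{00A0}"` becomes `"42"` and would parse as an integer; with the ASCII-only trim the non-breaking space survives and the strict parse fails.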
Closes: https://github.com/tursodatabase/turso/issues/3679
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3878
## Description
`op_new_rowid` already encodes the logic for getting the last rowid
correctly. This PR takes advantage of it in MVCC too, with a few `lock`
guards in place so that rowids do not collide.
## Motivation and context
It is hard to maintain two ways of getting a new rowid, so this tries to
fold the MVCC path into the btree one.
## Description of AI Usage
None
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4228
The DX is pretty terrible right now:
```
penberg@vonneumann turso % cargo run -- hello.db
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.15s
Running `target/debug/tursodb hello.db`
Turso v0.4.0-pre.18
Enter ".help" for usage hints.
Did you know that Turso supports live materialized views? Type .manual materialized-views to learn more.
This software is in BETA, use caution with production data and ensure you have backups.
turso> PRAGMA journal_mode = 'experimental_mvcc';
× Invalid argument supplied: MVCC is not enabled. Enable it with `--experimental-mvcc` flag in the CLI or by setting the MVCC option in `DatabaseOpts`
turso>
```
To add insult to injury, many SDKs don't even have a way to enable
MVCC via database options. Therefore, let's remove the flag altogether.
RowidAllocator is a centralized, lock-protected rowid allocator that is
used to ask for a new rowid. The idea is to have a single atomic i64 that
we can increment when we get asked to allocate a new rowid.
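A minimal sketch of the single-atomic-counter idea. It is an assumption-laden illustration, not the actual Turso type: the field name, the constructor argument, and the overflow behavior are all hypothetical, and the sketch relies on the atomic alone rather than an additional lock.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical sketch of a shared rowid allocator; not the actual Turso type.
pub struct RowidAllocator {
    /// The most recently handed-out rowid.
    last_rowid: AtomicI64,
}

impl RowidAllocator {
    /// Start from the largest rowid already present in the table (0 if empty).
    pub fn new(max_existing_rowid: i64) -> Self {
        Self { last_rowid: AtomicI64::new(max_existing_rowid) }
    }

    /// Returns the next rowid, or `None` if the counter is already at `i64::MAX`.
    pub fn allocate(&self) -> Option<i64> {
        self.last_rowid
            // Atomically bump the counter unless that would overflow.
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| cur.checked_add(1))
            .ok()
            // `fetch_update` yields the previous value; the new rowid is one past it.
            .map(|prev| prev + 1)
    }
}
```

Because every caller goes through the same counter, two concurrent allocations can never receive the same rowid.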
## Description
Remove sync IO hacks for `op_journal_mode`
Close #4268
## Motivation and context
Remove sync IO hacks so the code is friendlier to WASM.
## Description of AI Usage
AI did the bulk of the refactoring, and I made some adjustments and
trimmed down the implementation.
**Prompt**:
```
if look at @core/storage/journal_mode.rs and `op_journal_mode` in `execute.rs` you will see that we have some blocking io operations with
`pager.io.block` and `program.connection.checkpoint` that also blocks. I want you refactor the code to use state machines similar in nature to how we do it
in many functions in `execute.rs`
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4279
## Description
Closes #4146
## Motivation and context
panics are bad
## AI Disclosure
none used
Reviewed-by: Pedro Muniz (@pedrocarlo)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #4177
Closes #3536
# Description
This PR implements **dynamic journal mode switching** via `PRAGMA
journal_mode`, allowing users to switch between WAL and MVCC modes at
runtime.
### Key Changes
**Core Feature: Journal Mode Switching**
- Added new `JournalMode` module (`core/storage/journal_mode.rs`) to
parse and handle journal mode transitions
- Modified `op_journal_mode` to correctly parse journal modes and update
the database header
- Emit checkpoint when setting a new journal mode to ensure data
consistency
- Added MVCC checkpoint support to `Connection::checkpoint`
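As a rough illustration of the parsing step mentioned above, a journal mode parser could look like the sketch below. The enum, its variants, and the accepted strings are assumptions for illustration (only `experimental_mvcc` and WAL are named in these notes); the real `core/storage/journal_mode.rs` may be structured differently.

```rust
use std::str::FromStr;

/// Hypothetical journal mode enum covering the two modes discussed here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JournalMode {
    Wal,
    ExperimentalMvcc,
}

impl FromStr for JournalMode {
    type Err = String;

    /// Parse a `PRAGMA journal_mode` argument, ignoring case and surrounding spaces.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim().to_ascii_lowercase().as_str() {
            "wal" => Ok(JournalMode::Wal),
            "experimental_mvcc" => Ok(JournalMode::ExperimentalMvcc),
            other => Err(format!("unsupported journal mode: {other}")),
        }
    }
}
```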
**Database Initialization Improvements**
- Read DB header on `Database::open` and simplified `init_pager`
- Made `Version` an enum for better comparison semantics
- Automatically convert legacy SQLite databases to WAL mode
- Ensure DB header is flushed to disk when header changes during open
- Clear page cache after header validation
**Bug Fixes**
- Fixed dirty page invalidation in pager when clearing dirty pages in
page cache
- Fixed `is_none_or` check for row version existence in DB file (handles
MvStore initialization and empty database cases)
- Added `btree_resident` field in `RowVersion` to track if
insert/deletion originated from a btree
**Testing**
- Added fuzz tests for `journal_mode` transitions with database
operations in between
- Added integration tests that switch between the different modes
while checking that the header version is correct
- Added specific regression tests for delete operations lost on
mode switch
- Fixed `index_scan_compound_key_fuzz` to use separate databases for
Turso and SQLite in MVCC mode. Also had to decrease number of rows for
MVCC test, as insert was very slow.
# TODOs
- Remove sync hacks from `op_journal_mode`
- Expand fuzzer with different queries
- Add to Simulator
- Special handling for read-only databases: do not allow any header
changes
# Motivation and context
Make it easy for our users to test MVCC and switch back and forth from it.
# AI Disclosure
Used AI to catch and fix bugs in MVCC, further my understanding of
MVCC, write tests in the `tests` folder, write most of the PR summary,
and write the docs in the `docs/manual.md` file.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4074
This is already done in abort() but if the checkpoint error is
returned directly from commit_tx() instead of in an IO completion,
then the Tx state will be cleared and the DDL change won't be applied.
So, do it explicitly in step_end_write_txn() too.
Fixes this failing seed:
cargo run -- --memory-io --seed 12902559987470385066 --disable-heuristic-shrinking
## Description
This solves #4154, but I don't want to close it with this PR, because it
does not solve the Affinity issue.
We can only iterate over an index if the column collation in the
ORDER BY clause matches the index collation.
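The rule above boils down to an equality check on collations. A minimal sketch, with a hypothetical `Collation` enum and function name (the optimizer's real types are not shown in this description):

```rust
/// SQLite's built-in collating sequences, modeled as a hypothetical enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Collation {
    Binary,
    NoCase,
    Rtrim,
}

/// An index can only provide the ORDER BY ordering if it sorts the column
/// with the same collation that the ORDER BY term asks for.
pub fn index_can_satisfy_order_by(
    index_collation: Collation,
    order_by_collation: Collation,
) -> bool {
    index_collation == order_by_collation
}
```

For an index declared with `COLLATE RTRIM` and a query that orders by `COLLATE BINARY`, the check returns `false`, so the index must not be used for ordering.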
## Motivation and context
Fix a bug in the optimizer
## Description of AI Usage
Used AI to write tests, fuzzers, and help me understand the optimizer
code.
Test prompt:
<details>
can you write tests in tcl that test that the correct collation sequence
is properly maintained.
```
CREATE TABLE "t1" ("c1" TEXT COLLATE RTRIM);
INSERT INTO "t1" VALUES (' ');
CREATE INDEX "i1" ON "t1" ("c1" COLLATE RTRIM DESC);
INSERT INTO "t1" VALUES (1025.1655084065987);
SELECT "c1", typeof(c1) FROM "t1" ORDER BY "c1" COLLATE BINARY DESC, rowid ASC;
```
this is an example of a query that returned incorrect results because of
this
</details>
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4248
## Beef
Centralize all checkpointing into a single state machine.
- Get rid of `wal_checkpoint_start`, `wal_checkpoint_finish`,
`wal_checkpoint`
- Rename `wal_checkpoint` to `blocking_checkpoint`
- Handle truncation and fsyncing in the single checkpoint state machine
and remove separate fsyncing steps from `commit_dirty_pages_inner` -
essentially we were fsyncing the DB file twice
## Review guide
Commit by commit is useful. I generated and cleaned up a larger single
commit with Opus first and then asked it to restructure it as a series
of atomic commits that sum up to an identical diff. The last commit is a
manual fix.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #4255
- not all systems have the cmath functions which we import from system
libraries
- let's use an external implementation only in tests in order to eliminate
precision errors in the differential tests
- https://discord.com/channels/933071162680958986/933071163184283651/1450476358005293147
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #4246
## Description
This PR fixes an underflow issue when updating integer primary keys,
aligning with SQLite's `datatype mismatch` behavior. Previously, updates
would silently reuse the same id in release mode and panic in debug
mode. This PR also changes the `datatype mismatch` errors from parse
errors to runtime errors to match SQLite behavior.
```
turso> create table t(a integer primary key);
turso> insert into t values (-9223372036854775808);
turso> update t set a = a - 1;
× Runtime error: datatype mismatch
```
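As a generic illustration of detecting the underflow instead of wrapping or panicking, checked arithmetic can be mapped to an error. This is a sketch only: `UpdateError` and `new_rowid` are hypothetical names, and the PR's actual code path may detect the condition differently.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum UpdateError {
    DatatypeMismatch,
}

/// Compute `rowid + delta`, reporting overflow or underflow as an error
/// rather than wrapping (release builds) or panicking (debug builds).
pub fn new_rowid(rowid: i64, delta: i64) -> Result<i64, UpdateError> {
    rowid.checked_add(delta).ok_or(UpdateError::DatatypeMismatch)
}
```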
Additionally, this PR introduces a second commit to fix integer/float
equivalence in numeric string comparisons. Although this change is not
related to the PR, it is necessary for the newly introduced unit tests
in `core/util` to pass, as existing tests are failing on `master`.
## Motivation and context
Fixes #3848.
Turso's current behavior:
```
turso> create table t(a integer primary key);
turso> insert into t values (-9223372036854775808);
turso> update t set a = a - 1; -- this can be repeated many times, it's ignored
turso> select * from t;
┌──────────────────────┐
│ a │
├──────────────────────┤
│ -9223372036854775808 │
└──────────────────────┘
```
Expected behavior:
```
sqlite> create table t(a integer primary key);
sqlite> insert into t values (-9223372036854775808);
sqlite> update t set a = a - 1;
Runtime error: datatype mismatch (20)
```
## Description of AI Usage
GPT-5.1 Codex Max helped identify the underflow path.
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #4251
- Remove OpCheckpointState enum (no longer needed)
- Remove op_checkpoint_inner function
- Simplify op_checkpoint to call checkpoint() directly
- Remove op_checkpoint_state field from ProgramState
- Fix MVCC path to properly return after completing (was using break)
The checkpoint() function now handles all the state machine logic internally,
so the VDBE op just needs to call it and handle the result.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
After flushing the WAL, the db may attempt an automatic checkpoint. Previously,
a checkpoint failure for any other reason than `Busy` rolled back the transaction;
this is incorrect because the WAL has already been written and synced, so the transaction
is actually committed and durable.
For this reason, do the following decoupling:
- Mark the transaction as committed after `SyncWal`
- Any errors beyond that point are wrapped as `LimboError::CheckpointFailed` which is
handled specially:
* Tx rollback is not attempted in `abort()` - only the WAL locks are cleaned up
and checkpoint state machines are reset
* In the simulator, the results of the query are shadowed into the sim environment
so that the sim correctly assumes that the transaction's effects were in fact
committed.
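The decoupling above can be sketched as a mapping from the failure point to the cleanup that is still allowed. The names `CommitError`, `AbortAction`, and `abort_action` are hypothetical and stand in for the real `LimboError` handling in `abort()`.

```rust
/// Hypothetical commit-path errors for this sketch (not Turso's `LimboError`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CommitError {
    /// Failed before the WAL was written and synced: nothing is durable yet.
    WalWriteFailed,
    /// Failed after `SyncWal`: the transaction is already durable.
    CheckpointFailed,
}

#[derive(Debug, PartialEq, Eq)]
pub enum AbortAction {
    /// Undo the transaction's effects.
    RollbackTransaction,
    /// Release WAL locks and reset checkpoint state, but keep the committed data.
    CleanupOnly,
}

/// Decide what cleanup is valid for a given failure.
pub fn abort_action(err: CommitError) -> AbortAction {
    match err {
        CommitError::WalWriteFailed => AbortAction::RollbackTransaction,
        CommitError::CheckpointFailed => AbortAction::CleanupOnly,
    }
}
```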
When the SchemaUpdated error occurs during statement execution, don't
roll back the transaction, but instead re-prepare the statement.
Spotted by Whopper.
Closes #4220
## Description
This PR adds hash matching for equivalent integer and real values in
hash joins. This is achieved by ensuring that integer/real equivalents
(including signed zeros) share the same hash in internal bloom filters
and hash tables.
```
turso> CREATE TABLE IF NOT EXISTS t1 (a INTEGER, b INTEGER);
CREATE TABLE IF NOT EXISTS t2 (a INTEGER, c REAL);
INSERT INTO t1 (a, b) VALUES (1, NULL), (2, 10);
INSERT INTO t2 (a, c) VALUES (1, 10.0), (3, NULL);
SELECT * FROM t1 LEFT JOIN t2 ON t1.b = t2.c;
┌───┬────┬───┬──────┐
│ a │ b │ a │ c │
├───┼────┼───┼──────┤
│ 1 │ │ │ │
├───┼────┼───┼──────┤
│ 2 │ 10 │ 1 │ 10.0 │
└───┴────┴───┴──────┘
```
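One way to get equal hashes for equal numerics is to normalize integral reals to integers before hashing. The sketch below illustrates that idea under stated assumptions: `Numeric` and `numeric_hash` are hypothetical names, and the standard library's `DefaultHasher` stands in for whatever hasher the bloom filters and hash tables actually use.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical numeric value type for this sketch.
#[derive(Debug, Clone, Copy)]
pub enum Numeric {
    Integer(i64),
    Real(f64),
}

/// Hash a numeric so that `10` and `10.0` (and `0.0` and `-0.0`) share a hash.
pub fn numeric_hash(v: Numeric) -> u64 {
    let mut h = DefaultHasher::new();
    match v {
        Numeric::Integer(i) => i.hash(&mut h),
        Numeric::Real(f) => {
            // A real holding an exact integer within i64 range hashes as that integer.
            let in_i64_range =
                f >= -9_223_372_036_854_775_808.0 && f < 9_223_372_036_854_775_808.0;
            if f.fract() == 0.0 && in_i64_range {
                (f as i64).hash(&mut h);
            } else {
                // Non-integral, out-of-range, NaN, and infinite values hash by bit pattern.
                f.to_bits().hash(&mut h);
            }
        }
    }
    h.finish()
}
```

Equal hashes only make the values land in the same bucket; the join still needs a numeric equality comparison to confirm the match.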
## Motivation and context
This change fixes the `LEFT JOIN` mismatch reported in #4147, where
joins on numerically equal `INTEGER` and `REAL` values failed in Turso
but succeeded in SQLite:
```
turso> CREATE TABLE IF NOT EXISTS t1 (a INTEGER, b INTEGER);
CREATE TABLE IF NOT EXISTS t2 (a INTEGER, c REAL);
INSERT INTO t1 (a, b) VALUES (1, NULL), (2, 10);
INSERT INTO t2 (a, c) VALUES (1, 10.0), (3, NULL);
SELECT * FROM t1 LEFT JOIN t2 ON t1.b = t2.c;
┌───┬────┬───┬───┐
│ a │ b │ a │ c │
├───┼────┼───┼───┤
│ 1 │ │ │ │
├───┼────┼───┼───┤
│ 2 │ 10 │ │ │
└───┴────┴───┴───┘
```
```
sqlite> CREATE TABLE IF NOT EXISTS t1 (a INTEGER, b INTEGER);
sqlite> CREATE TABLE IF NOT EXISTS t2 (a INTEGER, c REAL);
sqlite> INSERT INTO t1 (a, b) VALUES (1, NULL), (2, 10);
sqlite> INSERT INTO t2 (a, c) VALUES (1, 10.0), (3, NULL);
sqlite> SELECT * FROM t1 LEFT JOIN t2 ON t1.b = t2.c;
1|||
2|10|1|10.0
```
Fixes #4147.
## Description of AI Usage
This PR was developed with assistance from GPT-5.1 Codex Max. The AI
helped analyze the hash join–related codebase (including bloom filters
and hash table implementations), identify the root cause of the issue,
and assist in writing the tests.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #4226
## Beef
- Add `./scripts/run-sqlancer.sh` script to run
[SQLancer](https://github.com/sqlancer/sqlancer) using Turso's Java
bindings.
> SQLancer is a tool to automatically test Database Management Systems
(DBMSs) in order to find bugs in their implementation. That is, it finds
bugs in the code of the DBMS implementation, rather than in queries
written by the user. SQLancer has found hundreds of bugs in mature and
widely-known DBMSs.
- Fix some bugs that were already found by running it
## Reader's guide to the PR
- Commit by commit reviewing is probably best since the java bindings
changes, turso core bugfixes, and the sqlancer vibecode are all
separated into commits.
## AI Disclosure
Heavy Opus 4.5 vibecoding. I just started with `"This is Turso, the Rust
rewrite of SQlite. Let's investigate ways to run SQLancer against it"`,
and went from there.
I seriously have no idea if this is the least-effort way of doing it,
but it works, so I think that's a good enough start.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Reviewed-by: Pedro Muniz (@pedrocarlo)
Closes #4180
This PR (mostly) finishes implementing support for `ANALYZE`.
It also uses this newly available metadata to improve calculating the
join order.
### Example Queries:
Both the same query, different order:
<img width="757" height="928" alt="image" src="https://github.com/user-attachments/assets/82edd3bc-ef33-4df0-833d-92106bf4c065" />
Previously, tursodb would have changed the build table when the query
was written with `users` on the RHS. Now that we have the metadata
available, we are able to determine that `products` should _always_ be
the build table for inner equijoin/hash join.
=======================
### AI disclosure
A lot of the emission code in `core/translate/analyze.rs` was written by
codex.
EDIT: Opus 4.5 was monumental in the cost-based optimization work here.
It remains to be seen whether or not it succeeded XD
Closes #4141
On bootstrap, just store the header but do not flush it to disk. Only try to
flush it when we start an MVCC transaction. Also applied a fix in
`OpenDup`, where we should not wrap an ephemeral table with an MvCursor.
Reviewed-by: Mikaël Francoeur (@LeMikaelF)
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4151