Commit graph

1708 commits

Author SHA1 Message Date
Pekka Enberg
f598dfa13d
Merge 'core/mvcc: set_null_flag(false) when seek is called' from Pere Diaz Bou
## Description
BTreeCursor sets null flag to false once `seek` is called. This PR does
the same for MVCC
## Motivation and context
join.test failed with some cases due to this bug
## Description of AI Usage
I asked AI to find the issue, but I ended up showing the agent what it did
wrong and that it should be ashamed

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4296
2025-12-22 09:15:24 +02:00
Pekka Enberg
d3714f4120
Merge 'Mark triggers as experimental' from Jussi Saurio
Closes #4022

Closes #4318
2025-12-22 09:08:54 +02:00
Jussi Saurio
2bf7ec5136 mark triggers as experimental 2025-12-21 21:18:15 +02:00
Jussi Saurio
aead47f2a8 Return BusySnapshot instead of Busy for stale snapshot in begin_write_tx
BusySnapshot indicates the transaction's snapshot is permanently stale
and must be rolled back. Unlike Busy, retrying with busy_timeout will
never help - the caller must rollback and restart the transaction.
2025-12-21 18:35:03 +02:00
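The distinction this commit message draws can be sketched as a caller-side decision. The enum and function names below are hypothetical stand-ins, not Turso's actual API; only the Busy versus BusySnapshot semantics come from the message above.

```rust
/// Hypothetical stand-ins for the two error kinds described above.
#[derive(Debug, PartialEq)]
enum TxError {
    /// Another writer is in the way; waiting (busy_timeout) may help.
    Busy,
    /// The transaction's snapshot is permanently stale; waiting never helps.
    BusySnapshot,
}

/// What the caller should do next (illustrative only).
#[derive(Debug, PartialEq)]
enum NextStep {
    RetryWithTimeout,
    RollbackAndRestart,
}

fn next_step(err: &TxError) -> NextStep {
    match err {
        TxError::Busy => NextStep::RetryWithTimeout,
        TxError::BusySnapshot => NextStep::RollbackAndRestart,
    }
}
```

Keeping the two cases as separate variants lets a retry loop stop immediately on a stale snapshot instead of spinning until the timeout expires.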
Preston Thorpe
90e6ed0ba4
Merge 'Fix incorrect conversion from TEXT to INTEGER when text is a number followed by a trailing non-breaking space' from Krishna Vishal
This PR fixes incorrect conversion from TEXT to INTEGER when text is a
number followed by a trailing non-breaking space.
This happens because `str::trim()` trims non-breaking space and unicode
whitespace while SQLite only trims ASCII whitespace.
Closes: https://github.com/tursodatabase/turso/issues/3679

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3878
2025-12-20 10:33:15 -05:00
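A minimal sketch of the difference described in this PR, using only the Rust standard library. The helper name `trim_ascii_whitespace` is made up for illustration, and Rust's ASCII whitespace set (space, tab, LF, FF, CR) is assumed here to be close enough to SQLite's; the actual fix may use a different character set.

```rust
/// Trim only ASCII whitespace from both ends, leaving Unicode
/// whitespace such as U+00A0 (non-breaking space) in place.
fn trim_ascii_whitespace(s: &str) -> &str {
    s.trim_matches(|c: char| c.is_ascii_whitespace())
}
```

With this helper, `"123\u{a0}"` keeps its trailing non-breaking space and so no longer looks like a clean integer, whereas `str::trim()` would strip it and yield `"123"`.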
Pere Diaz Bou
fe94cfc175 core/mvcc: set_null_flag(false) when seek is called 2025-12-19 16:47:33 +01:00
Pere Diaz Bou
2dec8a6e00
Merge ' core/execute: use same code for generating rowid in mvcc as in btree' from Pere Diaz Bou
## Description
`op_new_rowid` already contains the logic for getting the last rowid
correctly. This PR takes advantage of it in MVCC too, with a few `lock`
guards in place so that rowids do not collide.
## Motivation and context
It is hard to maintain two ways of getting a new rowid so this tries to
fold mvcc with btree
## Description of AI Usage
None

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4228
2025-12-19 16:45:27 +01:00
Pere Diaz Bou
95cc293508 core/mvcc: Fix typos and use turso_assert
Use turso_assert to check cursor state, clean up lock handling,
and rename fields from last_rowid/intialized to max_rowid/initialized
2025-12-19 12:52:01 +01:00
Pekka Enberg
edd45ff7b8 Improve MVCC DX by dropping --experimental-mvcc flag
The DX is right now pretty terrible:

```
penberg@vonneumann turso % cargo run -- hello.db
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.15s
     Running `target/debug/tursodb hello.db`
Turso v0.4.0-pre.18
Enter ".help" for usage hints.
Did you know that Turso supports live materialized views? Type .manual materialized-views to learn more.
This software is in BETA, use caution with production data and ensure you have backups.
turso> PRAGMA journal_mode = 'experimental_mvcc';
  × Invalid argument supplied: MVCC is not enabled. Enable it with `--experimental-mvcc` flag in the CLI or by setting the MVCC option in `DatabaseOpts`

turso>
```

To add insult to injury, many SDKs don't even have a way to enable
MVCC via database options. Therefore, let's remove the flag altogether.
2025-12-19 12:59:42 +02:00
Pere Diaz Bou
1da6bfd313 core/execute: end_new_rowid on finish with random rowid 2025-12-19 10:43:22 +01:00
Pere Diaz Bou
77ab8c9085 core/mvcc: introduce RowidAllocator
RowidAllocator is a centralized, lock-protected rowid allocator that is
used to ask for a new rowid. The idea is to have a single atomic i64 that
we can increment when we get asked to allocate a new rowid.
2025-12-19 10:43:22 +01:00
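The "single atomic i64" idea can be sketched as follows. `SketchRowidAllocator` is a hypothetical name, and this version relies on the atomic alone; the real RowidAllocator is described as lock protected and presumably does more (for example, initializing from the table's current maximum).

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical allocator: hands out strictly increasing rowids.
struct SketchRowidAllocator {
    max_rowid: AtomicI64,
}

impl SketchRowidAllocator {
    /// Start from the largest rowid already in use.
    fn new(current_max: i64) -> Self {
        Self { max_rowid: AtomicI64::new(current_max) }
    }

    /// Atomically bump the counter and return the new rowid,
    /// or `None` if the counter is already at `i64::MAX`.
    fn allocate(&self) -> Option<i64> {
        self.max_rowid
            .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |cur| cur.checked_add(1))
            .ok()
            .map(|prev| prev + 1)
    }
}
```

Because the increment is a single atomic update, two concurrent callers can never receive the same rowid, which is the collision the commit is guarding against.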
Pere Diaz Bou
ad88e56e86 core/execute: use same code for generating rowid in mvcc 2025-12-19 10:43:22 +01:00
Krishna Vishal
173baa7e0e Add tests 2025-12-19 08:53:58 +05:30
Krishna Vishal
8300f2dae7 Trim only ASCII whitespace. Rust's .trim() removes Unicode
whitespace too, while SQLite only removes ASCII whitespace.

Closes: https://github.com/tursodatabase/turso/issues/3679
2025-12-19 08:53:57 +05:30
pedrocarlo
da33f421b3 add readonly checks to ensure we do not change the header 2025-12-18 10:54:58 -03:00
Jussi Saurio
caa71ea39c
Merge 'implement state machine for op_journal_mode' from Pedro Muniz
## Description
Remove sync IO hacks for `op_journal_mode`
Close #4268
## Motivation and context
Remove sync io hacks so it is friendlier for WASM
## Description of AI Usage
AI did the bulk of the refactoring; I made some adjustments and
trimmed down the implementation.
**Prompt**:
```
if look at @core/storage/journal_mode.rs and `op_journal_mode` in `execute.rs` you will see that we have some blocking io operations with
`pager.io.block` and `program.connection.checkpoint` that also blocks. I want you refactor the code to use state machines similar in nature to how we do it
 in many functions in `execute.rs`
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4279
2025-12-18 10:12:12 +02:00
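The general shape of replacing blocking IO with a resumable state machine can be sketched like this. Every name here (`Step`, `JournalModeState`, `JournalModeMachine`) is hypothetical, and the IO is simulated with a poll counter; the real `op_journal_mode` states and IO types are not shown in this log.

```rust
/// Hypothetical step result: finished, or yielding while IO is pending.
#[derive(Debug, PartialEq)]
enum Step {
    Done,
    WaitingOnIo,
}

#[derive(Debug, PartialEq, Clone, Copy)]
enum JournalModeState {
    Start,
    Checkpointing,
    WritingHeader,
    Finished,
}

struct JournalModeMachine {
    state: JournalModeState,
    /// Simulated IO: number of polls until the submitted IO completes.
    pending_polls: u32,
}

impl JournalModeMachine {
    fn new() -> Self {
        Self { state: JournalModeState::Start, pending_polls: 0 }
    }

    /// Advance as far as possible, returning instead of blocking
    /// whenever IO is still outstanding.
    fn step(&mut self) -> Step {
        loop {
            if self.pending_polls > 0 {
                self.pending_polls -= 1;
                return Step::WaitingOnIo;
            }
            self.state = match self.state {
                // Submit the checkpoint, then wait for it.
                JournalModeState::Start => {
                    self.pending_polls = 1;
                    JournalModeState::Checkpointing
                }
                // Checkpoint finished: submit the header write.
                JournalModeState::Checkpointing => {
                    self.pending_polls = 1;
                    JournalModeState::WritingHeader
                }
                JournalModeState::WritingHeader => JournalModeState::Finished,
                JournalModeState::Finished => return Step::Done,
            };
        }
    }
}
```

The caller keeps invoking `step()` until it reports `Done`; since the machine never blocks, the same code can run on targets such as WASM where synchronous waiting is not available.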
Jussi Saurio
00d266665b
Merge 'fix coroutine panic: replace ended_coroutine Bitfield with vec' from Jussi Saurio
## Description
Closes #4146
## Motivation and context
panics are bad
## AI Disclosure
none used

Reviewed-by: Pedro Muniz (@pedrocarlo)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #4177
2025-12-18 09:20:05 +02:00
pedrocarlo
81ef597873 implement state machine for op_journal_mode 2025-12-18 01:23:24 -03:00
Jussi Saurio
eb59bc6ae6 Merge 'Enable MVCC with PRAGMA journal_mode' from Pedro Muniz
Closes #3536
# Description
This PR implements **dynamic journal mode switching** via `PRAGMA
journal_mode`, allowing users to switch between WAL and MVCC modes at
runtime.
### Key Changes
**Core Feature: Journal Mode Switching**
- Added new `JournalMode` module (`core/storage/journal_mode.rs`) to
parse and handle journal mode transitions
- Modified `op_journal_mode` to correctly parse journal modes and update
the database header
- Emit checkpoint when setting a new journal mode to ensure data
consistency
- Added MVCC checkpoint support to `Connection::checkpoint`
**Database Initialization Improvements**
- Read DB header on `Database::open` and simplified `init_pager`
- Made `Version` an enum for better comparison semantics
- Automatically convert legacy SQLite databases to WAL mode
- Ensure DB header is flushed to disk when header changes during open
- Clear page cache after header validation
**Bug Fixes**
- Fixed dirty page invalidation in pager when clearing dirty pages in
page cache
- Fixed `is_none_or` check for row version existence in DB file (handles
MvStore initialization and empty database cases)
- Added `btree_resident` field in `RowVersion` to track if
insert/deletion originated from a btree
**Testing**
- Added fuzz tests for `journal_mode` transitions with Database
operations in between
- Added integration tests for testing switching from the different modes
while checking the header version is correct
- Added some specific regression tests for delete operations lost on
mode switch
- Fixed `index_scan_compound_key_fuzz` to use separate databases for
Turso and SQLite in MVCC mode. Also had to decrease number of rows for
MVCC test, as insert was very slow.
# TODO's
- Remove sync hacks from `op_journal_mode`
- Expand fuzzer with different queries
- Add to Simulator
- Special handling for read only databases and not allow any header
changes
# Motivation and context
Make it easier for our users to test MVCC and to switch to and from it.
# AI Disclosure
Used AI to catch and fix bugs in MVCC, further my understanding with
MVCC, write tests in `tests` folder, most of the PR summary, and the
docs in the `docs/manual.md` file

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4074
2025-12-17 21:03:56 +02:00
Jussi Saurio
41c3dcd2ed fix: update schema if DDL commit succeeded, checkpoint failed
This is already done in abort() but if the checkpoint error is
returned directly from commit_tx() instead of in an IO completion,
then the Tx state will be cleared and the DDL change won't be applied.
So, do it explicitly in step_end_write_txn() too.

Fixes this failing seed:

cargo run -- --memory-io --seed 12902559987470385066 --disable-heuristic-shrinking
2025-12-17 17:16:43 +02:00
pedrocarlo
257dc5ad09 do not initiate a write transaction for journal mode + checkpoint before changing mode 2025-12-17 10:55:24 -03:00
pedrocarlo
323f1152d8 emit Checkpoint when setting new journal mode + adjust init code to correctly open the mv store 2025-12-17 10:55:24 -03:00
pedrocarlo
6541d2997f make Version an enum for better comparison 2025-12-17 10:49:25 -03:00
pedrocarlo
00cbae8412 modify op_journal_mode to parse the correct journal mode and modify the header 2025-12-17 10:49:25 -03:00
Pere Diaz Bou
77841042d0
Merge 'Consider Order by expressions collation when deciding candidate index for iteration' from Pedro Muniz
## Description
Does solve #4154, but I don't want to close it with this PR, because it
does not solve the Affinity issue.
We can only use an index to iterate over if the column collation in the
order by clause matches the index collation
## Motivation and context
Fix a bug in the optimizer
## Description of AI Usage
Used AI to write tests, fuzzers, and help me understand the optimizer
code.
Test prompt:
<details>
can you write tests in tcl that test that the correct collation sequence
is properly maintained.
```
CREATE TABLE "t1" ("c1" TEXT COLLATE RTRIM);
INSERT INTO "t1" VALUES (' ');
CREATE INDEX "i1" ON "t1" ("c1" COLLATE RTRIM DESC);
INSERT INTO "t1" VALUES (1025.1655084065987);
SELECT "c1", typeof(c1) FROM "t1" ORDER BY "c1" COLLATE BINARY DESC, rowid ASC; 
```
this is an example of a query that returned incorrect results because of
this
</details>

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4248
2025-12-17 14:26:25 +01:00
Pere Diaz Bou
3b470252ae
Merge 'Checkpoint cleanup' from Jussi Saurio
## Beef
Centralize all checkpointing into a single state machine.
- Get rid of `wal_checkpoint_start`, `wal_checkpoint_finish`,
`wal_checkpoint`
- Rename `wal_checkpoint` to `blocking_checkpoint`
- Handle truncation and fsyncing in the single checkpoint state machine
and remove separate fsyncing steps from `commit_dirty_pages_inner` -
essentially we were fsyncing the DB file twice
## Review guide
Commit by commit is useful. I generated and cleaned up a larger single
commit with Opus first and then asked it to restructure it as a series
of atomic commits that sum up to an identical diff. The last commit is a
manual fix.

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #4255
2025-12-17 13:23:27 +01:00
Pere Diaz Bou
f397481039
Merge 'use cmath from system libraries only in tests in order to be more portable' from Nikita Sivukhin
- not all systems have the cmath functions which we import from system
libraries
- let's use the external implementation only in tests in order to eliminate
precision errors in the differential tests
- https://discord.com/channels/933071162680958986/933071163184283651/1450476358005293147

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #4246
2025-12-17 13:22:57 +01:00
Pere Diaz Bou
8b0ec66fe6
Merge 'fix(core/util): reject integer primary key underflow' from Nuno Gonçalves
## Description
This PR fixes an underflow issue when updating integer primary keys,
aligning with SQLite's `datatype mismatch` behavior. Previously, updates
would silently reuse the same id in release mode and panic in debug
mode. This PR also changes the `datatype mismatch` errors from parse
errors to runtime errors to match SQLite behavior.
```
turso> create table t(a integer primary key);
turso> insert into t values (-9223372036854775808);
turso> update t set a = a - 1;
  × Runtime error: datatype mismatch
```
Additionally, this PR introduces a second commit to fix integer/float
equivalence in numeric string comparisons. Although this change is not
related to the PR, it is necessary for the newly introduced unit tests
in `core/util` to pass, as existing tests are failing on `master`.
## Motivation and context
Fixes #3848.
Turso's current behavior:
```
turso> create table t(a integer primary key);
turso> insert into t values (-9223372036854775808);
turso> update t set a = a - 1; -- this can be repeated many times, it's ignored
turso> select * from t;
┌──────────────────────┐
│ a                    │
├──────────────────────┤
│ -9223372036854775808 │
└──────────────────────┘
```
Expected behavior:
```
sqlite> create table t(a integer primary key);
sqlite> insert into t values (-9223372036854775808);
sqlite> update t set a = a - 1;
Runtime error: datatype mismatch (20)
```
## Description of AI Usage
GPT-5.1 Codex Max helped identify the underflow path.

Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>

Closes #4251
2025-12-17 13:22:16 +01:00
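The rejected case can be illustrated with checked arithmetic. The function name `decrement_rowid` and the plain string error are assumptions made for this sketch; the PR's actual code path and error type are not shown in this log.

```rust
/// Hypothetical helper: compute `a - 1` for an integer primary key,
/// reporting a runtime error instead of wrapping or panicking.
fn decrement_rowid(a: i64) -> Result<i64, &'static str> {
    a.checked_sub(1).ok_or("datatype mismatch")
}
```

For `-9223372036854775808` (that is, `i64::MIN`), `checked_sub(1)` returns `None`, so the update fails with an error, whereas a wrapping subtraction would silently produce `i64::MAX`.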
Jussi Saurio
cf145e5ec5 Remove unnecessary loop from op_checkpoint 2025-12-17 12:31:14 +02:00
Nikita Sivukhin
5f7c432c46 Merge branch 'main' into use-cmath-only-in-tests 2025-12-17 14:31:11 +04:00
Jussi Saurio
77e70f295d refactor(vdbe): simplify op_checkpoint to use new checkpoint() API
- Remove OpCheckpointState enum (no longer needed)
- Remove op_checkpoint_inner function
- Simplify op_checkpoint to call checkpoint() directly
- Remove op_checkpoint_state field from ProgramState
- Fix MVCC path to properly return after completing (was using break)

The checkpoint() function now handles all the state machine logic internally,
so the VDBE op just needs to call it and handle the result.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 12:29:19 +02:00
Nikita Sivukhin
0d23d47d6a use same tempfile in sorter and hash join and use in-memory IO for wasm 2025-12-17 12:18:35 +04:00
pedrocarlo
398c82fdf1 clippy 2025-12-16 23:11:31 -03:00
Nuno Gonçalves
def244e380 fix(core/util): reject integer primary key underflow 2025-12-16 23:49:08 +00:00
pedrocarlo
feb30ff499 add explain description to open read 2025-12-16 17:31:41 -03:00
Jussi Saurio
4a57710640
fix/core: decouple autocheckpoint result from transaction durability
After flushing the WAL, the db may attempt an automatic checkpoint. Previously,
a checkpoint failure for any other reason than `Busy` rolled back the transaction;
this is incorrect because the WAL has already been written and synced, so the transaction
is actually committed and durable.

For this reason, do the following decoupling:

- Mark the transaction as committed after `SyncWal`
- Any errors beyond that point are wrapped as `LimboError::CheckpointFailed` which is
  handled specially:
    * Tx rollback is not attempted in `abort()` - only the WAL locks are cleaned up
      and checkpoint state machines are reset
    * In the simulator, the results of the query are shadowed into the sim environment
      so that the sim correctly assumes that the transaction's effects were in fact
      committed.
2025-12-16 10:42:48 -05:00
Nikita Sivukhin
29cb069e74 fix clippy 2025-12-16 17:34:53 +04:00
Nikita Sivukhin
a494bf25ce use cmath from system libraries only in tests in order to be more portable
- 1450476358
2025-12-16 17:31:00 +04:00
Pekka Enberg
0429f7fe0e
Merge 'core: Don't rollback transaction when schema updated' from Pekka Enberg
When the SchemaUpdated error occurs during statement execution, don't
roll back the transaction, but instead re-prepare the statement.
Spotted by Whopper.

Closes #4220
2025-12-16 10:12:01 +02:00
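A rough sketch of "re-prepare instead of rolling back", assuming a hypothetical `ExecError` enum and caller-supplied `prepare`/`execute` closures. None of these names are Turso's API, and the retry cap is an addition of this sketch to keep the loop bounded.

```rust
#[derive(Debug, PartialEq)]
enum ExecError {
    SchemaUpdated,
    Other,
}

/// Run a statement; on `SchemaUpdated`, prepare it again and retry
/// rather than treating the error as fatal to the transaction.
fn run_with_reprepare<S, T>(
    mut prepare: impl FnMut() -> S,
    mut execute: impl FnMut(&mut S) -> Result<T, ExecError>,
    max_reprepares: usize,
) -> Result<T, ExecError> {
    let mut stmt = prepare();
    let mut reprepares = 0;
    loop {
        match execute(&mut stmt) {
            Err(ExecError::SchemaUpdated) if reprepares < max_reprepares => {
                reprepares += 1;
                stmt = prepare();
            }
            // Success, any other error, or too many re-prepares.
            other => return other,
        }
    }
}
```

Only the statement is rebuilt; the surrounding transaction is left untouched, which matches the behavior the commit message describes.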
Preston Thorpe
1bcb14f9c5
Merge 'feat(hash-join): add hash matching for equivalent integer and real values' from Nuno Gonçalves
## Description
This PR adds hash matching for equivalent integer and real values in
hash joins. This is achieved by ensuring that integer/real equivalents
(including signed zeros) share the same hash in internal bloom filters
and hash tables.
```
turso> CREATE TABLE IF NOT EXISTS t1 (a INTEGER, b INTEGER);
CREATE TABLE IF NOT EXISTS t2 (a INTEGER, c REAL);
INSERT INTO t1 (a, b) VALUES (1, NULL), (2, 10);
INSERT INTO t2 (a, c) VALUES (1, 10.0), (3, NULL);
SELECT * FROM t1 LEFT JOIN t2 ON t1.b = t2.c;
┌───┬────┬───┬──────┐
│ a │ b  │ a │ c    │
├───┼────┼───┼──────┤
│ 1 │    │   │      │
├───┼────┼───┼──────┤
│ 2 │ 10 │ 1 │ 10.0 │
└───┴────┴───┴──────┘
```
## Motivation and context
This change fixes the `LEFT JOIN` mismatch reported in #4147, where
joins on numerically equal `INTEGER` and `REAL` values failed in Turso
but succeeded in SQLite:
```
turso> CREATE TABLE IF NOT EXISTS t1 (a INTEGER, b INTEGER);
CREATE TABLE IF NOT EXISTS t2 (a INTEGER, c REAL);
INSERT INTO t1 (a, b) VALUES (1, NULL), (2, 10);
INSERT INTO t2 (a, c) VALUES (1, 10.0), (3, NULL);
SELECT * FROM t1 LEFT JOIN t2 ON t1.b = t2.c;
┌───┬────┬───┬───┐
│ a │ b  │ a │ c │
├───┼────┼───┼───┤
│ 1 │    │   │   │
├───┼────┼───┼───┤
│ 2 │ 10 │   │   │
└───┴────┴───┴───┘
```
```
sqlite> CREATE TABLE IF NOT EXISTS t1 (a INTEGER, b INTEGER);
sqlite> CREATE TABLE IF NOT EXISTS t2 (a INTEGER, c REAL);
sqlite> INSERT INTO t1 (a, b) VALUES (1, NULL), (2, 10);
sqlite> INSERT INTO t2 (a, c) VALUES (1, 10.0), (3, NULL);
sqlite> SELECT * FROM t1 LEFT JOIN t2 ON t1.b = t2.c;
1|||
2|10|1|10.0
```
Fixes #4147.
## Description of AI Usage
This PR was developed with assistance from GPT-5.1 Codex Max. The AI
helped analyze the hash join–related codebase (including bloom filters
and hash table implementations), identify the root cause of the issue,
and assist in writing the tests.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4226
2025-12-15 20:58:32 -05:00
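One way to make integer/real equivalents share a hash is to normalize integral floats to integers before hashing. The sketch below assumes a hypothetical `Num` key type and uses the standard library's `DefaultHasher`; it illustrates the idea only and is not the PR's implementation.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical numeric join key; not Turso's value type.
#[derive(Debug, Clone, Copy)]
enum Num {
    Int(i64),
    Real(f64),
}

/// Hash a key so that numerically equal INTEGER and REAL values collide.
fn join_key_hash(v: Num) -> u64 {
    let mut h = DefaultHasher::new();
    match v {
        Num::Int(i) => {
            0u8.hash(&mut h);
            i.hash(&mut h);
        }
        // An integral float inside the i64 range hashes as that integer.
        // -0.0 lands here too and becomes 0, so signed zeros agree.
        Num::Real(f)
            if f.fract() == 0.0
                && f >= -9223372036854775808.0
                && f < 9223372036854775808.0 =>
        {
            0u8.hash(&mut h);
            (f as i64).hash(&mut h);
        }
        // Everything else (fractions, NaN, infinities) hashes by bit pattern.
        Num::Real(f) => {
            1u8.hash(&mut h);
            f.to_bits().hash(&mut h);
        }
    }
    h.finish()
}
```

With this normalization, `10` and `10.0` fall into the same bucket, so the `t1.b = t2.c` probe in the example above can find its match; the equality check after the hash lookup still has to compare the values numerically.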
Nuno Gonçalves
c58c710f17 feat(hash-join): add hash matching for equivalent integer and real values 2025-12-15 15:25:50 +00:00
Pere Diaz Bou
f69cee9dd9 core/mvcc/cursor: return previous max id 2025-12-15 16:15:36 +01:00
Pekka Enberg
98308415b4 core: Don't rollback transaction when schema updated
When the SchemaUpdated error occurs during statement execution, don't
roll back the transaction, but instead re-prepare the statement.

Spotted by Whopper.
2025-12-15 13:49:21 +02:00
Jussi Saurio
9dbbb2e358
Merge 'Add script to run SQLancer against turso + fix some bugs found by doing so' from Jussi Saurio
## Beef
- Add `./scripts/run-sqlancer.sh` script to run
[SQLancer](https://github.com/sqlancer/sqlancer) using Turso's Java
bindings.
> SQLancer is a tool to automatically test Database Management Systems
(DBMSs) in order to find bugs in their implementation. That is, it finds
bugs in the code of the DBMS implementation, rather than in queries
written by the user. SQLancer has found hundreds of bugs in mature and
widely-known DBMSs.
- Fix some bugs that were already found by running it
## Reader's guide to the PR
- Commit by commit reviewing is probably best since the java bindings
changes, turso core bugfixes, and the sqlancer vibecode are all
separated into commits.
## AI Disclosure
Heavy Opus 4.5 vibecoding. I just started with `"This is Turso, the Rust
rewrite of SQlite. Let's investigate ways to run SQLancer against it"`,
and went from there.
I seriously have no idea if this is the least-effort way of doing it,
but it works, so I think that's a good enough start.

Reviewed-by: Preston Thorpe <preston@turso.tech>
Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #4180
2025-12-11 23:38:26 +02:00
Jussi Saurio
6fe64c73ca fix coroutine panic: replace ended_coroutine Bitfield with vec 2025-12-11 23:37:19 +02:00
pedrocarlo
471231c09f Get mutable reference to table in Schema so we can modify it with
`Arc::make_mut`
2025-12-11 13:19:34 -03:00
Jussi Saurio
8134ddd961 fix instr function when pattern is the empty string 2025-12-11 17:18:11 +02:00
Jussi Saurio
5b86f6db7e core: change some panics to errors 2025-12-11 17:18:11 +02:00
Jussi Saurio
cd56cff745
Merge 'translate/optimizer: Finish implementing ANALYZE' from Preston Thorpe
This PR (mostly) finishes implementing support for `ANALYZE`
It also uses this newly available metadata to improve calculating the
join order.
### Example Queries:
Both the same query, different order:
<img width="757" height="928" alt="image" src="https://github.com/user-attachments/assets/82edd3bc-ef33-4df0-833d-92106bf4c065" />
Previously, tursodb would have changed the build table when the query
was written with `users` on the RHS. Now that we have the metadata
available, we are able to determine that `products` should _always_ be
the build table for inner equijoin/hash join.
=======================
### AI disclosure
A lot of the emission code in `core/translate/analyze.rs` was written by
codex.
EDIT: Opus 4.5 was monumental in the cost-based optimization work here.
It remains to be seen whether or not it succeeded XD

Closes #4141
2025-12-10 19:35:37 +02:00
Jussi Saurio
64dba96c60
Merge 'initialize global header on bootstrap' from Pedro Muniz
On bootstrap, just store the header but do not flush it to disk. Only try to
flush it when we start an MVCC transaction. Also applied a fix in
`OpenDup`, where we should not wrap an ephemeral table with an MvCursor

Reviewed-by: Mikaël Francoeur (@LeMikaelF)
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4151
2025-12-10 19:04:23 +02:00