## Description
The PR title. `exec_rows` also does validation of outputs automatically
which is good practice for testing
<!--
Please include a summary of the changes and the related issue.
-->
## Motivation and context
Better typing and don't have to constantly match on `turso_core::Value`
<!--
Please include relevant motivation and context.
Link relevant issues here.
-->
## AI Disclosure
Ai did most of the migration
<!--
Please disclose if any LLM's were used in the creation of this PR and to
what extent,
to help maintainers properly review.
-->
Closes#4192
## Description
Closes#4146
## Motivation and context
panics are bad
## AI Disclosure
none used
Reviewed-by: Pedro Muniz (@pedrocarlo)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#4177
## Description
Does solve #4154, but I don't want to close it with this PR, because it
does not solve the Affinity issue.
We can only use an index to iterate over if the column collation in the
order by clause matches the index collation
<!--
Please include a summary of the changes and the related issue.
-->
## Motivation and context
Fix a bug in the optimizer
<!--
Please include relevant motivation and context.
Link relevant issues here.
-->
## Description of AI Usage
Used AI to write tests, fuzzers, and help me understand the optimizer
code.
Test prompt:
<details>
can you write tests in tcl that test that the correct collation sequence
is properly maintained.
```
CREATE TABLE "t1" ("c1" TEXT COLLATE RTRIM);
INSERT INTO "t1" VALUES (' ');
CREATE INDEX "i1" ON "t1" ("c1" COLLATE RTRIM DESC);
INSERT INTO "t1" VALUES (1025.1655084065987);
SELECT "c1", typeof(c1) FROM "t1" ORDER BY "c1" COLLATE BINARY DESC, rowid ASC;
```
this is an example of a query that returned incorrect results because of
this
</details>
<!--
Please disclose how AI was used to help create this PR. For example, you
can share prompts,
specific tools, or ways of working that you took advantage of. You can
also share whether the
creation of the PR was mainly driven by AI, or whether it was used for
assistance.
This is a good way of sharing knowledge to other contributors about how
we can work more efficiently with
AI tools. Note that the use of AI is encouraged, but the committer is
still fully responsible for understanding
and reviewing the output.
-->
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4248
When the SchemaUpdated error occurs during statement execution, don't
roll back the transaction, but instead re-prepare the statement.
Spotted by Whopper.
Problem:
The existing DP-based join optimizer has O(2^n) complexity, which
causes large joins to basically not get past the planning phase.
Fix:
Add a greedy algorithm that runs in O(n²) time for >12 tables.
Details:
- Add compute_greedy_join_order() with hub score heuristic for
selecting the starting table. Tables referenced by many other
tables' constraints are preferred, enabling index lookups on
subsequent joins. This is especially good for star schema
queries.
- Add GREEDY_JOIN_THRESHOLD constant (12) for switchover point
- Add fuzz tests covering star schemas, chains, cliques up to 62
tables, and LEFT JOIN ordering invariants (RHS of a left join
cannot be reordered).
- Not all the tests necessarily assert that a query results in a
good plan (apart from star schemas), but all tests do assert
that we are _able_ to construct a plan (unlike before, where
even 32-way joins would grind to a halt).
AI usage:
- Pretty much all of this was a conversation between me and Opus 4.5.
I asked it to search the internet for practical solutions to the
problem and it suggested a simple greedy search as a low-complexity
solution and I thought it was a good idea for now.
On bootstrap just store the header but not flush it to disk. Only try to
flush it when we start an MVCC transaction. Also applied fix in
`OpenDup` where we should not wrap an ephemeral table with an MvCursor
Reviewed-by: Mikaël Francoeur (@LeMikaelF)
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4151
CTEs now work correctly when combined with UNION, UNION ALL, INTERSECT,
and EXCEPT.
**Before:**
```sql
WITH t AS (SELECT 1 as x) SELECT * FROM t UNION ALL SELECT 2 as x
-- Error: Parse error: no such table: t
```
**After:**
```sql
WITH t AS (SELECT 1 as x) SELECT * FROM t UNION ALL SELECT 2 as x
-- Works correctly, returns rows (1) and (2)
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4123
## Closes
- Closes#4017
- Addresses #4043; this now fails with `Page cache is full` with 100k
pages, which is a separate non-corruption issue. Modifying max page
cache size to be 10 million pages makes it not finish at all. We should
modify the issue after this is merged to reflect what the new problem
is. The queries in the issue (#4043) create a WAL that is at least 1.7
GB in size
## Background
We have an optimization in the btree where if:
- We want to reach the rightmost leaf page, and
- We know the rightmost page and are already on it
Then we can skip a seek.
## Problem
The problem is this optimization should NEVER be used in cases where we
cannot be sure that the btree wasn't modified from under us e.g. by a
trigger subprogram.
## Fix
Hence, disable it when we are executing a parent program that has
triggers which will fire.
## AI Disclosure
No AI was used for this PR.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#4135
Closes#4066Closes#4129
## Problem
Take e.g.
CREATE TABLE t(x); CREATE INDEX txdesc ON t(x desc); INSERT INTO t
values (1),(2),(3);
SELECT * FROM t WHERE x > NULL;
--
Our plan, like Sqlite, was to start iterating the descending index from
the beginning (Rewind) and stop once we hit a row where x is <= than
NULL using `IdxGe` instruction (GE in descending indexes means LE).
However, `IdxGe` and other similar instructions use a sort comparison
where NULL is less than numbers/strings etc, so this would incorrectly
not jump.
## Fix
Fix: we need to emit an explicit NULL check after rewinding.
## Tests
Added TCL tests + improved `index_scan_compound_key_fuzz` to have NULL
seek keys sometimes.
## AI disclosure
I started debugging this with Claude Code thinking this is a much deeper
corruption issue, but Opus 4.5 noticed immediately that we are returning
rows from a `x > NULL` comparison which should never happen. Hence, the
fix was then fairly simple.
Closes#4132
Take e.g.
CREATE TABLE t(x); CREATE INDEX txdesc ON t(x desc);
INSERT INTO t values (1),(2),(3);
SELECT * FROM t WHERE x > NULL;
--
Our plan, like Sqlite, was to start iterating the descending index
from the beginning (Rewind) and stop once we hit a row where x is
<= than NULL using `IdxGe` instruction (GE in descending indexes
means LE).
However, `IdxGe` and other similar instructions use a sort comparison
where NULL is less than numbers/strings etc, so this would incorrectly
not jump.
Fix: we need to emit an explicit NULL check after rewinding.
Right now turso can panic with various asserts if 2 or more write
statements will be executed over single connection concurrently:
```
thread 'query_processing::test_write_path::api_misuse' panicked at core/storage/pager.rs:776:9:
subjournal offset should be 0
```
This PR adds explicit guard for subjournal access which will return
`Busy` for the operation internally and lead to wait condition for the
statement until subjournal ownership will be released and can be re-
acquired again.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4110
needed for #4063 to merge, currently passing on main but just want to
lower the already huge diff
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4103
- this is important for IN operation translation bug because in case of COUNT(*) there is constant assignment instruction right after last instruction translated from IN condition
for example, upon opening an existing database, all the rows are in
the btree, so if we seek only from MV store, we won't find anything.
ergo: we must look from both the mv store and the btree. if we are
iterating forwards, the smallest of the two results is where we land,
and vice versa for backwards iteration.
initially this implementation used blocking IO but was refactored to
use state machines after the rest of the Cursor methods in the MVCC cursor
module were refactored to do that too.
---
this PR was initially almost entirely written using Claude Code + Opus 4.5,
but heavily manually cleaned up as the AI made the state machine refactor
far too complicated.