Commit graph

2041 commits

Author SHA1 Message Date
Nuno Gonçalves
734ba9a1bf chore(format): cargo fmt 2025-12-21 17:15:11 +00:00
Nuno Gonçalves
1b386147c1 fix(core/translate): apply affinity conversion to hash join build and probe keys 2025-12-21 16:59:46 +00:00
Preston Thorpe
a10051dde4
Merge 'Fix RTRIM ignoring trailing tabs' from Krishna Vishal
`str::trim_end()` removes trailing tabs too. Replaced it with
`trim_end_matches(' ')`. Rust's `str::trim` functions are problematic here
because they strip every Unicode whitespace character, not just the ASCII space.
Behavior now:
```
turso> select 'x' || char(9) = 'x' collate rtrim;
┌─────────────────────────────────────┐
│ 'x' || char (9) = 'x' COLLATE rtrim │
├─────────────────────────────────────┤
│                                   0 │
└─────────────────────────────────────┘
```
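A minimal sketch of the difference, not the actual Turso code: `rtrim_spaces` and `rtrim_eq` are hypothetical helper names used only for illustration, while `trim_end` and `trim_end_matches` are the standard library calls named above.

```rust
/// Strip only trailing ASCII spaces, leaving tabs and other whitespace alone.
/// (Hypothetical helper; the name is not taken from the Turso codebase.)
fn rtrim_spaces(s: &str) -> &str {
    s.trim_end_matches(' ')
}

/// Equality under an RTRIM-style collation: trailing spaces are ignored,
/// everything else is compared byte for byte. (Hypothetical helper.)
fn rtrim_eq(a: &str, b: &str) -> bool {
    rtrim_spaces(a) == rtrim_spaces(b)
}

/// What the old behavior amounted to: `trim_end` strips any Unicode
/// whitespace, so a trailing tab compared equal to nothing.
fn old_rtrim_eq(a: &str, b: &str) -> bool {
    a.trim_end() == b.trim_end()
}
```

With these helpers, `rtrim_eq("x\t", "x")` is false while `old_rtrim_eq("x\t", "x")` is true, which matches the `0` result in the session above.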
Closes: #3480

Reviewed-by: Mikaël Francoeur (@LeMikaelF)

Closes #3891
2025-12-20 10:33:59 -05:00
Nuno Gonçalves
21268632eb fix(core): prevent ALTER COLUMN from resulting in tables with only generated columns 2025-12-19 21:52:36 +00:00
pedrocarlo
2bc976f3d9 add proper name checks for create table, view 2025-12-19 17:20:46 -03:00
Jussi Saurio
37a8897bc6
Merge 'triggers: don't rewrite qualified table names' from Pavan Nambi
## Description
closes https://github.com/tursodatabase/turso/issues/4142
## Motivation and context
Compatibility: we were wrongly rewriting table-qualified columns. Also
added trigger.test to all.test and updated a test to expect the correct values.
## AI Disclosure
None

Closes #4206
2025-12-18 09:26:46 +02:00
pedrocarlo
257dc5ad09 do not initiate a write transaction for journal mode + checkpoint before changing mode 2025-12-17 10:55:24 -03:00
pedrocarlo
323f1152d8 emit Checkpoint when setting new journal mode + adjust init code to correctly open the mv store 2025-12-17 10:55:24 -03:00
pedrocarlo
6f67b05885 add JournalMode to parse the requested journal mode and handle changes in modes
when updating `journal_mode` open a Write Transaction to flush pages to
database file
2025-12-17 10:49:25 -03:00
pedrocarlo
398c82fdf1 clippy 2025-12-16 23:11:31 -03:00
pedrocarlo
0be4a885f1 adjust fuzz tests to account for collation and sort order for asserting correctness 2025-12-16 23:09:00 -03:00
pedrocarlo
ec2f9bd472 remove refcell for access method arena 2025-12-16 21:24:36 -03:00
pedrocarlo
fefb999a60 adjust optimizer to have similar semantics to SQLite, and only remove
the index if sort is eliminated
2025-12-16 18:38:00 -03:00
pedrocarlo
5b8254fb51 add order target to constraints_from_where_clause + remove candidate index if order by column collation does not match index collation 2025-12-16 15:04:58 -03:00
Pavan-Nambi
ac7d3597ff
don't rewrite qualified table names 2025-12-16 16:45:40 +05:30
Pekka Enberg
df9c39cd1c core: Fix integrity_check pragma code generation
The translate_integrity_check function was missing a call to
add_pragma_result_column, causing num_columns() to return 0.
This made the Python bindings treat it as a non-row-returning
statement, finalizing it during execute() and leaving fetchone()
to return None instead of ("ok",).

Spotted by Antithesis.
2025-12-15 09:46:46 +02:00
Jussi Saurio
9dbbb2e358
Merge 'Add script to run SQLancer against turso + fix some bugs found by doing so' from Jussi Saurio
## Beef
- Add `./scripts/run-sqlancer.sh` script to run
[SQLancer](https://github.com/sqlancer/sqlancer) using Turso's Java
bindings.
> SQLancer is a tool to automatically test Database Management Systems
(DBMSs) in order to find bugs in their implementation. That is, it finds
bugs in the code of the DBMS implementation, rather than in queries
written by the user. SQLancer has found hundreds of bugs in mature and
widely-known DBMSs.
- Fix some bugs that were already found by running it
## Reader's guide to the PR
- Commit by commit reviewing is probably best since the java bindings
changes, turso core bugfixes, and the sqlancer vibecode are all
separated into commits.
## AI Disclosure
Heavy Opus 4.5 vibecoding. I just started with `"This is Turso, the Rust
rewrite of SQlite. Let's investigate ways to run SQLancer against it"`,
and went from there.
I seriously have no idea if this is the least-effort way of doing it,
but it works, so I think that's a good enough start.

Reviewed-by: Preston Thorpe <preston@turso.tech>
Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #4180
2025-12-11 23:38:26 +02:00
Jussi Saurio
82fbc3e0fa fix: INSERT OR IGNORE with NOT NULL constraint failure 2025-12-11 17:18:11 +02:00
Jussi Saurio
5b86f6db7e core: change some panics to errors 2025-12-11 17:18:11 +02:00
Jussi Saurio
faa1197e58 Add greedy join ordering for large queries (>12 tables)
Problem:

The existing DP-based join optimizer has O(2^n) complexity, which
causes large joins to basically not get past the planning phase.

Fix:

Add a greedy algorithm that runs in O(n²) time for >12 tables.

Details:

- Add compute_greedy_join_order() with hub score heuristic for
  selecting the starting table. Tables referenced by many other
  tables' constraints are preferred, enabling index lookups on
  subsequent joins. This is especially good for star schema
  queries.
- Add GREEDY_JOIN_THRESHOLD constant (12) for switchover point
- Add fuzz tests covering star schemas, chains, cliques up to 62
  tables, and LEFT JOIN ordering invariants (RHS of a left join
  cannot be reordered).
- Not all the tests necessarily assert that a query results in a
  good plan (apart from star schemas), but all tests do assert
  that we are _able_ to construct a plan (unlike before, where
  even 32-way joins would grind to a halt).
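The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the `compute_greedy_join_order()` from this commit: the hub score is simplified to "number of distinct tables sharing a join predicate", the cost model is replaced by a plain connectivity count, and LEFT JOIN ordering constraints are ignored. The function name `greedy_join_order` and the edge-list input are hypothetical.

```rust
use std::cmp::Reverse;

/// Greedy join ordering over `n` tables. `edges` lists pairs of table
/// indices that share a join predicate. Returns every table exactly once.
fn greedy_join_order(n: usize, edges: &[(usize, usize)]) -> Vec<usize> {
    if n == 0 {
        return Vec::new();
    }
    // Adjacency matrix of join predicates (out-of-range pairs are skipped).
    let mut adj = vec![vec![false; n]; n];
    for &(a, b) in edges {
        if a != b && a < n && b < n {
            adj[a][b] = true;
            adj[b][a] = true;
        }
    }
    // Simplified hub score: how many other tables reference this one.
    let hub: Vec<usize> = adj
        .iter()
        .map(|row| row.iter().filter(|&&x| x).count())
        .collect();

    let mut joined = vec![false; n];
    // links[t] = number of already-joined tables that t has a predicate with.
    let mut links = vec![0usize; n];
    let mut order = Vec::with_capacity(n);

    // Start from the best hub; ties go to the lowest table index.
    let mut next = (0..n)
        .max_by_key(|&t| (hub[t], Reverse(t)))
        .expect("n > 0");
    loop {
        joined[next] = true;
        order.push(next);
        for t in 0..n {
            if adj[next][t] {
                links[t] += 1;
            }
        }
        // Prefer the table most connected to what is already joined.
        // O(n) per step, so O(n^2) overall.
        match (0..n)
            .filter(|&t| !joined[t])
            .max_by_key(|&t| (links[t], hub[t], Reverse(t)))
        {
            Some(t) => next = t,
            None => break,
        }
    }
    order
}
```

For a star schema with the fact table at index 0, `greedy_join_order(4, &[(0, 1), (0, 2), (0, 3)])` starts at table 0 and then adds the dimension tables, so every later join has a predicate against an already-joined table.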

AI usage:

- Pretty much all of this was a conversation between me and Opus 4.5.
  I asked it to search the internet for practical solutions to the
  problem and it suggested a simple greedy search as a low-complexity
  solution and I thought it was a good idea for now.
2025-12-11 09:31:38 +02:00
Jussi Saurio
cd56cff745
Merge 'translate/optimizer: Finish implementing ANALYZE' from Preston Thorpe
This PR (mostly) finishes implementing support for `ANALYZE`
It also uses this newly available metadata to improve calculating the
join order.
### Example Queries:
Both the same query, different order:
<img width="757" height="928" alt="image" src="https://github.com/user-attachments/assets/82edd3bc-ef33-4df0-833d-92106bf4c065" />
Previously, tursodb would have changed the build table when the query
was written with `users` on the RHS. Now that we have the metadata
available, we are able to determine that `products` should _always_ be
the build table for inner equijoin/hash join.
=======================
### AI disclosure
A lot of the emission code in `core/translate/analyze.rs` was written by
codex.
EDIT: Opus 4.5 was monumental in the cost-based optimization work here.
Whether or not it succeeded remains to be seen XD

Closes #4141
2025-12-10 19:35:37 +02:00
Jussi Saurio
0d35366f5d
Merge 'Fix CTE scope propagation for compound SELECTs' from Martin Mauch
CTEs now work correctly when combined with UNION, UNION ALL, INTERSECT,
and EXCEPT.
**Before:**
```sql
WITH t AS (SELECT 1 as x) SELECT * FROM t UNION ALL SELECT 2 as x
-- Error: Parse error: no such table: t
```
**After:**
```sql
WITH t AS (SELECT 1 as x) SELECT * FROM t UNION ALL SELECT 2 as x
-- Works correctly, returns rows (1) and (2)
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4123
2025-12-10 19:04:03 +02:00
Nikita Sivukhin
df2e72e2c5 do not use index range search as access method if index collation differs from table column collation 2025-12-10 14:53:32 +04:00
Jussi Saurio
021aced898 use RowCountEstimate enum to distinguish between ANALYZE stats and fallback row counts 2025-12-10 12:42:38 +02:00
Jussi Saurio
eecf51b140 optimizer: remove penalization heuristics
these are not necessary as long as the cost model works well enough,
which it does after a few tweaks to selectivities from the commits
that precede this one.
2025-12-10 12:23:52 +02:00
Jussi Saurio
4a00c32f44 optimizer: add separate fallback selectivities for indexed/unindexed equalities
heuristic: indexed columns are likely to be more selective because users are
likely to create indexes on selective columns.
2025-12-10 12:22:43 +02:00
Jussi Saurio
9821d9a7ca optimizer: fix selectivity estimate on unique index
UNIQUE indexes will by definition return max 1 row, but we were
falling back to our SELECTIVITY_EQ_FALLBACK for them. This commit
fixes that.
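A rough sketch of the estimate this describes. The constant name `SELECTIVITY_EQ_FALLBACK` comes from the message above, but its value here and the function `estimate_eq_rows` are assumptions made for illustration, not the optimizer's actual code.

```rust
/// Fallback selectivity for an equality predicate when no statistics exist.
/// The value is a placeholder assumption, not the one used by the optimizer.
const SELECTIVITY_EQ_FALLBACK: f64 = 0.1;

/// Estimated rows matched by `col = ?` on a table with `table_rows` rows.
/// (Hypothetical helper.)
fn estimate_eq_rows(table_rows: f64, index_is_unique: bool) -> f64 {
    if index_is_unique {
        // A UNIQUE index can match at most one row, regardless of table size.
        table_rows.min(1.0)
    } else {
        table_rows * SELECTIVITY_EQ_FALLBACK
    }
}
```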
2025-12-10 12:21:31 +02:00
PThorpe92
0832637f6f
Fix tests by requiring usable constraints to mark rowid access method as covering in optimizer 2025-12-09 21:20:19 -05:00
PThorpe92
bb7b8aa935
Add/expand tcl tests for analyze 2025-12-09 20:02:07 -05:00
PThorpe92
9268bb6d39
Add scalar functions to properly support analyze behavior 2025-12-09 19:45:06 -05:00
PThorpe92
997792f6de
Improve query optimizer to work better with analyze stats 2025-12-09 15:11:50 -05:00
PThorpe92
298bb3caf7
Update stats refresh in core/lib 2025-12-09 14:46:12 -05:00
PThorpe92
f5b5449103
Support ANALYZE main as identical to plain ANALYZE 2025-12-09 14:46:12 -05:00
PThorpe92
8a69a238c9
Enable ANALYZE with no argument/full analyze 2025-12-09 14:46:12 -05:00
PThorpe92
8355217064
Use new schema.analyze_stats for join ordering cost analysis 2025-12-09 14:46:11 -05:00
PThorpe92
b002350a8e
work on finishing ANALYZE impl 2025-12-09 14:46:11 -05:00
Preston Thorpe
9e75e23eb6
Merge 'feat: adding check for unquoted literals in values()' from Rohith Suresh
Closes #3949
Closes #4018
```
turso> values(asdf);
  × Parse error: Unquoted identifier in VALUES clause: asdf
  
turso> select * from users;
┌────┬──────┐
│ id │ name │
├────┼──────┤
│  1 │ jack │
├────┼──────┤
│  2 │ jill │
└────┴──────┘
  turso> select id, (values(name)) as name_again from users;
┌────┬────────────┐
│ id │ name_again │
├────┼────────────┤
│  1 │ jack       │
├────┼────────────┤
│  2 │ jill       │
└────┴────────────┘
```

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4003
2025-12-09 10:15:11 -05:00
Martin Mauch
3dc8fed204 Fix CTE scope propagation for compound SELECTs 2025-12-09 13:09:55 +01:00
Jussi Saurio
2e8b771f6f
Merge 'Fix descending index scan returning rows when seek key is NULL' from Jussi Saurio
Closes #4066
Closes #4129
## Problem
Take e.g.
CREATE TABLE t(x); CREATE INDEX txdesc ON t(x desc); INSERT INTO t
values (1),(2),(3);
SELECT * FROM t WHERE x > NULL;
--
Our plan, like SQLite's, was to start iterating the descending index from
the beginning (Rewind) and stop once we hit a row where x is <= NULL,
using the `IdxGe` instruction (GE in descending indexes means LE).
However, `IdxGe` and other similar instructions use a sort comparison
where NULL is less than numbers/strings etc, so this would incorrectly
not jump.
## Fix
Fix: we need to emit an explicit NULL check after rewinding.
## Tests
Added TCL tests + improved `index_scan_compound_key_fuzz` to have NULL
seek keys sometimes.
## AI disclosure
I started debugging this with Claude Code thinking this is a much deeper
corruption issue, but Opus 4.5 noticed immediately that we are returning
rows from a `x > NULL` comparison which should never happen. Hence, the
fix was then fairly simple.

Closes #4132
2025-12-09 09:38:18 +02:00
Jussi Saurio
027ebe33fe Fix descending index scan returning rows when seek key is NULL
Take e.g.

CREATE TABLE t(x); CREATE INDEX txdesc ON t(x desc);
INSERT INTO t values (1),(2),(3);

SELECT * FROM t WHERE x > NULL;

--

Our plan, like SQLite's, was to start iterating the descending index
from the beginning (Rewind) and stop once we hit a row where x is
<= NULL, using the `IdxGe` instruction (GE in descending indexes
means LE).

However, `IdxGe` and other similar instructions use a sort comparison
where NULL is less than numbers/strings etc, so this would incorrectly
not jump.

Fix: we need to emit an explicit NULL check after rewinding.
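
The mismatch between the two comparison semantics can be sketched like this. It is a simplified model under assumptions, not the VDBE code: `Value`, `sort_cmp`, `sql_gt` and `scan_gt` are hypothetical names, and only integers and NULL are modeled.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, Copy, PartialEq)]
enum Value {
    Null,
    Int(i64),
}

/// Sort-order comparison, as used for index ordering: NULL sorts before
/// every integer, so it always yields a definite answer.
fn sort_cmp(a: Value, b: Value) -> Ordering {
    match (a, b) {
        (Value::Null, Value::Null) => Ordering::Equal,
        (Value::Null, _) => Ordering::Less,
        (_, Value::Null) => Ordering::Greater,
        (Value::Int(x), Value::Int(y)) => x.cmp(&y),
    }
}

/// SQL `a > b`: any NULL operand makes the result unknown (`None`),
/// which a WHERE clause treats as "row does not match".
fn sql_gt(a: Value, b: Value) -> Option<bool> {
    match (a, b) {
        (Value::Int(x), Value::Int(y)) => Some(x > y),
        _ => None,
    }
}

/// Scan for `x > key` with the explicit NULL check up front: a NULL seek
/// key can never match, so no rows are returned.
fn scan_gt(rows: &[i64], key: Value) -> Vec<i64> {
    if key == Value::Null {
        return Vec::new();
    }
    rows.iter()
        .copied()
        .filter(|&x| sql_gt(Value::Int(x), key) == Some(true))
        .collect()
}
```

Here `sort_cmp(Value::Int(1), Value::Null)` is `Greater`, so a stop condition based on sort order never fires for a NULL key, whereas `sql_gt(Value::Int(1), Value::Null)` is `None` and the row must be rejected.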
2025-12-08 13:19:58 +02:00
Jussi Saurio
826ca4d44d chore: remove experimental_indexes feature flags 2025-12-08 13:00:37 +02:00
Jussi Saurio
40898b8cf9
Merge 'Added dot product vector distance' from Tejas
Closes #3826
This PR adds dot product distance implementation.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4055
2025-12-08 06:24:39 +02:00
Jussi Saurio
33db726f37
Merge 'Run BEFORE and AFTER update triggers on upserts' from Mikaël Francoeur
After this fix, I ran the fuzz test for more than an hour with no
issues.
Closes https://github.com/tursodatabase/turso/issues/4075
## AI disclosure
Claude wrote the implementation and tests from just a copy/paste of the
Github issue.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4119
2025-12-08 06:21:28 +02:00
Mikaël Francoeur
dc3cd84a70
Prevent creating index on rowid pseudo-column
SQLite rejects `CREATE INDEX idx ON t(rowid)` with "no such column: rowid"
because rowid is a pseudo-column, not an actual column. Limbo was
incorrectly allowing this.

The fix removes the special exception for ROWID_STRS (rowid, _rowid_, oid)
in validate_index_expression(). Now these identifiers are only allowed
if they match an actual column name in the table (i.e., when shadowed).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 16:10:51 -05:00
Mikaël Francoeur
d53fc5a6e2
run BEFORE and AFTER update triggers on upserts 2025-12-06 15:20:24 -05:00
Tejas
06ecbbfc87
Merge branch 'tursodatabase:main' into dot-product-distance 2025-12-05 23:58:34 +05:30
Jussi Saurio
e0791406b5 Fix two bugs with compound selects
1. Previously compound select would not start a transaction if
   the rightmost subselect had no table references, e.g.
   SELECT * FROM t UNION VALUES(1)

2. Previously the column names for the query were taken from the
   rightmost subselect - instead, they should be taken from the
   leftmost subselect.
2025-12-05 17:14:58 +02:00
Nikita Sivukhin
17695438ac fix IN operator translation bug
- the following query was translated incorrectly before the fix:

turso> CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');

- before the emit_constant_insns optimization, the query plan was correct:

turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     12    0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       String8          0     4     0     alice          0   r[4]='alice'
6       Eq               3     4     9     Binary         0   if r[3]==r[4] goto 9
7       String8          0     5     0     bob            0   r[5]='bob'
8       Ne               3     5     11    Binary         0   if r[3]!=r[5] goto 11
9       Integer          1     6     0                    0   r[6]=1
10      AggStep          0     6     2     count          0   accum=r[2] step(r[6])
11    Next               0     4     0                    0
12    AggFinal           0     2     0     count          0   accum=r[2]
13    Copy               2     1     0                    0   r[1]=r[2]
14    ResultRow          1     1     0                    0   output=r[1]
15    Halt               0     0     0                    0
16    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
17    Goto               0     1     0                    0

- but the problem is that after emit_constant_insns the Eq jump target was rewritten incorrectly: it was bound to the 9th opcode (Integer 1) instead of to the next opcode after the translated IN block (in the final plan, note the jump to address 16 for the Eq instruction):

turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     9     0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       Eq               3     4     16    Binary         0   if r[3]==r[4] goto 16
6       Ne               3     5     8     Binary         0   if r[3]!=r[5] goto 8
7       AggStep          0     6     2     count          0   accum=r[2] step(r[6])
8     Next               0     4     0                    0
9     AggFinal           0     2     0     count          0   accum=r[2]
10    Copy               2     1     0                    0   r[1]=r[2]
11    ResultRow          1     1     0                    0   output=r[1]
12    Halt               0     0     0                    0
13    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
14    String8            0     4     0     alice          0   r[4]='alice'
15    String8            0     5     0     bob            0   r[5]='bob'
16    Integer            1     6     0                    0   r[6]=1
17    Goto               0     1     0                    0
2025-12-05 14:52:09 +04:00
PThorpe92
170ed26732
Fix NULL handling of hash join comparison and bug in cursor override logic 2025-12-04 17:21:46 -05:00
PThorpe92
d0f15d1537
Fix bloom filter impl to skip NULL values 2025-12-04 16:09:49 -05:00