Jussi Saurio
c7672b952b
Use Cow for Value::Blob to prevent copies in op_column
2025-12-06 12:08:27 +02:00
Jussi Saurio
2742278e5a
decouple functions from struct Value
2025-12-06 10:42:01 +02:00
Preston Thorpe
c09c30746e
Merge 'guard subjournal access within single connection' from Nikita Sivukhin
...
Right now turso can panic with various asserts if 2 or more write
statements will be executed over single connection concurrently:
```
thread 'query_processing::test_write_path::api_misuse' panicked at core/storage/pager.rs:776:9:
subjournal offset should be 0
```
This PR adds explicit guard for subjournal access which will return
`Busy` for the operation internally and lead to wait condition for the
statement until subjournal ownership will be released and can be re-
acquired again.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4110
2025-12-05 13:14:07 -05:00
Nikita Sivukhin
659ef7c079
fix clippy
2025-12-05 21:39:35 +04:00
Nikita Sivukhin
487854e6d6
guard subjournal access in order to prevent concurrent operations over it within same connection
2025-12-05 21:25:13 +04:00
Jussi Saurio
e0791406b5
Fix two bugs with compound selects
...
1. Previously compound select would not start a transaction if
the rightmost subselect had no table references, e.g.
SELECT * FROM t UNION VALUES(1)
2. Previously the column names for the query were taken from the
rightmost subselect - instead, they should be taken from the
leftmost subselect.
2025-12-05 17:14:58 +02:00
Jussi Saurio
39325fdca9
Merge 'Subsec failed to format as YYYY-MM-DD HH:MM:SS.SSS' from
...
Sqlite docs reference: https://sqlite.org/lang_datefunc.html
The datetime() function returns the date and time formatted as YYYY-MM-
DD HH:MM:SS or as YYYY-MM-DD HH:MM:SS.SSS if the [subsec
modifier](https://sqlite.org/lang_datefunc.html#subsec ) is used.
### Failed before changes:
``` sql
INSERT INTO test_results VALUES (
'Fixed DateTime Expansion',
datetime('2024-01-01 12:00:00', 'subsec'),
'2024-01-01 12:00:00.000',
CASE
WHEN datetime('2024-01-01 12:00:00', 'subsec') = '2024-01-01 12:00:00.000' THEN 'PASS'
ELSE 'FAIL'
END,
'Adds .000 to fixed time'
);
```
<img width="1164" height="94" alt="Screenshot 2025-12-04 130827"
src="https://github.com/user-
attachments/assets/d09dbe19-d329-4b88-a727-8f92e79e0a10" />
``` sql
INSERT INTO test_results VALUES (
'Date Only Input',
datetime('2024-01-01', 'subsec'),
'2024-01-01 00:00:00.000',
CASE WHEN datetime('2024-01-01', 'subsec') = '2024-01-01 00:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
'Expands date to midnight.000'
);
INSERT INTO test_results VALUES (
'ISO "T" Separator',
datetime('2024-01-01T15:30:00', 'subsec'),
'2024-01-01 15:30:00.000',
CASE WHEN datetime('2024-01-01T15:30:00', 'subsec') = '2024-01-01 15:30:00.000' THEN 'PASS' ELSE 'FAIL' END,
'Replaces T with space'
);
INSERT INTO test_results VALUES (
'Chain: Subsec then Add',
datetime('2024-01-01 12:00:00', 'subsec', '+1 hour'),
'2024-01-01 13:00:00.000',
CASE WHEN datetime('2024-01-01 12:00:00', 'subsec', '+1 hour') = '2024-01-01 13:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
'Flag persists after +1 hour'
);
INSERT INTO test_results VALUES (
'Chain: Add then Subsec',
datetime('2024-01-01 12:00:00', '+1 hour', 'subsec'),
'2024-01-01 13:00:00.000',
CASE WHEN datetime('2024-01-01 12:00:00', '+1 hour', 'subsec') = '2024-01-01 13:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
'Standard order works'
);
INSERT INTO test_results VALUES (
'Case Insensitivity',
datetime('2024-01-01 12:00:00', 'SuBsEc'),
'2024-01-01 12:00:00.000',
CASE WHEN datetime('2024-01-01 12:00:00', 'SuBsEc') = '2024-01-01 12:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
'SuBsEc works'
);
```
<img width="1206" height="340" alt="Screenshot 2025-12-04 131114"
src="https://github.com/user-
attachments/assets/72fd7fe2-c991-4a92-b3d7-2f4516404fa7" />
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #4097
2025-12-05 15:31:56 +02:00
Jussi Saurio
ca605b368b
Merge 'Fix IN operator translation logic' from Nikita Sivukhin
...
This PR replaces incorrect usage of `program.resolve_label(...)` to the
correct method `program.preassign_label_to_next_insn(...)` for IN
operator translation code.
Following query translated with bug before the fix:
```sql
turso> CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
```
Before `emit_constant_insns` optimization query plan was correct
```sql
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 16 0 0 Start at 16
1 Null 0 2 0 0 r[2]=NULL
2 OpenRead 0 2 0 0 table=t, root=2, iDb=0
3 Rewind 0 12 0 0 Rewind table t
4 Column 0 1 3 0 r[3]=t.name
5 String8 0 4 0 alice 0 r[4]='alice'
6 Eq 3 4 9 Binary 0 if r[3]==r[4] goto 9
7 String8 0 5 0 bob 0 r[5]='bob'
8 Ne 3 5 11 Binary 0 if r[3]!=r[5] goto 11
9 Integer 1 6 0 0 r[6]=1
10 AggStep 0 6 2 count 0 accum=r[2] step(r[6])
11 Next 0 4 0 0
12 AggFinal 0 2 0 count 0 accum=r[2]
13 Copy 2 1 0 0 r[1]=r[2]
14 ResultRow 1 1 0 0 output=r[1]
15 Halt 0 0 0 0
16 Transaction 0 1 1 0 iDb=0 tx_mode=Read
17 Goto 0 1 0 0
```
But the problem is that after `emit_constant_insns`, `Eq` jump target
was rewritten as it was binded to the 9th op code (Integer 1) instead of
the next op code after the IN translated block (in the final plan, note
jump to the 16 address for Eq instruction)
```sql
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 13 0 0 Start at 13
1 Null 0 2 0 0 r[2]=NULL
2 OpenRead 0 2 0 0 table=t, root=2, iDb=0
3 Rewind 0 9 0 0 Rewind table t
4 Column 0 1 3 0 r[3]=t.name
5 Eq 3 4 16 Binary 0 if r[3]==r[4] goto 16
6 Ne 3 5 8 Binary 0 if r[3]!=r[5] goto 8
7 AggStep 0 6 2 count 0 accum=r[2] step(r[6])
8 Next 0 4 0 0
9 AggFinal 0 2 0 count 0 accum=r[2]
10 Copy 2 1 0 0 r[1]=r[2]
11 ResultRow 1 1 0 0 output=r[1]
12 Halt 0 0 0 0
13 Transaction 0 1 1 0 iDb=0 tx_mode=Read
14 String8 0 4 0 alice 0 r[4]='alice'
15 String8 0 5 0 bob 0 r[5]='bob'
16 Integer 1 6 0 0 r[6]=1
17 Goto 0 1 0 0
```
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4105
2025-12-05 15:31:32 +02:00
Jussi Saurio
74296e52bb
Merge 'Automatically Propagate Encryption options' from Pedro Muniz
...
On database open, we store the Encryption Options and pass them onwards
to the Connection, Pager and Wal. We also have slight gain in
ergonomics, as we don't have set the Pragma's for the `cipher` and
`hexkey` on each new `Connection`.
I needed this logic, because I will need to initialize a Default Header
for empty DBs and encryption opts not being automatically propagated was
hindering me for this.
**Ai Disclosure**
Claude helped me debug and find out issues in my implementation
cc @avinassh
Reviewed-by: Avinash Sajjanshetty (@avinassh)
Closes #4100
2025-12-05 15:31:17 +02:00
Nikita Sivukhin
17695438ac
fix IN operator translation bug
...
- following query translated with bug before the fix:
turso> CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
- before emit_constant_insns optimization query plan was correct
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 16 0 0 Start at 16
1 Null 0 2 0 0 r[2]=NULL
2 OpenRead 0 2 0 0 table=t, root=2, iDb=0
3 Rewind 0 12 0 0 Rewind table t
4 Column 0 1 3 0 r[3]=t.name
5 String8 0 4 0 alice 0 r[4]='alice'
6 Eq 3 4 9 Binary 0 if r[3]==r[4] goto 9
7 String8 0 5 0 bob 0 r[5]='bob'
8 Ne 3 5 11 Binary 0 if r[3]!=r[5] goto 11
9 Integer 1 6 0 0 r[6]=1
10 AggStep 0 6 2 count 0 accum=r[2] step(r[6])
11 Next 0 4 0 0
12 AggFinal 0 2 0 count 0 accum=r[2]
13 Copy 2 1 0 0 r[1]=r[2]
14 ResultRow 1 1 0 0 output=r[1]
15 Halt 0 0 0 0
16 Transaction 0 1 1 0 iDb=0 tx_mode=Read
17 Goto 0 1 0 0
- but the problem is that after emit_constant_insns Eq jump target was rewritten as it was binded to the 9th op code (Integer 1) instead of the next op code after the IN translated block (in the final plan, note jump to the 16 address for Eq instruction)
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr opcode p1 p2 p3 p4 p5 comment
---- ----------------- ---- ---- ---- ------------- -- -------
0 Init 0 13 0 0 Start at 13
1 Null 0 2 0 0 r[2]=NULL
2 OpenRead 0 2 0 0 table=t, root=2, iDb=0
3 Rewind 0 9 0 0 Rewind table t
4 Column 0 1 3 0 r[3]=t.name
5 Eq 3 4 16 Binary 0 if r[3]==r[4] goto 16
6 Ne 3 5 8 Binary 0 if r[3]!=r[5] goto 8
7 AggStep 0 6 2 count 0 accum=r[2] step(r[6])
8 Next 0 4 0 0
9 AggFinal 0 2 0 count 0 accum=r[2]
10 Copy 2 1 0 0 r[1]=r[2]
11 ResultRow 1 1 0 0 output=r[1]
12 Halt 0 0 0 0
13 Transaction 0 1 1 0 iDb=0 tx_mode=Read
14 String8 0 4 0 alice 0 r[4]='alice'
15 String8 0 5 0 bob 0 r[5]='bob'
16 Integer 1 6 0 0 r[6]=1
17 Goto 0 1 0 0
2025-12-05 14:52:09 +04:00
Jussi Saurio
eb782ce2d4
fix/mvcc: seek() must seek from both mv store and btree
...
for example, upon opening an existing database, all the rows are in
the btree, so if we seek only from MV store, we won't find anything.
ergo: we must look from both the mv store and the btree. if we are
iterating forwards, the smallest of the two results is where we land,
and vice versa for backwards iteration.
initially this implementation used blocking IO but was refactored to
use state machines after the rest of the Cursor methods in the MVCC cursor
module were refactored to do that too.
---
this PR was initially almost entirely written using Claude Code + Opus 4.5,
but heavily manually cleaned up as the AI made the state machine refactor
far too complicated.
2025-12-05 11:53:16 +02:00
Jussi Saurio
8f57c60c26
Merge 'core/mvcc: state machines for prev, next, exists, rewind, last' from Pere Diaz Bou
...
Build & publish @tursodatabase/database / db-bindings-x86_64-pc-windows-msvc - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / db-bindings-x86_64-unknown-linux-gnu - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-aarch64-apple-darwin - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-aarch64-unknown-linux-gnu - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-wasm32-wasip1-threads - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-x86_64-pc-windows-msvc - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-x86_64-unknown-linux-gnu - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / Test DB bindings on Linux-x64-gnu - node@20 (push) Blocked by required conditions
Build & publish @tursodatabase/database / Test DB bindings on browser@20 (push) Blocked by required conditions
Build & publish @tursodatabase/database / Publish (push) Blocked by required conditions
Python / configure-strategy (push) Waiting to run
Python / lint (push) Waiting to run
Python / test (push) Blocked by required conditions
Python / linux (x86_64) (push) Waiting to run
Python / macos-arm64 (aarch64) (push) Waiting to run
Python / Release (push) Blocked by required conditions
Python / sdist (push) Waiting to run
Rust / cargo-fmt-check (push) Waiting to run
Rust / build-native (blacksmith-4vcpu-ubuntu-2404) (push) Waiting to run
Rust / build-native (macos-latest) (push) Waiting to run
Rust / simulator (push) Waiting to run
Rust / test-limbo (push) Waiting to run
Rust / test-sqlite (push) Waiting to run
Rust / build-native (windows-latest) (push) Waiting to run
Rust / clippy (push) Waiting to run
Rust Benchmarks+Nyrkiö / bench (push) Waiting to run
Rust Benchmarks+Nyrkiö / clickbench (push) Waiting to run
Rust Benchmarks+Nyrkiö / tpc-h-criterion (push) Waiting to run
Rust Benchmarks+Nyrkiö / tpc-h (push) Waiting to run
Rust Benchmarks+Nyrkiö / vfs-bench-compile (push) Waiting to run
Basically bring back state machines for: `next`, `prev`, `last`,
`rewind`, `get_next_rowid`.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4095
2025-12-05 10:08:00 +02:00
pedrocarlo
ee73bab743
get correct reserved bytes if Cipher is not None
2025-12-05 02:04:06 -03:00
pedrocarlo
a311c966a2
set encryption context for page and wal in init_pager
2025-12-05 02:04:06 -03:00
pedrocarlo
889322f6b5
do not call pragmas related to encryption on connect or open
2025-12-05 02:04:06 -03:00
pedrocarlo
0118a65169
pass encryption opts from the database to the connection on connect
2025-12-05 02:04:06 -03:00
pedrocarlo
85b212056d
separate init function for connect
2025-12-05 02:04:06 -03:00
pedrocarlo
1a43de35ce
add encryption key and cipher to Database struct
2025-12-05 02:04:06 -03:00
pedrocarlo
faca85de2f
pass pager to _connect and share initial coon for boostrapping mvcc
2025-12-05 02:04:05 -03:00
PThorpe92
170ed26732
Fix NULL handling of hash join comparison and bug in cursor override logic
2025-12-04 17:21:46 -05:00
PThorpe92
d0f15d1537
Fix bloom filter impl to skip NULL values
2025-12-04 16:09:49 -05:00
PThorpe92
4225bec5e9
Apply some review suggestions
2025-12-04 16:09:49 -05:00
PThorpe92
f3affbba2b
Fix condition evaluation that references hash build table now that its no longer in join order
2025-12-04 16:09:49 -05:00
PThorpe92
e8ef5dbf06
Cache result column values in the hashtable to prevent additional SeekRowID
2025-12-04 16:09:48 -05:00
PThorpe92
ad7d34bb67
Make hash joins follow pattern of ephemeral indexes instead of hoisting special logic
2025-12-04 16:09:48 -05:00
PThorpe92
eebb4950c6
Apply some review suggestions/cleanups
2025-12-04 16:09:48 -05:00
PThorpe92
866b153a7a
Fix clippy warnings
2025-12-04 16:09:47 -05:00
PThorpe92
746fe82159
Add/update some comments in the query planner
2025-12-04 16:09:47 -05:00
PThorpe92
bf61feb6bc
Fix hash join planning/emission for > 2 way joins
2025-12-04 16:09:47 -05:00
PThorpe92
fd93544af3
Remove unneeded label resolution in close_loop method for hash join build tbl
2025-12-04 16:09:46 -05:00
PThorpe92
9b274d9243
Prevent using hash join for any query with an index on a join key
2025-12-04 16:09:46 -05:00
PThorpe92
bc5406be04
Add memory py tests for hash spilling, fix/finish spilling impl in hashtable
2025-12-04 16:09:46 -05:00
PThorpe92
bea9648a98
Fix hash join spilling for async IO by keeping buffer alive in callback
2025-12-04 16:09:45 -05:00
PThorpe92
0da294c915
Implement spilling to disk for hash joins
2025-12-04 16:09:45 -05:00
PThorpe92
f6e678c133
Fix cardinality in selecting jash join in optimizer
2025-12-04 16:09:45 -05:00
PThorpe92
5c1698da45
Support equijoin conditions for hash joins using non-column expressions
2025-12-04 16:09:44 -05:00
PThorpe92
3b788c2c74
Impl plumbing to incorporate hash joins into query planner and emitter
2025-12-04 16:09:44 -05:00
PThorpe92
1673b4bb3a
Add opcodes needed for hash join in vdbe
2025-12-04 16:09:44 -05:00
PThorpe92
4f0966942f
create hash_table.rs with basic implementation for hash joins
2025-12-04 16:09:44 -05:00
PThorpe92
a3a2ff85d4
Add OpHashBuildState to vdbe program state to make hash building reentrant
2025-12-04 16:09:43 -05:00
PThorpe92
8cc95d227e
Add helper function to determine if a table is the build table for a hash join
2025-12-04 16:09:43 -05:00
PThorpe92
11cbbeada4
Add new Operation::HashJoin and plumbing to prepare for adding hash joins
2025-12-04 16:09:43 -05:00
PThorpe92
96e1aba6fd
Add HashJoin to AccessParamMethods enum
2025-12-04 16:09:43 -05:00
PThorpe92
291965132e
Fix panic in optimizer when usable constraint refs is empty
2025-12-04 15:43:08 -05:00
PThorpe92
d3579bf702
Fix LEFT JOIN with multiple constraint on join key of ephemeral index
2025-12-04 14:58:08 -05:00
Pere Diaz Bou
3cca372a97
core/mvcc: clippy and some renaming
2025-12-04 19:32:43 +01:00
Pere Diaz Bou
ba1dff71d7
core/mvcc: get_next_rowid return on last io
2025-12-04 19:32:43 +01:00
Pere Diaz Bou
3b10370a30
core/mvcc: reset state on prev
2025-12-04 19:31:41 +01:00
Pere Diaz Bou
7853a47292
core/mvcc/cursor: prev
2025-12-04 19:31:41 +01:00
Pere Diaz Bou
e3cfbfbcd8
core/mvcc: state machine for next and rewind
2025-12-04 19:31:41 +01:00