Commit graph

6349 commits

Author SHA1 Message Date
Jussi Saurio
c7672b952b Use Cow for Value::Blob to prevent copies in op_column 2025-12-06 12:08:27 +02:00
Jussi Saurio
2742278e5a decouple functions from struct Value 2025-12-06 10:42:01 +02:00
Preston Thorpe
c09c30746e
Merge 'guard subjournal access within single connection' from Nikita Sivukhin
Right now turso can panic with various asserts if 2 or more write
statements will be executed over single connection concurrently:
```
thread 'query_processing::test_write_path::api_misuse' panicked at core/storage/pager.rs:776:9:
subjournal offset should be 0
```
This PR adds explicit guard for subjournal access which will return
`Busy` for the operation internally and lead to wait condition for the
statement until subjournal ownership will be released and can be re-
acquired again.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4110
2025-12-05 13:14:07 -05:00
Nikita Sivukhin
659ef7c079 fix clippy 2025-12-05 21:39:35 +04:00
Nikita Sivukhin
487854e6d6 guard subjournal access in order to prevent concurrent operations over it within same connection 2025-12-05 21:25:13 +04:00
Jussi Saurio
e0791406b5 Fix two bugs with compound selects
1. Previously compound select would not start a transaction if
   the rightmost subselect had no table references, e.g.
   SELECT * FROM t UNION VALUES(1)

2. Previously the column names for the query were taken from the
   rightmost subselect - instead, they should be taken from the
   leftmost subselect.
2025-12-05 17:14:58 +02:00
Jussi Saurio
39325fdca9
Merge 'Subsec failed to format as YYYY-MM-DD HH:MM:SS.SSS' from
Sqlite docs reference: https://sqlite.org/lang_datefunc.html
The datetime() function returns the date and time formatted as YYYY-MM-
DD HH:MM:SS or as YYYY-MM-DD HH:MM:SS.SSS if the [subsec
modifier](https://sqlite.org/lang_datefunc.html#subsec) is used.
### Failed before changes:
``` sql
INSERT INTO test_results VALUES (
    'Fixed DateTime Expansion',
    datetime('2024-01-01 12:00:00', 'subsec'), 
    '2024-01-01 12:00:00.000',
    CASE 
        WHEN datetime('2024-01-01 12:00:00', 'subsec') = '2024-01-01 12:00:00.000' THEN 'PASS' 
        ELSE 'FAIL' 
    END,
    'Adds .000 to fixed time'
);
```
<img width="1164" height="94" alt="Screenshot 2025-12-04 130827"
src="https://github.com/user-
attachments/assets/d09dbe19-d329-4b88-a727-8f92e79e0a10" />
``` sql
INSERT INTO test_results VALUES (
    'Date Only Input',
    datetime('2024-01-01', 'subsec'), 
    '2024-01-01 00:00:00.000',
    CASE WHEN datetime('2024-01-01', 'subsec') = '2024-01-01 00:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Expands date to midnight.000'
);

INSERT INTO test_results VALUES (
    'ISO "T" Separator',
    datetime('2024-01-01T15:30:00', 'subsec'), 
    '2024-01-01 15:30:00.000',
    CASE WHEN datetime('2024-01-01T15:30:00', 'subsec') = '2024-01-01 15:30:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Replaces T with space'
);

INSERT INTO test_results VALUES (
    'Chain: Subsec then Add',
    datetime('2024-01-01 12:00:00', 'subsec', '+1 hour'), 
    '2024-01-01 13:00:00.000',
    CASE WHEN datetime('2024-01-01 12:00:00', 'subsec', '+1 hour') = '2024-01-01 13:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Flag persists after +1 hour'
);

INSERT INTO test_results VALUES (
    'Chain: Add then Subsec',
    datetime('2024-01-01 12:00:00', '+1 hour', 'subsec'), 
    '2024-01-01 13:00:00.000',
    CASE WHEN datetime('2024-01-01 12:00:00', '+1 hour', 'subsec') = '2024-01-01 13:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Standard order works'
);

INSERT INTO test_results VALUES (
    'Case Insensitivity',
    datetime('2024-01-01 12:00:00', 'SuBsEc'), 
    '2024-01-01 12:00:00.000',
    CASE WHEN datetime('2024-01-01 12:00:00', 'SuBsEc') = '2024-01-01 12:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'SuBsEc works'
);
```
<img width="1206" height="340" alt="Screenshot 2025-12-04 131114"
src="https://github.com/user-
attachments/assets/72fd7fe2-c991-4a92-b3d7-2f4516404fa7" />

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4097
2025-12-05 15:31:56 +02:00
Jussi Saurio
ca605b368b
Merge 'Fix IN operator translation logic' from Nikita Sivukhin
This PR replaces incorrect usage of `program.resolve_label(...)` to the
correct method `program.preassign_label_to_next_insn(...)` for IN
operator translation code.
Following query translated with bug before the fix:
```sql
turso> CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
```
Before `emit_constant_insns` optimization query plan was correct
```sql
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     12    0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       String8          0     4     0     alice          0   r[4]='alice'
6       Eq               3     4     9     Binary         0   if r[3]==r[4] goto 9
7       String8          0     5     0     bob            0   r[5]='bob'
8       Ne               3     5     11    Binary         0   if r[3]!=r[5] goto 11
9       Integer          1     6     0                    0   r[6]=1
10      AggStep          0     6     2     count          0   accum=r[2] step(r[6])
11    Next               0     4     0                    0
12    AggFinal           0     2     0     count          0   accum=r[2]
13    Copy               2     1     0                    0   r[1]=r[2]
14    ResultRow          1     1     0                    0   output=r[1]
15    Halt               0     0     0                    0
16    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
17    Goto               0     1     0                    0
```
But the problem is that after `emit_constant_insns`,  `Eq` jump target
was rewritten as it was binded to the 9th op code (Integer 1) instead of
the next op code after the IN translated block (in the final plan, note
jump to the 16 address for Eq instruction)
```sql
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     9     0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       Eq               3     4     16    Binary         0   if r[3]==r[4] goto 16
6       Ne               3     5     8     Binary         0   if r[3]!=r[5] goto 8
7       AggStep          0     6     2     count          0   accum=r[2] step(r[6])
8     Next               0     4     0                    0
9     AggFinal           0     2     0     count          0   accum=r[2]
10    Copy               2     1     0                    0   r[1]=r[2]
11    ResultRow          1     1     0                    0   output=r[1]
12    Halt               0     0     0                    0
13    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
14    String8            0     4     0     alice          0   r[4]='alice'
15    String8            0     5     0     bob            0   r[5]='bob'
16    Integer            1     6     0                    0   r[6]=1
17    Goto               0     1     0                    0
```

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4105
2025-12-05 15:31:32 +02:00
Jussi Saurio
74296e52bb
Merge 'Automatically Propagate Encryption options' from Pedro Muniz
On database open, we store the Encryption Options and pass them onwards
to the Connection, Pager and Wal. We also have slight gain in
ergonomics, as we don't have set the Pragma's for the `cipher` and
`hexkey` on each new `Connection`.
I needed this logic, because I will need to initialize a Default Header
for empty DBs and encryption opts not being automatically propagated was
hindering me for this.
**Ai Disclosure**
Claude helped me debug and find out issues in my implementation
cc @avinassh

Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #4100
2025-12-05 15:31:17 +02:00
Nikita Sivukhin
17695438ac fix IN operator translation bug
- following query translated with bug before the fix:

turso> CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');

- before emit_constant_insns optimization query plan was correct

turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     12    0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       String8          0     4     0     alice          0   r[4]='alice'
6       Eq               3     4     9     Binary         0   if r[3]==r[4] goto 9
7       String8          0     5     0     bob            0   r[5]='bob'
8       Ne               3     5     11    Binary         0   if r[3]!=r[5] goto 11
9       Integer          1     6     0                    0   r[6]=1
10      AggStep          0     6     2     count          0   accum=r[2] step(r[6])
11    Next               0     4     0                    0
12    AggFinal           0     2     0     count          0   accum=r[2]
13    Copy               2     1     0                    0   r[1]=r[2]
14    ResultRow          1     1     0                    0   output=r[1]
15    Halt               0     0     0                    0
16    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
17    Goto               0     1     0                    0

- but the problem is that after emit_constant_insns Eq jump target was rewritten as it was binded to the 9th op code (Integer 1) instead of the next op code after the IN translated block (in the final plan, note jump to the 16 address for Eq instruction)

turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     9     0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       Eq               3     4     16    Binary         0   if r[3]==r[4] goto 16
6       Ne               3     5     8     Binary         0   if r[3]!=r[5] goto 8
7       AggStep          0     6     2     count          0   accum=r[2] step(r[6])
8     Next               0     4     0                    0
9     AggFinal           0     2     0     count          0   accum=r[2]
10    Copy               2     1     0                    0   r[1]=r[2]
11    ResultRow          1     1     0                    0   output=r[1]
12    Halt               0     0     0                    0
13    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
14    String8            0     4     0     alice          0   r[4]='alice'
15    String8            0     5     0     bob            0   r[5]='bob'
16    Integer            1     6     0                    0   r[6]=1
17    Goto               0     1     0                    0
2025-12-05 14:52:09 +04:00
Jussi Saurio
eb782ce2d4 fix/mvcc: seek() must seek from both mv store and btree
for example, upon opening an existing database, all the rows are in
the btree, so if we seek only from MV store, we won't find anything.
ergo: we must look from both the mv store and the btree. if we are
iterating forwards, the smallest of the two results is where we land,
and vice versa for backwards iteration.

initially this implementation used blocking IO but was refactored to
use state machines after the rest of the Cursor methods in the MVCC cursor
module were refactored to do that too.

---

this PR was initially almost entirely written using Claude Code + Opus 4.5,
but heavily manually cleaned up as the AI made the state machine refactor
far too complicated.
2025-12-05 11:53:16 +02:00
Jussi Saurio
8f57c60c26
Merge 'core/mvcc: state machines for prev, next, exists, rewind, last' from Pere Diaz Bou
Some checks are pending
Build & publish @tursodatabase/database / db-bindings-x86_64-pc-windows-msvc - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / db-bindings-x86_64-unknown-linux-gnu - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-aarch64-apple-darwin - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-aarch64-unknown-linux-gnu - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-wasm32-wasip1-threads - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-x86_64-pc-windows-msvc - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / sync-bindings-x86_64-unknown-linux-gnu - node@20 (push) Waiting to run
Build & publish @tursodatabase/database / Test DB bindings on Linux-x64-gnu - node@20 (push) Blocked by required conditions
Build & publish @tursodatabase/database / Test DB bindings on browser@20 (push) Blocked by required conditions
Build & publish @tursodatabase/database / Publish (push) Blocked by required conditions
Python / configure-strategy (push) Waiting to run
Python / lint (push) Waiting to run
Python / test (push) Blocked by required conditions
Python / linux (x86_64) (push) Waiting to run
Python / macos-arm64 (aarch64) (push) Waiting to run
Python / Release (push) Blocked by required conditions
Python / sdist (push) Waiting to run
Rust / cargo-fmt-check (push) Waiting to run
Rust / build-native (blacksmith-4vcpu-ubuntu-2404) (push) Waiting to run
Rust / build-native (macos-latest) (push) Waiting to run
Rust / simulator (push) Waiting to run
Rust / test-limbo (push) Waiting to run
Rust / test-sqlite (push) Waiting to run
Rust / build-native (windows-latest) (push) Waiting to run
Rust / clippy (push) Waiting to run
Rust Benchmarks+Nyrkiö / bench (push) Waiting to run
Rust Benchmarks+Nyrkiö / clickbench (push) Waiting to run
Rust Benchmarks+Nyrkiö / tpc-h-criterion (push) Waiting to run
Rust Benchmarks+Nyrkiö / tpc-h (push) Waiting to run
Rust Benchmarks+Nyrkiö / vfs-bench-compile (push) Waiting to run
Basically bring back state machines for: `next`, `prev`, `last`,
`rewind`, `get_next_rowid`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4095
2025-12-05 10:08:00 +02:00
pedrocarlo
ee73bab743 get correct reserved bytes if Cipher is not None 2025-12-05 02:04:06 -03:00
pedrocarlo
a311c966a2 set encryption context for page and wal in init_pager 2025-12-05 02:04:06 -03:00
pedrocarlo
889322f6b5 do not call pragmas related to encryption on connect or open 2025-12-05 02:04:06 -03:00
pedrocarlo
0118a65169 pass encryption opts from the database to the connection on connect 2025-12-05 02:04:06 -03:00
pedrocarlo
85b212056d separate init function for connect 2025-12-05 02:04:06 -03:00
pedrocarlo
1a43de35ce add encryption key and cipher to Database struct 2025-12-05 02:04:06 -03:00
pedrocarlo
faca85de2f pass pager to _connect and share initial coon for boostrapping mvcc 2025-12-05 02:04:05 -03:00
PThorpe92
170ed26732
Fix NULL handling of hash join comparison and bug in cursor override logic 2025-12-04 17:21:46 -05:00
PThorpe92
d0f15d1537
Fix bloom filter impl to skip NULL values 2025-12-04 16:09:49 -05:00
PThorpe92
4225bec5e9
Apply some review suggestions 2025-12-04 16:09:49 -05:00
PThorpe92
f3affbba2b
Fix condition evaluation that references hash build table now that its no longer in join order 2025-12-04 16:09:49 -05:00
PThorpe92
e8ef5dbf06
Cache result column values in the hashtable to prevent additional SeekRowID 2025-12-04 16:09:48 -05:00
PThorpe92
ad7d34bb67
Make hash joins follow pattern of ephemeral indexes instead of hoisting special logic 2025-12-04 16:09:48 -05:00
PThorpe92
eebb4950c6
Apply some review suggestions/cleanups 2025-12-04 16:09:48 -05:00
PThorpe92
866b153a7a
Fix clippy warnings 2025-12-04 16:09:47 -05:00
PThorpe92
746fe82159
Add/update some comments in the query planner 2025-12-04 16:09:47 -05:00
PThorpe92
bf61feb6bc
Fix hash join planning/emission for > 2 way joins 2025-12-04 16:09:47 -05:00
PThorpe92
fd93544af3
Remove unneeded label resolution in close_loop method for hash join build tbl 2025-12-04 16:09:46 -05:00
PThorpe92
9b274d9243
Prevent using hash join for any query with an index on a join key 2025-12-04 16:09:46 -05:00
PThorpe92
bc5406be04
Add memory py tests for hash spilling, fix/finish spilling impl in hashtable 2025-12-04 16:09:46 -05:00
PThorpe92
bea9648a98
Fix hash join spilling for async IO by keeping buffer alive in callback 2025-12-04 16:09:45 -05:00
PThorpe92
0da294c915
Implement spilling to disk for hash joins 2025-12-04 16:09:45 -05:00
PThorpe92
f6e678c133
Fix cardinality in selecting jash join in optimizer 2025-12-04 16:09:45 -05:00
PThorpe92
5c1698da45
Support equijoin conditions for hash joins using non-column expressions 2025-12-04 16:09:44 -05:00
PThorpe92
3b788c2c74
Impl plumbing to incorporate hash joins into query planner and emitter 2025-12-04 16:09:44 -05:00
PThorpe92
1673b4bb3a
Add opcodes needed for hash join in vdbe 2025-12-04 16:09:44 -05:00
PThorpe92
4f0966942f
create hash_table.rs with basic implementation for hash joins 2025-12-04 16:09:44 -05:00
PThorpe92
a3a2ff85d4
Add OpHashBuildState to vdbe program state to make hash building reentrant 2025-12-04 16:09:43 -05:00
PThorpe92
8cc95d227e
Add helper function to determine if a table is the build table for a hash join 2025-12-04 16:09:43 -05:00
PThorpe92
11cbbeada4
Add new Operation::HashJoin and plumbing to prepare for adding hash joins 2025-12-04 16:09:43 -05:00
PThorpe92
96e1aba6fd
Add HashJoin to AccessParamMethods enum 2025-12-04 16:09:43 -05:00
PThorpe92
291965132e
Fix panic in optimizer when usable constraint refs is empty 2025-12-04 15:43:08 -05:00
PThorpe92
d3579bf702
Fix LEFT JOIN with multiple constraint on join key of ephemeral index 2025-12-04 14:58:08 -05:00
Pere Diaz Bou
3cca372a97 core/mvcc: clippy and some renaming 2025-12-04 19:32:43 +01:00
Pere Diaz Bou
ba1dff71d7 core/mvcc: get_next_rowid return on last io 2025-12-04 19:32:43 +01:00
Pere Diaz Bou
3b10370a30 core/mvcc: reset state on prev 2025-12-04 19:31:41 +01:00
Pere Diaz Bou
7853a47292 core/mvcc/cursor: prev 2025-12-04 19:31:41 +01:00
Pere Diaz Bou
e3cfbfbcd8 core/mvcc: state machine for next and rewind 2025-12-04 19:31:41 +01:00