Commit graph

11355 commits

Author SHA1 Message Date
Jussi Saurio
c7672b952b Use Cow for Value::Blob to prevent copies in op_column 2025-12-06 12:08:27 +02:00
Jussi Saurio
2742278e5a decouple functions from struct Value 2025-12-06 10:42:01 +02:00
Jussi Saurio
299ccdaee4
Merge 'sim: stop ignoring sql execution errors' from Jussi Saurio
Reviewed-by: Pedro Muniz (@pedrocarlo)

Closes #4106
2025-12-05 23:04:20 +02:00
Preston Thorpe
231282a5be
Merge 'Simulator Roadmap' from Alperen Keleş
This PR is a working doc on a roadmap for the simulator. @pedrocarlo
@LeMikaelF please take a look.

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #3954
2025-12-05 13:27:57 -05:00
Preston Thorpe
c09c30746e
Merge 'guard subjournal access within single connection' from Nikita Sivukhin
Right now Turso can panic with various assertion failures if two or more
write statements are executed concurrently over a single connection:
```
thread 'query_processing::test_write_path::api_misuse' panicked at core/storage/pager.rs:776:9:
subjournal offset should be 0
```
This PR adds an explicit guard for subjournal access. The guard returns
`Busy` for the operation internally, which makes the statement wait until
subjournal ownership is released and can be re-acquired.
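
The guard described above can be pictured with a small ownership model. This is only a sketch under assumptions: the class name `SubjournalGuard`, its methods, and the string results are hypothetical and are not taken from the Turso code base.

```python
class SubjournalGuard:
    """Toy ownership guard: one statement at a time may use the subjournal.

    Hypothetical illustration only; not Turso's implementation.
    """

    def __init__(self):
        self.owner = None  # id of the statement that currently owns the subjournal

    def try_acquire(self, statement_id):
        # The owner may re-enter; any other statement is told to wait.
        if self.owner is None or self.owner == statement_id:
            self.owner = statement_id
            return "Ok"
        return "Busy"

    def release(self, statement_id):
        # Only the current owner can give up ownership.
        if self.owner == statement_id:
            self.owner = None


guard = SubjournalGuard()
first = guard.try_acquire("stmt-1")   # first writer takes ownership
second = guard.try_acquire("stmt-2")  # concurrent writer is refused
guard.release("stmt-1")
retry = guard.try_acquire("stmt-2")   # succeeds once ownership is released
```

In this model the second statement sees `Busy` and retries later instead of tripping an assertion.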

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4110
2025-12-05 13:14:07 -05:00
Preston Thorpe
e7c7f232b4
Merge 'testing/fuzz: Add new fuzzer for joins' from Preston Thorpe
needed for #4063 to merge, currently passing on main but just want to
lower the already huge diff

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4103
2025-12-05 13:13:44 -05:00
Nikita Sivukhin
659ef7c079 fix clippy 2025-12-05 21:39:35 +04:00
Nikita Sivukhin
487854e6d6 guard subjournal access in order to prevent concurrent operations over it within same connection 2025-12-05 21:25:13 +04:00
PThorpe92
c9a6827011
Extract out join fuzzer to an additional test on indexed columns 2025-12-05 12:03:23 -05:00
Preston Thorpe
97eb482885
Merge 'Fix two bugs with compound selects' from Jussi Saurio
Closes #4108
1. Previously compound select would not start a transaction if the
rightmost subselect had no table references, e.g. SELECT * FROM t UNION
VALUES(1)
2. Previously the column names for the query were taken from the
rightmost subselect - instead, they should be taken from the leftmost
subselect.
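
As a point of comparison, the column-naming rule in item 2 can be checked against stock SQLite through Python's standard `sqlite3` module. This sketch exercises SQLite, not Turso, and the table `t` with its single column `a` is made up for the example.

```python
import sqlite3

# In-memory SQLite database used purely as a reference implementation.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER)")
conn.execute("INSERT INTO t VALUES (2)")

# The rightmost subselect (VALUES) has no table reference and no column name.
cur = conn.execute("SELECT * FROM t UNION VALUES(1)")
column_names = [d[0] for d in cur.description]
rows = sorted(cur.fetchall())
conn.close()
```

The reported column name comes from the leftmost subselect (`a`), which is the behavior the fix restores.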

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4109
2025-12-05 11:51:38 -05:00
Jussi Saurio
58a25f7f5f
Merge 'Turso sdk kit version' from Nikita Sivukhin
This PR adds a simple `turso_version()` function to the sdk-kit crate

Closes #4099
2025-12-05 17:27:28 +02:00
Jussi Saurio
a90087bcf6 Enable compound_select_fuzz for mvcc because it works as a regression test for #4108 2025-12-05 17:19:05 +02:00
Jussi Saurio
e0791406b5 Fix two bugs with compound selects
1. Previously compound select would not start a transaction if
   the rightmost subselect had no table references, e.g.
   SELECT * FROM t UNION VALUES(1)

2. Previously the column names for the query were taken from the
   rightmost subselect - instead, they should be taken from the
   leftmost subselect.
2025-12-05 17:14:58 +02:00
Jussi Saurio
39325fdca9
Merge 'Subsec failed to format as YYYY-MM-DD HH:MM:SS.SSS' from
SQLite docs reference: https://sqlite.org/lang_datefunc.html
The datetime() function returns the date and time formatted as YYYY-MM-DD HH:MM:SS or as YYYY-MM-DD HH:MM:SS.SSS if the [subsec
modifier](https://sqlite.org/lang_datefunc.html#subsec) is used.
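
The two output shapes can be sketched in a few lines of Python. The helper name `format_sqlite_datetime` is hypothetical, and the fractional part is truncated to milliseconds here for simplicity; SQLite's own rounding of the fraction may differ.

```python
from datetime import datetime


def format_sqlite_datetime(value: datetime, subsec: bool = False) -> str:
    """Format like datetime(): with subsec, append .SSS (milliseconds)."""
    base = value.strftime("%Y-%m-%d %H:%M:%S")
    if not subsec:
        return base
    # Assumption: truncate microseconds to milliseconds.
    millis = value.microsecond // 1000
    return f"{base}.{millis:03d}"


plain = format_sqlite_datetime(datetime(2024, 1, 1, 12, 0, 0))
with_subsec = format_sqlite_datetime(datetime(2024, 1, 1, 12, 0, 0), subsec=True)
```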
### Failed before changes:
``` sql
INSERT INTO test_results VALUES (
    'Fixed DateTime Expansion',
    datetime('2024-01-01 12:00:00', 'subsec'), 
    '2024-01-01 12:00:00.000',
    CASE 
        WHEN datetime('2024-01-01 12:00:00', 'subsec') = '2024-01-01 12:00:00.000' THEN 'PASS' 
        ELSE 'FAIL' 
    END,
    'Adds .000 to fixed time'
);
```
<img width="1164" height="94" alt="Screenshot 2025-12-04 130827" src="https://github.com/user-attachments/assets/d09dbe19-d329-4b88-a727-8f92e79e0a10" />
``` sql
INSERT INTO test_results VALUES (
    'Date Only Input',
    datetime('2024-01-01', 'subsec'), 
    '2024-01-01 00:00:00.000',
    CASE WHEN datetime('2024-01-01', 'subsec') = '2024-01-01 00:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Expands date to midnight.000'
);

INSERT INTO test_results VALUES (
    'ISO "T" Separator',
    datetime('2024-01-01T15:30:00', 'subsec'), 
    '2024-01-01 15:30:00.000',
    CASE WHEN datetime('2024-01-01T15:30:00', 'subsec') = '2024-01-01 15:30:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Replaces T with space'
);

INSERT INTO test_results VALUES (
    'Chain: Subsec then Add',
    datetime('2024-01-01 12:00:00', 'subsec', '+1 hour'), 
    '2024-01-01 13:00:00.000',
    CASE WHEN datetime('2024-01-01 12:00:00', 'subsec', '+1 hour') = '2024-01-01 13:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Flag persists after +1 hour'
);

INSERT INTO test_results VALUES (
    'Chain: Add then Subsec',
    datetime('2024-01-01 12:00:00', '+1 hour', 'subsec'), 
    '2024-01-01 13:00:00.000',
    CASE WHEN datetime('2024-01-01 12:00:00', '+1 hour', 'subsec') = '2024-01-01 13:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'Standard order works'
);

INSERT INTO test_results VALUES (
    'Case Insensitivity',
    datetime('2024-01-01 12:00:00', 'SuBsEc'), 
    '2024-01-01 12:00:00.000',
    CASE WHEN datetime('2024-01-01 12:00:00', 'SuBsEc') = '2024-01-01 12:00:00.000' THEN 'PASS' ELSE 'FAIL' END,
    'SuBsEc works'
);
```
<img width="1206" height="340" alt="Screenshot 2025-12-04 131114" src="https://github.com/user-attachments/assets/72fd7fe2-c991-4a92-b3d7-2f4516404fa7" />

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4097
2025-12-05 15:31:56 +02:00
Jussi Saurio
ca605b368b
Merge 'Fix IN operator translation logic' from Nikita Sivukhin
This PR replaces the incorrect use of `program.resolve_label(...)` with
the correct method `program.preassign_label_to_next_insn(...)` in the IN
operator translation code.
The following query was translated incorrectly before the fix:
```sql
turso> CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
```
Before the `emit_constant_insns` optimization, the query plan was correct:
```sql
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     12    0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       String8          0     4     0     alice          0   r[4]='alice'
6       Eq               3     4     9     Binary         0   if r[3]==r[4] goto 9
7       String8          0     5     0     bob            0   r[5]='bob'
8       Ne               3     5     11    Binary         0   if r[3]!=r[5] goto 11
9       Integer          1     6     0                    0   r[6]=1
10      AggStep          0     6     2     count          0   accum=r[2] step(r[6])
11    Next               0     4     0                    0
12    AggFinal           0     2     0     count          0   accum=r[2]
13    Copy               2     1     0                    0   r[1]=r[2]
14    ResultRow          1     1     0                    0   output=r[1]
15    Halt               0     0     0                    0
16    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
17    Goto               0     1     0                    0
```
The problem is that after `emit_constant_insns`, the `Eq` jump target was
rewritten because it was bound to the 9th opcode (`Integer 1`) instead of
the next opcode after the translated IN block. In the final plan, note the
jump to address 16 for the `Eq` instruction:
```sql
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     9     0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       Eq               3     4     16    Binary         0   if r[3]==r[4] goto 16
6       Ne               3     5     8     Binary         0   if r[3]!=r[5] goto 8
7       AggStep          0     6     2     count          0   accum=r[2] step(r[6])
8     Next               0     4     0                    0
9     AggFinal           0     2     0     count          0   accum=r[2]
10    Copy               2     1     0                    0   r[1]=r[2]
11    ResultRow          1     1     0                    0   output=r[1]
12    Halt               0     0     0                    0
13    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
14    String8            0     4     0     alice          0   r[4]='alice'
15    String8            0     5     0     bob            0   r[5]='bob'
16    Integer            1     6     0                    0   r[6]=1
17    Goto               0     1     0                    0
```
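
The failure mode can be reproduced in a toy model. Everything here is an assumption made for illustration: the tuple-based program, `hoist_constants`, and the two binding helpers are hypothetical and do not mirror Turso's program builder. The instruction list is a trimmed version of the loop body in the plans above.

```python
# Each instruction is (name, is_constant). Constants get hoisted to the end.
program = [
    ("Column", False),
    ("String8_alice", True),
    ("Eq", False),
    ("String8_bob", True),
    ("Ne", False),
    ("Integer", True),   # constant that directly follows the IN block
    ("AggStep", False),
    ("Next", False),
]


def hoist_constants(insns):
    """Move constant instructions to the end, keeping relative order."""
    kept = [i for i in insns if not i[1]]
    consts = [i for i in insns if i[1]]
    return kept + consts


def target_bound_to_insn(insns, name):
    """Jump target follows one specific instruction wherever it is moved."""
    hoisted = hoist_constants(insns)
    return next(idx for idx, i in enumerate(hoisted) if i[0] == name)


def target_bound_to_next_kept_insn(insns, offset):
    """Jump target is the first instruction at or after offset that stays in place."""
    hoisted = hoist_constants(insns)
    kept = next(i for i in insns[offset:] if not i[1])
    return hoisted.index(kept)


buggy_target = target_bound_to_insn(program, "Integer")
fixed_target = target_bound_to_next_kept_insn(program, 5)
```

In this model the buggy binding sends `Eq` into the hoisted constant section, while the other binding lands on `AggStep`, the first instruction after the IN block that is not relocated.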

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4105
2025-12-05 15:31:32 +02:00
Jussi Saurio
74296e52bb
Merge 'Automatically Propagate Encryption options' from Pedro Muniz
On database open, we store the encryption options and pass them onwards
to the Connection, Pager and Wal. We also get a slight gain in
ergonomics, as we don't have to set the pragmas for the `cipher` and
`hexkey` on each new `Connection`.
I needed this logic because I will need to initialize a default header
for empty DBs, and encryption opts not being automatically propagated was
hindering me.
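
A minimal sketch of the propagation idea, assuming hypothetical Python stand-ins for the types involved. The class names, fields, and the placeholder cipher string are illustrative only and are not Turso's API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EncryptionOpts:
    cipher: str   # placeholder value below, not a real cipher name
    hexkey: str


@dataclass
class Connection:
    encryption: Optional[EncryptionOpts]


@dataclass
class Database:
    # Stored once when the database is opened.
    encryption: Optional[EncryptionOpts] = None

    def connect(self) -> Connection:
        # Every new connection inherits the options; no per-connection pragmas.
        return Connection(encryption=self.encryption)


db = Database(encryption=EncryptionOpts(cipher="example-cipher", hexkey="00" * 32))
conn_a = db.connect()
conn_b = db.connect()
```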
**AI Disclosure**
Claude helped me debug and find issues in my implementation
cc @avinassh

Reviewed-by: Avinash Sajjanshetty (@avinassh)

Closes #4100
2025-12-05 15:31:17 +02:00
Nikita Sivukhin
d5f58de801 fix clippy 2025-12-05 15:29:17 +04:00
Jussi Saurio
77ad9d87c5 sim: stop ignoring sql execution errors 2025-12-05 13:14:39 +02:00
Nikita Sivukhin
17695438ac fix IN operator translation bug
- the following query was translated incorrectly before the fix:

turso> CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT);
turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');

- before the emit_constant_insns optimization, the query plan was correct

turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     16    0                    0   Start at 16
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     12    0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       String8          0     4     0     alice          0   r[4]='alice'
6       Eq               3     4     9     Binary         0   if r[3]==r[4] goto 9
7       String8          0     5     0     bob            0   r[5]='bob'
8       Ne               3     5     11    Binary         0   if r[3]!=r[5] goto 11
9       Integer          1     6     0                    0   r[6]=1
10      AggStep          0     6     2     count          0   accum=r[2] step(r[6])
11    Next               0     4     0                    0
12    AggFinal           0     2     0     count          0   accum=r[2]
13    Copy               2     1     0                    0   r[1]=r[2]
14    ResultRow          1     1     0                    0   output=r[1]
15    Halt               0     0     0                    0
16    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
17    Goto               0     1     0                    0

- but the problem is that after emit_constant_insns, the Eq jump target was rewritten because it was bound to the 9th opcode (Integer 1) instead of the next opcode after the translated IN block (in the final plan, note the jump to address 16 for the Eq instruction)

turso> EXPLAIN SELECT COUNT(*) FROM t WHERE name in ('alice', 'bob');
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     13    0                    0   Start at 13
1     Null               0     2     0                    0   r[2]=NULL
2     OpenRead           0     2     0                    0   table=t, root=2, iDb=0
3     Rewind             0     9     0                    0   Rewind table t
4       Column           0     1     3                    0   r[3]=t.name
5       Eq               3     4     16    Binary         0   if r[3]==r[4] goto 16
6       Ne               3     5     8     Binary         0   if r[3]!=r[5] goto 8
7       AggStep          0     6     2     count          0   accum=r[2] step(r[6])
8     Next               0     4     0                    0
9     AggFinal           0     2     0     count          0   accum=r[2]
10    Copy               2     1     0                    0   r[1]=r[2]
11    ResultRow          1     1     0                    0   output=r[1]
12    Halt               0     0     0                    0
13    Transaction        0     1     1                    0   iDb=0 tx_mode=Read
14    String8            0     4     0     alice          0   r[4]='alice'
15    String8            0     5     0     bob            0   r[5]='bob'
16    Integer            1     6     0                    0   r[6]=1
17    Goto               0     1     0                    0
2025-12-05 14:52:09 +04:00
Nikita Sivukhin
e839eb499b make fuzzer to generate SELECT COUNT(*) OR SELECT * statements
- this matters for the IN operator translation bug because, with COUNT(*), a constant assignment instruction comes right after the last instruction translated from the IN condition
2025-12-05 14:48:59 +04:00
Jussi Saurio
6da0895813
Merge 'fix/mvcc: seek() must seek from both mv store and btree' from Jussi Saurio
Closes #4093
for example, upon opening an existing database, all the rows are in the
btree, so if we seek only from MV store, we won't find anything. ergo:
we must look from both the mv store and the btree. if we are iterating
forwards, the smallest of the two results is where we land, and vice
versa for backwards iteration.
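
The landing rule can be written down as a tiny function. This is a sketch with a hypothetical name (`pick_seek_result`) and integer row keys; `None` stands for "this store found nothing".

```python
from typing import Optional


def pick_seek_result(mv_key: Optional[int], btree_key: Optional[int],
                     forwards: bool) -> Optional[int]:
    """Combine the seek results of the MV store and the btree."""
    candidates = [k for k in (mv_key, btree_key) if k is not None]
    if not candidates:
        return None  # neither store has a matching row
    # Forwards iteration lands on the smallest key, backwards on the largest.
    return min(candidates) if forwards else max(candidates)


# Freshly opened database: all rows live in the btree, MV store is empty.
landed = pick_seek_result(mv_key=None, btree_key=3, forwards=True)
```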
---
this PR was almost entirely written using Claude Code + Opus 4.5, and
here is Claude's gangsta rap about the changes:
> Can you write a gangsta rap that summarizes everything you changed in
this session? I.e. what was fixed and how
Here's the audio version generated using Suno:
https://suno.com/song/2209deaf-4c59-43c5-b0e9-d807b9805dc0
⏺ "Dual Seek Chronicles"
  (verse 1)
  Yo, the bug was real, seek was half-baked,
  Only checked MVCC, btree forsaked,
  User said "look, when both stores got data,
  Seek gotta check both—fix it now, not later."
  (pre-chorus)
  See the cursor got two sources of truth,
  Btree on disk, MVCC in the booth,
  After checkpoint, data splits in two,
  Seek was blind to half—that just won't do.
  (chorus)
  Now we seek in MVCC, seek in btree,
  Pick the winner based on which direction we be,
  Forwards? Smallest key wins the race,
  Backwards? Biggest key takes first place.
  Dual iteration, that's the game we play,
  seek_btree_and_set_peek saves the day!
  (verse 2)
  But wait—SeekResult got three states to check,
  Found means we good, NotFound means we wreck,
  TryAdvance though? Cursor's positioned but stuck,
  Gotta call next() or prev() for luck.
  Added helper methods, kept the code tight,
  advance_btree_forward_from_current done right,
  But then we noticed duplication in the mix,
  Refactored with a bool—that's the slickest fix.
  (bridge)
  _advance_btree_forward(initialize: bool),
  One function to rule them, that's the protocol,
  true means rewind if uninitialized state,
  false means we seeked—just iterate.
  (verse 3)
  Dropped six tests in test_transactions.rs,
  Checkpoint, insert, seek—put it to the test,
  Interleaved rows, odd in btree, even in MV,
  Updates and deletes, shadow what you see.
  Range operations: GT, LT, GE, LE,
  All forty-three tests pass, we're bug-free,
  From Uninitialized to Row or Exhausted state,
  Turso's dual cursor finally running straight.
  (outro)
  That's the story of the dual-seek fix,
  MVCC and btree no longer in conflicts,
  Ship it to main, let the users feast,
  Cursor seeking both stores—east to west, west to east.
  🎤
  EDIT:
  After merge of #4095, I refactored this PR to use state machines
instead of blocking IO as well. Initial implementation was all Claude
Code + Opus 4.5 there too, but the state machine code was FAR too
complex so I manually cleaned up and simplified it.

Closes #4094
2025-12-05 12:07:49 +02:00
Jussi Saurio
eb782ce2d4 fix/mvcc: seek() must seek from both mv store and btree
for example, upon opening an existing database, all the rows are in
the btree, so if we seek only from MV store, we won't find anything.
ergo: we must look from both the mv store and the btree. if we are
iterating forwards, the smallest of the two results is where we land,
and vice versa for backwards iteration.

initially this implementation used blocking IO but was refactored to
use state machines after the rest of the Cursor methods in the MVCC cursor
module were refactored to do that too.

---

this PR was initially almost entirely written using Claude Code + Opus 4.5,
but heavily manually cleaned up as the AI made the state machine refactor
far too complicated.
2025-12-05 11:53:16 +02:00
Jussi Saurio
8f57c60c26
Merge 'core/mvcc: state machines for prev, next, exists, rewind, last' from Pere Diaz Bou
Basically bring back state machines for: `next`, `prev`, `last`,
`rewind`, `get_next_rowid`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4095
2025-12-05 10:08:00 +02:00
pedrocarlo
ee73bab743 get correct reserved bytes if Cipher is not None 2025-12-05 02:04:06 -03:00
pedrocarlo
a311c966a2 set encryption context for page and wal in init_pager 2025-12-05 02:04:06 -03:00
pedrocarlo
889322f6b5 do not call pragmas related to encryption on connect or open 2025-12-05 02:04:06 -03:00
pedrocarlo
0118a65169 pass encryption opts from the database to the connection on connect 2025-12-05 02:04:06 -03:00
pedrocarlo
85b212056d separate init function for connect 2025-12-05 02:04:06 -03:00
pedrocarlo
1a43de35ce add encryption key and cipher to Database struct 2025-12-05 02:04:06 -03:00
pedrocarlo
faca85de2f pass pager to _connect and share initial conn for bootstrapping mvcc 2025-12-05 02:04:05 -03:00
Preston Thorpe
5b3262ba18
Merge 'add lib-release profile' from Nikita Sivukhin
This profile will provide another trade-off between performance and
final library size

Reviewed-by: Preston Thorpe <preston@turso.tech>

Closes #4096
2025-12-04 18:40:25 -05:00
Preston Thorpe
227c8ad5d0
Merge 'planner/vdbe: implement Hash Joins as an alternative to Ephemeral Indexes ' from Preston Thorpe
This PR adds a hash table data structure that partitions and spills to
disk when a memory limit is exceeded, which is used (a bit
conservatively) with the following opcodes for the VDBE:
- `HashBuild`: feed a build-side row into the hash table.
- `HashBuildFinalize`: finalize the build, potentially spilling partitions.
- `HashProbe`: hash and probe a key, yielding the build-side rowid(s).
- `HashNext`: iterate through additional matches in the same bucket.
- `HashClose`: free the hash table.
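
The build and probe phases behind these opcodes can be sketched as a plain in-memory hash join. This is an illustration under assumptions: the function name and row shape are hypothetical, partitioning and spilling to disk are left out, and NULL join keys are modeled as `None` and never match.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows):
    """Join two lists of (rowid, key) pairs on key; returns (build_rowid, probe_rowid)."""
    # Build phase (loosely HashBuild): bucket build-side rowids by join key.
    table = defaultdict(list)
    for rowid, key in build_rows:
        if key is not None:  # NULL never equals anything, so skip it
            table[key].append(rowid)

    # Probe phase (loosely HashProbe, then HashNext for further matches).
    matches = []
    for probe_rowid, key in probe_rows:
        if key is None:
            continue
        for build_rowid in table.get(key, []):
            matches.append((build_rowid, probe_rowid))
    return matches


users = [(1, "ann"), (2, "bob"), (3, None), (4, "ann")]     # (rowid, first_name)
products = [(10, "ann"), (11, None), (12, "cat")]           # (rowid, name)
joined = hash_join(users, products)
```

Each probe is an expected constant-time lookup, so the join touches every row once instead of rescanning the build table per probe row.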
There are pretty extensive doc comments for both the hash table
implementation and for the planner/access method selection.
The basic idea (for now) is that one ephemeral index is replaced with a
hash table and we get
So far it's showing significant perf increase:
### Query:
```sql
select users.id, products.id, o.id from users 
join products on products.name = users.first_name 
join order_items o on o.unit_price = products.price;
```
**Before**: `total: 20.918470388 s`
**After**: `total: 289 ms`
(this is with `main` building an ephemeral index for one but doing a
full scan for two outermost tables)
#### Bytecode
```sql
addr  opcode             p1    p2    p3    p4             p5  comment
----  -----------------  ----  ----  ----  -------------  --  -------
0     Init               0     47    0                    0   Start at 47
1     OpenRead           0     2     0                    0   table=users, root=2, iDb=0
2     OpenRead           1     3     0                    0   table=products, root=3, iDb=0
3     OpenRead           2     72    0                    0   table=order_items, root=72, iDb=0
4     Rewind             1     45    0                    0   Rewind table products
5       Once             14    0     0                    0   goto 14
6       OpenRead         4     2     0                    0   =users, root=2, iDb=0
7       Rewind           4     13    0                    0   Rewind  users
8         Column         4     1     4                    0   r[4]=users.first_name
9         RowId          4     5     0                    0   r[5]=users.rowid
10        Column         4     1     6                    0   r[6]=users.first_name
11        HashBuild      4     4     1     r=[1] budget=67108864 payload=r[5]..r[6]  0   
12      Next             4     8     0                    0   
13      HashBuildFinalize  1     0     0                    0   
14      Column           1     1     7                    0   r[7]=products.name
15      HashProbe        1     7     1     r[10]=44 payload=r[8]..r[9]  0   
16      Once             25    0     0                    0   goto 25
17      OpenAutoindex    3     0     0                    0   cursor=3
18      Rewind           2     19    0                    0   Rewind table order_items
19        Column         2     4     11                   0   r[11]=order_items.unit_price
20        RowId          2     12    0                    0   r[12]=order_items.rowid
21        RowId          2     13    0                    0   r[13]=order_items.rowid
22        MakeRecord     11    3     14                   0   r[14]=mkrec(r[11..13]); for ephemeral_order_items_t3
23        IdxInsert      3     14    11                   0   key=r[14]
24      Next             2     19    0                    0   
25      Column           1     2     15                   0   r[15]=products.price
26      RealAffinity     15    0     0                    0   
27      IsNull           15    42    0                    0   if (r[15]==NULL) goto 42
28      Affinity         15    1     0                    0   r[15..16] = C
29      SeekGE           3     42    15                   0   key=[15..15]
30        IdxGT          3     42    15                   0   key=[15..15]
31        DeferredSeek   3     2     0                    0   
32        Column         3     0     17                   0   r[17]=ephemeral_order_items_t3.unit_price
33        RealAffinity   17    0     0                    0   
34        Column         1     2     18                   0   r[18]=products.price
35        RealAffinity   18    0     0                    0   
36        Ne             17    18    41    Binary         0   if r[17]!=r[18] goto 41
37        Copy           8     1     0                    0   r[1]=r[8]
38        RowId          1     2     0                    0   r[2]=products.rowid
39        IdxRowId       3     3     0                    0   r[3]=cursor 3 for index ephemeral_order_items_t3.rowid
40        ResultRow      1     3     0                    0   output=r[1..3]
41      Next             3     30    0                    0   
42      HashNext         1     10    44     payload=r[8]..r[9]  0   
43      Goto             0     16    0                    0   
44    Next               1     14    0                    0   
45    HashClose          1     0     0                    0   
46    Halt               0     0     0                    0   
47    Transaction        0     1     28                   0   iDb=0 tx_mode=Read
48    Goto               0     1     0                    0   
```
Comparison with `sqlite3`:
**query**:
```sql
select users.id, products.id from users join products on products.name = users.first_name;
```
15k rows in each table, no index on either column
sqlite: `22ms`
tursodb: `7ms`
I ran the simulator for 5000 tests each for a run of 200 and all of them
passed, keeping in mind that the default limit at which a table spills to
disk in a debug build is only 32 KB, so spilling should be frequent.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #4063
2025-12-04 18:38:06 -05:00
PThorpe92
170ed26732
Fix NULL handling of hash join comparison and bug in cursor override logic 2025-12-04 17:21:46 -05:00
PThorpe92
16a1940d3a
Reduce iteration count in join fuzzer to 2000 2025-12-04 16:23:46 -05:00
PThorpe92
bf6038f0ba
Add join fuzzer for non-indexed columns 2025-12-04 16:19:19 -05:00
PThorpe92
c0d35dbaba
Add some additional join tests with nulls for new bloom filter behavior 2025-12-04 16:09:49 -05:00
PThorpe92
d0f15d1537
Fix bloom filter impl to skip NULL values 2025-12-04 16:09:49 -05:00
PThorpe92
4225bec5e9
Apply some review suggestions 2025-12-04 16:09:49 -05:00
PThorpe92
f3affbba2b
Fix condition evaluation that references hash build table now that it's no longer in join order 2025-12-04 16:09:49 -05:00
PThorpe92
e8ef5dbf06
Cache result column values in the hashtable to prevent additional SeekRowID 2025-12-04 16:09:48 -05:00
PThorpe92
ad7d34bb67
Make hash joins follow pattern of ephemeral indexes instead of hoisting special logic 2025-12-04 16:09:48 -05:00
PThorpe92
eebb4950c6
Apply some review suggestions/cleanups 2025-12-04 16:09:48 -05:00
PThorpe92
866b153a7a
Fix clippy warnings 2025-12-04 16:09:47 -05:00
PThorpe92
746fe82159
Add/update some comments in the query planner 2025-12-04 16:09:47 -05:00
PThorpe92
06cbe8c063
Add more py tests for hash joins 2025-12-04 16:09:47 -05:00
PThorpe92
bf61feb6bc
Fix hash join planning/emission for > 2 way joins 2025-12-04 16:09:47 -05:00
PThorpe92
fd93544af3
Remove unneeded label resolution in close_loop method for hash join build tbl 2025-12-04 16:09:46 -05:00
PThorpe92
9b274d9243
Prevent using hash join for any query with an index on a join key 2025-12-04 16:09:46 -05:00
PThorpe92
bc5406be04
Add memory py tests for hash spilling, fix/finish spilling impl in hashtable 2025-12-04 16:09:46 -05:00
PThorpe92
bea9648a98
Fix hash join spilling for async IO by keeping buffer alive in callback 2025-12-04 16:09:45 -05:00