limbo

mirror of https://github.com/tursodatabase/limbo.git synced 2025-12-23 08:21:09 +00:00

Author	SHA1	Message	Date
Jussi Saurio	c282c24d94	Merge 'clean up core tester to use `conn.execute` and `conn.exec_rows` for parsing correctly the expected values from select queries' from Pedro Muniz ## Description The PR title. `exec_rows` also does validation of outputs automatically which is good practice for testing ## Motivation and context Better typing and don't have to constantly match on `turso_core::Value` ## AI Disclosure Ai did most of the migration Closes #4192	2025-12-18 09:22:45 +02:00
Jussi Saurio	00d266665b	Merge 'fix coroutine panic: replace ended_coroutine Bitfield with vec' from Jussi Saurio ## Description Closes #4146 ## Motivation and context panics are bad ## AI Disclosure none used Reviewed-by: Pedro Muniz (@pedrocarlo) Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Closes #4177	2025-12-18 09:20:05 +02:00
pedrocarlo	3df4f46d80	minimal repro regression test for starting in MVCC and later switching to WAL and back to MVCC and then updating and deleting a row that only existed in the BTree	2025-12-17 10:55:25 -03:00
pedrocarlo	8f1dcbf625	add regression test for a delete being lost on switch to wal mode	2025-12-17 10:55:25 -03:00
pedrocarlo	bd4f4d9aa5	add fuzz tests for `journal_mode`	2025-12-17 10:55:25 -03:00
pedrocarlo	d73a283136	fix index_scan_compound_key_fuzz, as it always opens the same database in both limbo and sqlite. With the version changes we cannot do that; we need separate databases for each	2025-12-17 10:55:25 -03:00
pedrocarlo	046e6a884d	use exec rows for header version test	2025-12-17 10:55:25 -03:00
pedrocarlo	33afc3015c	adjust test + remove mv store after transitioning to wal mode	2025-12-17 10:55:25 -03:00
pedrocarlo	e54d3328c0	after checkpoint get header from pager to properly persist change in header	2025-12-17 10:55:25 -03:00
pedrocarlo	257dc5ad09	do not initiate a write transaction for journal mode + checkpoint before changing mode	2025-12-17 10:55:24 -03:00
pedrocarlo	277f9928b7	test changing from WAL to MVCC	2025-12-17 10:55:24 -03:00
pedrocarlo	b948065f22	skip rusqlite integrity check if db is mvcc	2025-12-17 10:55:24 -03:00
pedrocarlo	0cbe904cef	ensure DB header is flushed to DB file if header changes during DB open	2025-12-17 10:55:24 -03:00
Pere Diaz Bou	77841042d0	Merge 'Consider Order by expressions collation when deciding candidate index for iteration' from Pedro Muniz ## Description Does solve #4154, but I don't want to close it with this PR, because it does not solve the Affinity issue. We can only use an index to iterate over if the column collation in the order by clause matches the index collation ## Motivation and context Fix a bug in the optimizer ## Description of AI Usage Used AI to write tests, fuzzers, and help me understand the optimizer code. Test prompt: <details> can you write tests in tcl that test that the correct collation sequence is properly maintained. ``` CREATE TABLE "t1" ("c1" TEXT COLLATE RTRIM); INSERT INTO "t1" VALUES (' '); CREATE INDEX "i1" ON "t1" ("c1" COLLATE RTRIM DESC); INSERT INTO "t1" VALUES (1025.1655084065987); SELECT "c1", typeof(c1) FROM "t1" ORDER BY "c1" COLLATE BINARY DESC, rowid ASC; ``` this is an example of a query that returned incorrect results because of this </details> Reviewed-by: Pere Diaz Bou <pere-altea@homail.com> Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4248	2025-12-17 14:26:25 +01:00
pedrocarlo	398c82fdf1	clippy	2025-12-16 23:11:31 -03:00
pedrocarlo	0be4a885f1	adjust fuzz tests to account for collation and sort order for asserting correctness	2025-12-16 23:09:00 -03:00
pedrocarlo	4c157e8c7a	add AI fuzz tests	2025-12-16 15:27:07 -03:00
Pekka Enberg	98308415b4	core: Don't rollback transaction when schema updated When the SchemaUpdated error occurs during statement execution, don't roll back the transaction, but instead re-prepare the statement. Spotted by Whopper.	2025-12-15 13:49:21 +02:00
PThorpe92	12601af1e1	increase latency check for flaky test in test_read_path.rs	2025-12-12 13:49:56 -05:00
pedrocarlo	60ab032e3a	clean up core tester to use `conn.execute` instead of `limbo_exec_rows` and use `conn.exec_rows` for parsing correctly the expected values from select queries	2025-12-12 12:36:48 -03:00
Jussi Saurio	216f4d71ee	cargo fmt	2025-12-11 23:37:19 +02:00
Jussi Saurio	6280ab4a7b	Regression test for 4146	2025-12-11 23:37:19 +02:00
Jussi Saurio	faa1197e58	Add greedy join ordering for large queries (>12 tables) Problem: The existing DP-based join optimizer has O(2^n) complexity, which causes large joins to basically not get past the planning phase. Fix: Add a greedy algorithm that runs in O(n²) time for >12 tables. Details: - Add compute_greedy_join_order() with hub score heuristic for selecting the starting table. Tables referenced by many other tables' constraints are preferred, enabling index lookups on subsequent joins. This is especially good for star schema queries. - Add GREEDY_JOIN_THRESHOLD constant (12) for switchover point - Add fuzz tests covering star schemas, chains, cliques up to 62 tables, and LEFT JOIN ordering invariants (RHS of a left join cannot be reordered). - Not all the tests necessarily assert that a query results in a good plan (apart from star schemas), but all tests do assert that we are _able_ to construct a plan (unlike before, where even 32-way joins would grind to a halt). AI usage: - Pretty much all of this was a conversation between me and Opus 4.5. I asked it to search the internet for practical solutions to the problem and it suggested a simple greedy search as a low-complexity solution and I thought it was a good idea for now.	2025-12-11 09:31:38 +02:00
pedrocarlo	2a449f8f6b	revert change in index_scan_compound_key_fuzz	2025-12-10 23:38:41 -03:00
pedrocarlo	bc16588273	index and rowid fuzz should open a separate sqlite database for comparison	2025-12-10 15:38:02 -03:00
pedrocarlo	ffbbd4c270	add exec rows trait for more ergonomic testing in `core_tester`	2025-12-10 15:21:03 -03:00
pedrocarlo	c207eddd3f	remove unused TempDatabase argument requirement for `limbo_exec_rows`	2025-12-10 15:21:03 -03:00
Jussi Saurio	64dba96c60	Merge 'initialize global header on bootstrap' from Pedro Muniz On bootstrap just store the header but not flush it to disk. Only try to flush it when we start an MVCC transaction. Also applied fix in `OpenDup` where we should not wrap an ephemeral table with an MvCursor Reviewed-by: Mikaël Francoeur (@LeMikaelF) Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4151	2025-12-10 19:04:23 +02:00
Jussi Saurio	0d35366f5d	Merge 'Fix CTE scope propagation for compound SELECTs' from Martin Mauch CTEs now work correctly when combined with UNION, UNION ALL, INTERSECT, and EXCEPT. Before: ```sql WITH t AS (SELECT 1 as x) SELECT * FROM t UNION ALL SELECT 2 as x -- Error: Parse error: no such table: t ``` After: ```sql WITH t AS (SELECT 1 as x) SELECT * FROM t UNION ALL SELECT 2 as x -- Works correctly, returns rows (1) and (2) ``` Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4123	2025-12-10 19:04:03 +02:00
pedrocarlo	d09b48c0e6	test page1 init	2025-12-10 12:53:25 -03:00
Nikita Sivukhin	8cc40949a5	fix clippy	2025-12-10 15:08:44 +04:00
Nikita Sivukhin	e70428e976	add explicit insert action to the fuzz test and disable it for now	2025-12-10 14:53:32 +04:00
Nikita Sivukhin	70b1e5716d	add fuzz test which maintains sqlite3 and turso dbs and periodically switches between them in order to validate compatibility	2025-12-10 14:53:32 +04:00
Nikita Sivukhin	9acf541e28	add compatibility test for multiple-columns unique constraint	2025-12-10 01:46:25 +04:00
Martin Mauch	3dc8fed204	Fix CTE scope propagation for compound SELECTs	2025-12-09 13:09:55 +01:00
Jussi Saurio	2aefb4ee8c	Merge 'fix/btree: disable move_to_rightmost optimization with triggers' from Jussi Saurio ## Closes - Closes #4017 - Addresses #4043; this now fails with `Page cache is full` with 100k pages, which is a separate non-corruption issue. Modifying max page cache size to be 10 million pages makes it not finish at all. We should modify the issue after this is merged to reflect what the new problem is. The queries in the issue (#4043) create a WAL that is at least 1.7 GB in size ## Background We have an optimization in the btree where if: - We want to reach the rightmost leaf page, and - We know the rightmost page and are already on it Then we can skip a seek. ## Problem The problem is this optimization should NEVER be used in cases where we cannot be sure that the btree wasn't modified from under us e.g. by a trigger subprogram. ## Fix Hence, disable it when we are executing a parent program that has triggers which will fire. ## AI Disclosure No AI was used for this PR. Reviewed-by: Preston Thorpe <preston@turso.tech> Closes #4135	2025-12-09 10:02:11 +02:00
Jussi Saurio	2e8b771f6f	Merge 'Fix descending index scan returning rows when seek key is NULL' from Jussi Saurio Closes #4066 Closes #4129 ## Problem Take e.g. CREATE TABLE t(x); CREATE INDEX txdesc ON t(x desc); INSERT INTO t values (1),(2),(3); SELECT * FROM t WHERE x > NULL; -- Our plan, like SQLite, was to start iterating the descending index from the beginning (Rewind) and stop once we hit a row where x is <= NULL using the `IdxGe` instruction (GE in descending indexes means LE). However, `IdxGe` and other similar instructions use a sort comparison where NULL is less than numbers/strings etc, so this would incorrectly not jump. ## Fix We need to emit an explicit NULL check after rewinding. ## Tests Added TCL tests + improved `index_scan_compound_key_fuzz` to have NULL seek keys sometimes. ## AI disclosure I started debugging this with Claude Code thinking this is a much deeper corruption issue, but Opus 4.5 noticed immediately that we are returning rows from a `x > NULL` comparison which should never happen. Hence, the fix was then fairly simple. Closes #4132	2025-12-09 09:38:18 +02:00
Jussi Saurio	201a7e6387	Regression test for 4017	2025-12-09 09:19:37 +02:00
Nikita Sivukhin	997a07cac9	add test with concurrent commit/rollback and insert stmt	2025-12-08 16:34:07 +04:00
Jussi Saurio	027ebe33fe	Fix descending index scan returning rows when seek key is NULL Take e.g. CREATE TABLE t(x); CREATE INDEX txdesc ON t(x desc); INSERT INTO t values (1),(2),(3); SELECT * FROM t WHERE x > NULL; -- Our plan, like SQLite, was to start iterating the descending index from the beginning (Rewind) and stop once we hit a row where x is <= NULL using the `IdxGe` instruction (GE in descending indexes means LE). However, `IdxGe` and other similar instructions use a sort comparison where NULL is less than numbers/strings etc, so this would incorrectly not jump. Fix: we need to emit an explicit NULL check after rewinding.	2025-12-08 13:19:58 +02:00
Jussi Saurio	826ca4d44d	chore: remove experimental_indexes feature flags	2025-12-08 13:00:37 +02:00
Preston Thorpe	c09c30746e	Merge 'guard subjournal access within single connection' from Nikita Sivukhin Right now turso can panic with various asserts if 2 or more write statements will be executed over single connection concurrently: ``` thread 'query_processing::test_write_path::api_misuse' panicked at core/storage/pager.rs:776:9: subjournal offset should be 0 ``` This PR adds explicit guard for subjournal access which will return `Busy` for the operation internally and lead to wait condition for the statement until subjournal ownership will be released and can be re- acquired again. Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4110	2025-12-05 13:14:07 -05:00
Preston Thorpe	e7c7f232b4	Merge 'testing/fuzz: Add new fuzzer for joins' from Preston Thorpe needed for #4063 to merge, currently passing on main but just want to lower the already huge diff Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com> Closes #4103	2025-12-05 13:13:44 -05:00
Nikita Sivukhin	659ef7c079	fix clippy	2025-12-05 21:39:35 +04:00
Nikita Sivukhin	487854e6d6	guard subjournal access in order to prevent concurrent operations over it within same connection	2025-12-05 21:25:13 +04:00
PThorpe92	c9a6827011	Extract out join fuzzer to an additional test on indexed columns	2025-12-05 12:03:23 -05:00
Jussi Saurio	a90087bcf6	Enable compound_select_fuzz for mvcc because it works as a regression test for #4108	2025-12-05 17:19:05 +02:00
Nikita Sivukhin	d5f58de801	fix clippy	2025-12-05 15:29:17 +04:00
Nikita Sivukhin	e839eb499b	make fuzzer to generate SELECT COUNT() OR SELECT statements - this is important for IN operation translation bug because in case of COUNT(*) there is constant assignment instruction right after last instruction translated from IN condition	2025-12-05 14:48:59 +04:00
Jussi Saurio	eb782ce2d4	fix/mvcc: seek() must seek from both mv store and btree for example, upon opening an existing database, all the rows are in the btree, so if we seek only from MV store, we won't find anything. ergo: we must look from both the mv store and the btree. if we are iterating forwards, the smallest of the two results is where we land, and vice versa for backwards iteration. initially this implementation used blocking IO but was refactored to use state machines after the rest of the Cursor methods in the MVCC cursor module were refactored to do that too. --- this PR was initially almost entirely written using Claude Code + Opus 4.5, but heavily manually cleaned up as the AI made the state machine refactor far too complicated.	2025-12-05 11:53:16 +02:00
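
Note on eb782ce2d4: the "seek from both the mv store and the btree" rule described in that commit can be sketched roughly as below. This is a toy model, not the limbo implementation: both stores are assumed to be plain sorted lists of rowids, the names `seek_sorted` and `seek_both` are hypothetical, and row visibility, tombstones and the state-machine IO mentioned in the commit are ignored.

```python
from bisect import bisect_left, bisect_right
from typing import Optional, Sequence


def seek_sorted(rowids: Sequence[int], key: int, forward: bool) -> Optional[int]:
    """Seek in one sorted store (hypothetical helper).

    Forward: first rowid >= key. Backward: last rowid <= key.
    """
    if forward:
        i = bisect_left(rowids, key)
        return rowids[i] if i < len(rowids) else None
    i = bisect_right(rowids, key)
    return rowids[i - 1] if i > 0 else None


def seek_both(mv_store: Sequence[int], btree: Sequence[int],
              key: int, forward: bool) -> Optional[int]:
    """Seek in both stores and land on the nearer of the two results."""
    candidates = [
        r
        for r in (seek_sorted(mv_store, key, forward),
                  seek_sorted(btree, key, forward))
        if r is not None
    ]
    if not candidates:
        return None
    # Forward iteration lands on the smallest candidate, backward on the largest.
    return min(candidates) if forward else max(candidates)


# A freshly opened database: every row is still in the btree, none in the MV store.
landing = seek_both(mv_store=[], btree=[1, 5, 9], key=4, forward=True)
```

With an empty MV store the seek still finds rowid 5 in the btree, which is the case the commit message calls out for a newly opened existing database.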

527 commits
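
Note on 027ebe33fe / 2e8b771f6f: the expected result for the query quoted in those commits can be checked against stock SQLite. The snippet below is only an illustration using Python's standard `sqlite3` module, not limbo or its test harness. It reuses the schema from the commit message and adds one NULL row (an addition for illustration) to contrast the two notions of comparison involved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t(x);
    CREATE INDEX txdesc ON t(x DESC);
    INSERT INTO t VALUES (1),(2),(3);
    INSERT INTO t VALUES (NULL);  -- extra row, not in the commit message
    """
)

# SQL comparison: `x > NULL` evaluates to NULL for every row, so no row qualifies.
rows_gt_null = conn.execute("SELECT * FROM t WHERE x > NULL").fetchall()

# Sort comparison: NULL orders before numbers, so it comes first in ascending order.
rows_sorted = conn.execute("SELECT x FROM t ORDER BY x ASC").fetchall()

print(rows_gt_null)
print(rows_sorted)
```

The first query returns no rows, while the second lists the NULL row ahead of 1, 2, 3. That gap between comparison semantics and sort order is why an index-order check alone cannot terminate the scan correctly.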