## Description
This PR greatly simplifies the slot bitmap used to track free pages in
the arena buffer pool.
## Motivation and context
The bitmap originally included an optimization that would allow allocating
multiple contiguous buffers, with the objective of coalescing them into
single buffers when submitting `pwritev` calls, for things like WAL
appends. This optimization was never implemented, and we were left with a
very complex bitmap full of unused and unnecessary logic.
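For context, a single-slot free bitmap of the kind described above can be sketched roughly as follows. This is a minimal illustration only; the type and method names are hypothetical, not the actual `slot_bitmap.rs` API:

```rust
/// One bit per arena slot: 0 = free, 1 = allocated.
pub struct SlotBitmap {
    words: Vec<u64>,
    len: usize, // total number of slots
}

impl SlotBitmap {
    pub fn new(len: usize) -> Self {
        Self { words: vec![0; (len + 63) / 64], len }
    }

    /// Find the first free slot, mark it used, and return its index.
    pub fn alloc(&mut self) -> Option<usize> {
        for (w, word) in self.words.iter_mut().enumerate() {
            if *word != u64::MAX {
                // trailing_ones() gives the index of the first zero bit.
                let bit = word.trailing_ones() as usize;
                let idx = w * 64 + bit;
                if idx >= self.len {
                    return None; // only padding bits left in the last word
                }
                *word |= 1u64 << bit;
                return Some(idx);
            }
        }
        None
    }

    /// Mark a previously allocated slot free again.
    pub fn free(&mut self, idx: usize) {
        assert!(idx < self.len);
        self.words[idx / 64] &= !(1u64 << (idx % 64));
    }
}
```

Dropping the contiguous-range search is what removes the two-pointer hint machinery: a first-fit scan over the words is all that single-buffer allocation needs.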
## Description of AI Usage
This was mostly codex 5.2 with the prompt:
```
This project is a SQLite rewrite in Rust, it uses a BufferPool that allocates large arenas
and tracks which slots are free using a bitmap. this bitmap is core/storage/slot_bitmap.rs..
it was originally designed in a way that would allow to request multiple buffers that
were contiguous in memory, so that they could be coalesced into a single `pwrite` operation later down the line.
However this optimization was never implemented and the bitmap has a complex 'two-pointer' algorithm that we no
longer need. please rewrite this slot_bitmap.rs to simplify and only allocate single buffers at a time, removing the need
for the two pointer hint system.
```
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#4277
## Description
Remove sync IO hacks for `op_journal_mode`
Closes#4268
<!--
Please include a summary of the changes and the related issue.
-->
## Motivation and context
Remove sync io hacks so it is friendlier for WASM
<!--
Please include relevant motivation and context.
Link relevant issues here.
-->
## Description of AI Usage
AI did the bulk of the refactoring; I made some adjustments and trimmed
down the implementation.
**Prompt**:
```
if look at @core/storage/journal_mode.rs and `op_journal_mode` in `execute.rs` you will see that we have some blocking io operations with
`pager.io.block` and `program.connection.checkpoint` that also blocks. I want you refactor the code to use state machines similar in nature to how we do it
in many functions in `execute.rs`
```
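The blocking calls were replaced with a resumable state machine, in the style used throughout `execute.rs`. Here is a minimal sketch of the pattern; the state and type names are hypothetical stand-ins, not the actual implementation:

```rust
/// Stages of the journal-mode operation, advanced one step at a time.
enum JournalModeState {
    Start,
    Checkpointing,
    WritingHeader,
    Done,
}

#[derive(Debug, PartialEq)]
enum StepResult {
    IO,   // caller must drive pending IO, then call step() again
    Done, // operation finished
}

struct OpJournalMode {
    state: JournalModeState,
}

impl OpJournalMode {
    fn step(&mut self) -> StepResult {
        match self.state {
            JournalModeState::Start => {
                // Kick off the checkpoint instead of blocking on it.
                self.state = JournalModeState::Checkpointing;
                StepResult::IO
            }
            JournalModeState::Checkpointing => {
                // Checkpoint completed; now write the new header.
                self.state = JournalModeState::WritingHeader;
                StepResult::IO
            }
            JournalModeState::WritingHeader => {
                self.state = JournalModeState::Done;
                StepResult::Done
            }
            JournalModeState::Done => StepResult::Done,
        }
    }
}
```

Each `step` advances one stage and hands control back to the caller, so an event loop (or a WASM host) can drive the IO without ever blocking.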
<!--
Please disclose how AI was used to help create this PR. For example, you
can share prompts,
specific tools, or ways of working that you took advantage of. You can
also share whether the
creation of the PR was mainly driven by AI, or whether it was used for
assistance.
This is a good way of sharing knowledge to other contributors about how
we can work more efficiently with
AI tools. Note that the use of AI is encouraged, but the committer is
still fully responsible for understanding
and reviewing the output.
-->
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4279
## Description
We should correctly check for completion success
## Motivation and context
## Description of AI Usage
AI made the tests for me.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4263
## Background
In the simulator, we do our best to replicate the effects of an
interactive transaction into the simulator's shadow state whenever that
transaction commits.
## Problem
This didn't work:
```sql
BEGIN;
ALTER TABLE t ADD COLUMN foo;
DELETE FROM t WHERE bar != 5;
COMMIT;
```
None of the rows where bar != 5 were deleted because apply_snapshot()
was checking that the rows in the committed table were exactly equal to
the rows that were recorded, but since the recorded deletes contained a
NULL `foo` column, they never matched. This meant that the sim thought
it should still have all the rows that were deleted.
## Fix
Like all the other operations, record add column / drop column too, so
that they are applied in sequential order in apply_snapshot().
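The ordering matters because the recorded delete rows already contain the NULL cell added by the schema change. A toy illustration of sequential replay follows; the types here are invented for the sketch, not the simulator's actual shadow-state code:

```rust
// A row is a list of nullable integer cells.
type Row = Vec<Option<i64>>;

// Recorded operations, replayed in the order they were recorded.
enum Op {
    AddColumn,   // widen every existing row with a NULL cell
    Delete(Row), // remove rows exactly equal to this recorded row
}

fn apply_snapshot(table: &mut Vec<Row>, ops: &[Op]) {
    for op in ops {
        match op {
            Op::AddColumn => {
                for row in table.iter_mut() {
                    row.push(None);
                }
            }
            // Because AddColumn was replayed first, the recorded delete
            // (which includes the NULL `foo` cell) now matches the row.
            Op::Delete(recorded) => table.retain(|row| row != recorded),
        }
    }
}
```

Checking committed rows against recorded rows without replaying the add-column first is exactly the mismatch that caused the bug.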
No explicit test for this - I ran into this in another branch of mine
whose seed doesn't reproduce on main (because I changed the simulator in
that branch).
Reviewed-by: Pedro Muniz (@pedrocarlo)
Closes#4264
## Description
closes https://github.com/tursodatabase/turso/issues/4142
## Motivation and context
Compatibility: we were incorrectly rewriting table-qualified columns. Also
added trigger.test to all.test and updated a test to expect correct values.
## AI Disclosure
None
<!--
Please disclose if any LLM's were used in the creation of this PR and to
what extent,
to help maintainers properly review.
-->
Closes#4206
## Description
See the PR title. `exec_rows` also validates outputs automatically, which
is good practice for testing.
## Motivation and context
Better typing, and we don't have to constantly match on `turso_core::Value`.
## AI Disclosure
Ai did most of the migration
Closes#4192
## Description
Connection has grown to the point where a separate module makes sense to
ease future refactoring.
Replaced manually getting the inner connection, with all its locking/error
handling, with a utility method. This will be handy for the ConnectionPool
work.
## Motivation and context
Foundation work for the ConnectionPool and removing an unwrap
## AI Disclosure
None
Reviewed-by: Mikaël Francoeur (@LeMikaelF)
Closes#4187
## Description
Closes#4146
## Motivation and context
Panics are bad.
## AI Disclosure
None used.
Reviewed-by: Pedro Muniz (@pedrocarlo)
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes#4177
## Description
I'm working on a branch that improves our `io_uring` backend and fixes a
lot of the existing issues we have. I wanted to be able to test it
thoroughly, so I had Opus 4.5 add `io_uring` support to the simulator
while I work on the upcoming branch/PR.
## Description of AI Usage
This was mostly generated by Opus 4.5/claude code 🤖
Reviewed-by: Pedro Muniz (@pedrocarlo)
Closes#4274
Closes#3536
# Description
This PR implements **dynamic journal mode switching** via `PRAGMA
journal_mode`, allowing users to switch between WAL and MVCC modes at
runtime.
### Key Changes
**Core Feature: Journal Mode Switching**
- Added new `JournalMode` module (`core/storage/journal_mode.rs`) to
parse and handle journal mode transitions
- Modified `op_journal_mode` to correctly parse journal modes and update
the database header
- Emit checkpoint when setting a new journal mode to ensure data
consistency
- Added MVCC checkpoint support to `Connection::checkpoint`
**Database Initialization Improvements**
- Read DB header on `Database::open` and simplified `init_pager`
- Made `Version` an enum for better comparison semantics
- Automatically convert legacy SQLite databases to WAL mode
- Ensure DB header is flushed to disk when header changes during open
- Clear page cache after header validation
**Bug Fixes**
- Fixed dirty page invalidation in pager when clearing dirty pages in
page cache
- Fixed `is_none_or` check for row version existence in DB file (handles
MvStore initialization and empty database cases)
- Added `btree_resident` field in `RowVersion` to track if
insert/deletion originated from a btree
**Testing**
- Added fuzz tests for `journal_mode` transitions with Database
operations in between
- Added integration tests for switching between the different modes while
checking that the header version is correct
- Added some specific regression tests for delete operations lost on
mode switch
- Fixed `index_scan_compound_key_fuzz` to use separate databases for
Turso and SQLite in MVCC mode. Also had to decrease number of rows for
MVCC test, as insert was very slow.
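In rough terms, the new module turns the pragma argument into a typed mode before deciding on a transition. A minimal sketch follows; the enum variants and accepted strings are assumptions for illustration, not the exact `journal_mode.rs` contents:

```rust
/// Journal modes the pragma can switch between (illustrative subset).
#[derive(Debug, PartialEq, Clone, Copy)]
enum JournalMode {
    Wal,
    Mvcc,
}

/// Parse the `PRAGMA journal_mode = <arg>` argument, case-insensitively.
fn parse_journal_mode(s: &str) -> Option<JournalMode> {
    match s.to_ascii_lowercase().as_str() {
        "wal" => Some(JournalMode::Wal),
        "mvcc" => Some(JournalMode::Mvcc),
        _ => None, // unknown modes are rejected rather than guessed
    }
}
```

Having a typed mode makes the transition logic (checkpoint, header update) a match on `(old, new)` pairs instead of string comparisons scattered through `op_journal_mode`.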
# TODOs
- Remove sync hacks from `op_journal_mode`
- Expand fuzzer with different queries
- Add to Simulator
- Special handling for read-only databases, disallowing any header
changes
# Motivation and context
Make it easier for users to test MVCC and transition back and forth between it and WAL.
# AI Disclosure
Used AI to catch and fix bugs in MVCC, further my understanding of MVCC,
write tests in the `tests` folder, draft most of the PR summary, and write
the docs in the `docs/manual.md` file.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes#4074
## Description
This PR (finally) implements spilling to disk (WAL for the common case,
or to temp files for ephemeral tables/indexes)
## Motivation and context
closes https://github.com/tursodatabase/turso/issues/1661
(can run the above referenced query with cache_size=200, producing the
same results as SQLite)
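The core idea is that when the page cache hits its limit, a dirty page is written out and evicted instead of the operation failing for lack of memory. Here is a toy sketch of that eviction path; all names are invented, and real spilling goes to the WAL, or to a temp file for ephemeral tables/indexes:

```rust
/// A bounded page cache that spills dirty pages on eviction (toy model).
struct PageCache {
    capacity: usize,
    pages: Vec<(u64, bool)>, // (page_id, dirty)
    spilled: Vec<u64>,       // page ids written out during eviction
}

impl PageCache {
    fn insert(&mut self, page_id: u64, dirty: bool) {
        if self.pages.len() == self.capacity {
            // Evict the oldest page. If it is dirty, it must be written
            // out (WAL, or temp file for ephemeral structures) first,
            // rather than failing the insert with an out-of-cache error.
            let (victim, was_dirty) = self.pages.remove(0);
            if was_dirty {
                self.spilled.push(victim);
            }
        }
        self.pages.push((page_id, dirty));
    }
}
```

This is what lets a `cache_size=200` run complete the same workloads as an unbounded cache, just with more IO.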
## How this was tested
I left the simulator running overnight with `cache_size=200` (the new
minimum page cache size) for the following:
```console
for i in (seq 100); ./scripts/run-sim --maximum-tests 5000 --min-tick 10 --max-tick 50 --memory-io --differential --profile write_heavy || break; end;
```
And again with normal profile:
```console
for i in (seq 100); ./scripts/run-sim --maximum-tests 5000 --min-tick 10 --max-tick 50 --memory-io --differential || break; end
```
And no issues were found.
## New simulator profile
This PR also adds `write_heavy_spill` profile to the simulator, and adds
an additional run in CI for it. All it does is set the `cache_size=200`
before kicking off a `write_heavy` profile run.
## AI Disclosure
Some general guidance was provided, and it did help locate a bug, but
pretty much all the code here is hand-written.
Closes#4211
This is already done in abort() but if the checkpoint error is
returned directly from commit_tx() instead of in an IO completion,
then the Tx state will be cleared and the DDL change won't be applied.
So, do it explicitly in step_end_write_txn() too.
Fixes this failing seed:
cargo run -- --memory-io --seed 12902559987470385066 --disable-heuristic-shrinking
Closes#4265
## Description
We had a turso_stress failure happening on Antithesis. See the
associated issue for how to reproduce it. It happens only on io_uring.
This PR fixes the failure by ensuring writes to the WAL are completed
before the frame cache is updated. Without this, other threads can
retrieve a frame from the cache before the frame has been persisted.
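The invariant the fix enforces is that a frame becomes visible in the frame cache only from inside the write's completion, never merely after submission returns. A simplified single-threaded sketch of that ordering follows; all types and functions here are stand-ins, not the real io_uring plumbing:

```rust
use std::collections::HashMap;

/// A toy frame cache keyed by frame id.
struct FrameCache {
    frames: HashMap<u64, Vec<u8>>,
}

impl FrameCache {
    fn new() -> Self {
        Self { frames: HashMap::new() }
    }
    fn insert(&mut self, frame_id: u64, data: Vec<u8>) {
        self.frames.insert(frame_id, data);
    }
    fn get(&self, frame_id: u64) -> Option<&Vec<u8>> {
        self.frames.get(&frame_id)
    }
}

/// Simulated async write: `complete` runs only once the bytes are durable.
/// (An io_uring backend would invoke it from its completion queue.)
fn submit_write(data: Vec<u8>, complete: impl FnOnce(Vec<u8>)) {
    complete(data);
}

fn append_frame(cache: &mut FrameCache, frame_id: u64, data: Vec<u8>) {
    submit_write(data, |written| {
        // The cache update happens inside the completion, so readers can
        // never observe a frame that has not been persisted yet.
        cache.insert(frame_id, written);
    });
}
```

Updating the cache before the completion fires is exactly the window the turso_stress failure exploited on io_uring.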
## Motivation and context
Closes https://github.com/tursodatabase/turso/issues/4249
## Description of AI Usage
I used the reproducer in
https://github.com/tursodatabase/turso/issues/4249, and told Claude to
fix it. I then reviewed its work to make sure I understood it.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes#4252
the DB file, as if `self.checkpointed_txid_max_old == None` it could mean
the MvStore was recently initialized or we are dealing with an empty
database. In both cases, we cannot assert that the row version exists in
the DB file.