## Description
BTreeCursor sets its null flag to false once `seek` is called. This PR
does the same for the MVCC cursor.
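As a sketch of the invariant (the field names here are hypothetical, not the actual cursor layout):

```rust
/// Hypothetical cursor: `null_flag` marks "cursor points at nothing".
/// BTreeCursor clears it on `seek`; the fix makes the MVCC cursor
/// behave the same way.
struct Cursor {
    null_flag: bool,
    position: Option<i64>,
}

impl Cursor {
    fn new() -> Self {
        Self { null_flag: true, position: None }
    }

    fn seek(&mut self, key: i64) {
        // Any seek clears the null flag.
        self.null_flag = false;
        self.position = Some(key);
    }
}
```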
## Motivation and context
Some cases in join.test failed due to this bug.
## Description of AI Usage
I asked AI to find the issue, but I ended up showing the agent why it
did things wrong and that it should be ashamed.
Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>
Closes #4296
`resume` was not re-entrant and panicked when it was called after
operation completion.
```
thread '<unnamed>' panicked at sync/sdk-kit/src/turso_async_operation.rs:59:20:
`async fn` resumed after completion
...
```
This PR makes the `resume` method of `PyTursoAsyncOperation` re-entrant
(like in the c-api) and also converts the exception type in the sync
module.
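A minimal Rust sketch of the guard (illustrative names; the real `PyTursoAsyncOperation` differs): once the wrapped future completes, cache the result and return it instead of polling the finished future again.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Hypothetical wrapper that polls an inner future but stays safe to
/// resume after the future has already completed.
struct AsyncOperation<T> {
    future: Option<Pin<Box<dyn Future<Output = T>>>>,
    result: Option<T>,
}

impl<T: Clone> AsyncOperation<T> {
    fn new(fut: impl Future<Output = T> + 'static) -> Self {
        Self { future: Some(Box::pin(fut)), result: None }
    }

    /// Re-entrant resume: later calls return the cached result instead
    /// of polling the finished future, which would panic with
    /// "`async fn` resumed after completion".
    fn resume(&mut self, cx: &mut Context<'_>) -> Option<T> {
        if let Some(result) = &self.result {
            return Some(result.clone());
        }
        match self.future.as_mut()?.as_mut().poll(cx) {
            Poll::Ready(value) => {
                self.future = None; // drop the completed future
                self.result = Some(value.clone());
                Some(value)
            }
            Poll::Pending => None,
        }
    }
}

/// Minimal no-op waker so the sketch is self-contained.
fn noop_waker() -> Waker {
    fn clone(p: *const ()) -> RawWaker {
        RawWaker::new(p, &VTABLE)
    }
    fn noop(_: *const ()) {}
    static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, noop, noop, noop);
    unsafe { Waker::from_raw(RawWaker::new(std::ptr::null(), &VTABLE)) }
}
```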
Closes #4315
## Description
This PR adds missing affinity conversion to hash joins by applying
affinity conversion to build and probe keys before hashing.
```
turso> CREATE TABLE x(a INTEGER);
turso> CREATE TABLE y(b TEXT);
turso> INSERT INTO x VALUES (2),(3);
turso> INSERT INTO y VALUES ('02'),('2'),('2.0'),('3x'),('3.5');
turso> SELECT a, b
FROM x JOIN y ON a = b
ORDER BY a, b;
┌───┬─────┐
│ a │ b │
├───┼─────┤
│ 2 │ 02 │
├───┼─────┤
│ 2 │ 2 │
├───┼─────┤
│ 2 │ 2.0 │
└───┴─────┘
```
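The idea can be sketched as follows (illustrative types, not Turso's internals; non-integral reals are omitted for brevity): the same affinity conversion runs on both build and probe keys before hashing, so `2` and `'02'` land in the same bucket.

```rust
use std::collections::HashMap;

/// Simplified value model (illustrative, not Turso's internal types).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Value {
    Integer(i64),
    Text(String),
}

/// NUMERIC affinity sketch: text that parses as an integral number is
/// converted before hashing; the real rules are richer.
fn apply_numeric_affinity(v: Value) -> Value {
    if let Value::Text(s) = &v {
        let t = s.trim();
        if let Ok(n) = t.parse::<i64>() {
            return Value::Integer(n);
        }
        if let Ok(f) = t.parse::<f64>() {
            if f.is_finite() && f.fract() == 0.0 {
                return Value::Integer(f as i64);
            }
        }
    }
    v
}

/// Hash join applying the same affinity conversion to build and probe
/// keys, so Integer(2) and Text("02") collide as SQLite expects.
fn hash_join(build: &[Value], probe: &[Value]) -> Vec<(Value, Value)> {
    let mut table: HashMap<Value, Vec<Value>> = HashMap::new();
    for b in build {
        table
            .entry(apply_numeric_affinity(b.clone()))
            .or_default()
            .push(b.clone());
    }
    let mut out = Vec::new();
    for p in probe {
        if let Some(matches) = table.get(&apply_numeric_affinity(p.clone())) {
            for b in matches {
                out.push((b.clone(), p.clone()));
            }
        }
    }
    out
}
```

Running this on the example above matches the three expected rows for `a = 2`, while `'3x'` and `'3.5'` stay unmatched.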
## Motivation and context
Fixes #3482.
Currently, Turso returns an empty result set:
```
turso> CREATE TABLE x(a INTEGER);
turso> CREATE TABLE y(b TEXT);
turso> INSERT INTO x VALUES (2),(3);
turso> INSERT INTO y VALUES ('02'),('2'),('2.0'),('3x'),('3.5');
turso> SELECT a, b
FROM x JOIN y ON a = b
ORDER BY a, b;
turso>
```
Expected behavior:
```
sqlite> CREATE TABLE x(a INTEGER);
sqlite> CREATE TABLE y(b TEXT);
sqlite> INSERT INTO x VALUES (2),(3);
sqlite> INSERT INTO y VALUES ('02'),('2'),('2.0'),('3x'),('3.5');
sqlite> SELECT a, b
...> FROM x JOIN y ON a = b
...> ORDER BY a, b;
2|02
2|2
2|2.0
```
## Description of AI Usage
This PR was developed with assistance from Claude Sonnet 4.5 through
code completions.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #4317
Set the testing DBs to WAL mode permanently so we don't keep making
changes to them every time we open them with `make test`.
Reviewed-by: Pedro Muniz (@pedrocarlo)
Closes #4321
## Transaction fixes
- return BUSY_SNAPSHOT instead of BUSY when a transaction tries to
promote from read to write and its snapshot is stale. this is
a case where retrying with busy_timeout will never help.
- fix a bug in begin_read_tx() where it would not allow using a
readmark that is lower than shared_max (causing an enormous amount
of busy errors / contention)
- fix another bug in begin_read_tx() where it would not decrement the
shared lock if it needed to abort due to stale header values
- validate shared state again after deciding to use the read0 lock -
otherwise we might miss frames (another TOCTOU issue)
- implement SQLite's quadratic backoff algorithm for begin_read_tx()
to improve multiple threads' ability to acquire the write lock
## Fix deadlock
use with_shared() instead of with_shared_mut() for lock ops.
Using shared_mut() is unnecessary because the locks use atomics, and
in fact using shared_mut() can cause a deadlock. Example:
- Thread 1 calls with_shared_mut() in try_begin_read_tx
to claim/update a read mark slot. This waits for readers to release
their shared read locks.
- Thread 2 tries to use with_shared_mut() in end_read_tx() and also
can't proceed.
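A sketch of why a shared borrow suffices (illustrative types, assuming the read-mark slots are atomics):

```rust
use std::array;
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::RwLock;

/// Illustrative shared WAL state: read-mark slots are atomics, so they
/// can be updated under a *shared* borrow of the outer lock.
struct WalShared {
    read_marks: [AtomicU32; 5],
}

struct Wal {
    shared: RwLock<WalShared>,
}

impl Wal {
    fn new() -> Self {
        Self {
            shared: RwLock::new(WalShared {
                read_marks: array::from_fn(|_| AtomicU32::new(0)),
            }),
        }
    }

    /// with_shared() analogue: a read lock suffices because the slot
    /// update is an atomic compare-exchange. Taking the write lock here
    /// (the with_shared_mut() analogue) would wait for all readers to
    /// drain, which is exactly the deadlock described above.
    fn claim_read_mark(&self, slot: usize, frame: u32) -> bool {
        let shared = self.shared.read().unwrap();
        shared.read_marks[slot]
            .compare_exchange(0, frame, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }
}
```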
## Perf/throughput test adjustment
If `BusySnapshot` is returned, explicitly ROLLBACK and restart the
transaction
## Review guide
I've interactively rebased this into 10 commits that are fairly
sensible, so feel free to read them one by one.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #4289
Extract try_begin_read_tx with TryBeginReadResult::Retry for transient
conditions. begin_read_tx now retries with SQLite's quadratic backoff:
immediate retries for the first 5 attempts, a yield for attempts 6-9,
then quadratic microsecond delays matching SQLite's formula.
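The retry strategy can be sketched like this (the closure and return value are illustrative, and the constant 39 is taken from SQLite's walTryBeginRead formula; the real code differs):

```rust
use std::thread;
use std::time::Duration;

/// Outcome of a single attempt; Retry marks a transient condition.
enum TryBeginReadResult {
    Ok,
    Retry,
}

/// Retry loop following the description above: attempts 1-5 retry
/// immediately, 6-9 yield the thread, and from attempt 10 onwards
/// sleep a quadratically growing number of microseconds
/// ((attempts - 9)^2 * 39). Returns the attempt count for visibility.
fn begin_read_tx(mut try_begin: impl FnMut() -> TryBeginReadResult) -> u64 {
    let mut attempts: u64 = 0;
    loop {
        attempts += 1;
        if let TryBeginReadResult::Ok = try_begin() {
            return attempts;
        }
        if attempts <= 5 {
            // busy-spin: retry immediately
        } else if attempts <= 9 {
            thread::yield_now();
        } else {
            let n = attempts - 9;
            thread::sleep(Duration::from_micros(n * n * 39));
        }
    }
}
```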
BusySnapshot indicates the transaction's snapshot is permanently stale
and must be rolled back. Unlike Busy, retrying with busy_timeout will
never help - the caller must roll back and restart the transaction.
have to leave for train soon, so claude summary:
### Bugfix 1: track CREATE TABLE, CREATE INDEX, DROP INDEX as sequential
operations
Closes #4303
Changes in runner/env.rs:
1. Renamed RowOperation to TxOperation
2. Fixed the CreateTable variant to store the full Table instead of
separate fields
3. Added CreateIndex and DropIndex variants
4. Added corresponding record_* methods to both TransactionTables and
ShadowTablesMut
5. Updated apply_snapshot() to handle CreateTable, CreateIndex, and
DropIndex operations in order
6. Removed the index sync loop at the end of apply_snapshot() since
operations are now tracked properly
Changes in model/mod.rs:
1. Updated Shadow for Create to call record_create_table() before
applying the change
2. Updated Shadow for CreateIndex to call record_create_index() before
applying the change
3. Updated Shadow for DropIndex to call record_drop_index() before
applying the change
The core fix is that CreateTable and CreateIndex operations are now
recorded as they happen, so when apply_snapshot() replays operations in
order, subsequent operations like AlterTable find the table already
created in committed_tables.
### Bugfix 2: track table name changes in txn so ADD COLUMN && DROP
COLUMN work properly
When processing AddColumn and DropColumn operations, the code looks up
tables in transaction_tables.current_tables using the table name stored
in the operation. However, transaction_tables.current_tables has tables
with their final names (after all renames), while operations store the
name at the time they were recorded.
In the failing case:
1. DropColumn was recorded with table_name =
"stellar_katerina_672"
2. RenameTable changed the name to "captivating_eco_1159"
3. At commit, when processing DropColumn, we looked for
"stellar_katerina_672" but the table was now named
"captivating_eco_1159"
The fix: Build a mapping from old names → final names by scanning
all RenameTable operations first, then use the final name when looking
up in transaction_tables.current_tables.
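The mapping step can be sketched as follows (simplified operation list, not the real TxOperation enum):

```rust
use std::collections::HashMap;

/// Simplified transaction operations, for illustration only.
enum TxOperation {
    RenameTable { from: String, to: String },
    DropColumn { table_name: String },
}

/// Scan all RenameTable operations first and build a map from every
/// historical table name to its final name, following rename chains.
fn final_names(ops: &[TxOperation]) -> HashMap<String, String> {
    let mut map: HashMap<String, String> = HashMap::new();
    for op in ops {
        if let TxOperation::RenameTable { from, to } = op {
            // Extend any chain that currently ends at `from`.
            for v in map.values_mut() {
                if v == from {
                    *v = to.clone();
                }
            }
            map.entry(from.clone()).or_insert_with(|| to.clone());
        }
    }
    map
}

/// Look up the final name for a table, falling back to the recorded
/// name when it was never renamed.
fn resolve<'a>(map: &'a HashMap<String, String>, name: &'a str) -> &'a str {
    map.get(name).map(String::as_str).unwrap_or(name)
}
```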
Closes #4305
Reviewed-by: Pedro Muniz (@pedrocarlo)
Closes #4304
This PR fixes incorrect conversion from TEXT to INTEGER when the text is
a number followed by a trailing non-breaking space.
This happens because `str::trim()` trims non-breaking spaces and other
Unicode whitespace, while SQLite only trims ASCII whitespace.
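A minimal sketch of the fix (note that SQLite's whitespace set also includes vertical tab, which `is_ascii_whitespace` omits; the conversion shown is simplified to integers only):

```rust
/// Trim only ASCII whitespace, the way SQLite does. `str::trim()`
/// would also strip Unicode whitespace such as U+00A0 (non-breaking
/// space), which made text like "123\u{00A0}" wrongly convert.
fn trim_ascii_ws(s: &str) -> &str {
    s.trim_matches(|c: char| c.is_ascii_whitespace())
}

/// TEXT -> INTEGER conversion sketch: only ASCII-trimmed digits count.
fn text_to_integer(s: &str) -> Option<i64> {
    trim_ascii_ws(s).parse::<i64>().ok()
}
```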
Closes: https://github.com/tursodatabase/turso/issues/3679
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #3878
This PR introduces a few sync fixes relevant to the partial sync feature:
1. Make `try_wal_watermark_read_page` async - this is important as the
db file can now be partial and require extra network IO for fetching
missing pages
2. Do not panic if a requested page doesn't exist on the server - this
can be a valid case if the db on the server has a smaller size
3. Maintain the clean db file size in order to avoid reads outside of
the file (this can happen when we revert new pages allocated only in the
WAL and need to fetch their previous state, if there is one)
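Points 2 and 3 can be sketched together as a page-lookup decision (illustrative, expressed in pages rather than bytes):

```rust
#[derive(Debug, PartialEq)]
enum PageSource {
    Local,   // within the clean local file: read from disk
    Remote,  // past the partial local file but present on the server
    Missing, // server db is smaller: the page simply does not exist
}

/// Decide where a page must come from, given the clean local file size
/// and the server database size (both in pages, 1-based page numbers).
/// Reading past the clean size must not touch the local file, and a
/// page past the server size is not a panic, just absent.
fn page_source(clean_local_pages: u64, server_pages: u64, page_no: u64) -> PageSource {
    if page_no <= clean_local_pages {
        PageSource::Local
    } else if page_no <= server_pages {
        PageSource::Remote
    } else {
        PageSource::Missing
    }
}
```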
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #4297
## Description
This PR prevents `ALTER COLUMN` from resulting in tables with only
generated columns. Currently, Turso applies this rule only in `CREATE
TABLE`, not in `ALTER COLUMN`.
Expected behavior:
```
turso> CREATE TABLE t1(a as (123));
× Parse error: must have at least one non-generated column
turso> CREATE TABLE t2(a);
turso> ALTER TABLE t2 ALTER COLUMN a TO b AS (123);
× Parse error: must have at least one non-generated column
turso> CREATE TABLE t3(a, b);
turso> ALTER TABLE t3 ALTER COLUMN a TO c AS (123);
turso> ALTER TABLE t3 ALTER COLUMN b TO d AS (123);
× Parse error: must have at least one non-generated column
```
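The check itself can be sketched as (illustrative column model; the same validation serves both `CREATE TABLE` and `ALTER COLUMN`):

```rust
/// Illustrative column model.
struct Column {
    name: String,
    generated: bool,
}

/// Reject any resulting table definition whose columns are all
/// generated. Note that `all` on an empty slice is true, so a table
/// with no columns is rejected as well.
fn validate_columns(columns: &[Column]) -> Result<(), String> {
    if columns.iter().all(|c| c.generated) {
        Err("must have at least one non-generated column".to_string())
    } else {
        Ok(())
    }
}
```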
## Motivation and context
Fixes #3653.
Currently, Turso does not restrict tables from having only generated
columns when using `ALTER COLUMN`:
```
turso> create table t(a);
turso> alter table t alter column a to b as (123);
turso> select * from sqlite_schema;
┌───────┬──────┬──────────┬──────────┬─────────────────────────────┐
│ type │ name │ tbl_name │ rootpage │ sql │
├───────┼──────┼──────────┼──────────┼─────────────────────────────┤
│ table │ t │ t │ 2 │ CREATE TABLE t (b AS (123)) │
└───────┴──────┴──────────┴──────────┴─────────────────────────────┘
```
## Description of AI Usage
This PR was developed with assistance from Claude Sonnet 4.5. The AI
helped identify the root cause and assisted in debugging issues in my
initial implementation.
Reviewed-by: Preston Thorpe <preston@turso.tech>
Closes #4257
## Description
- Move `Statement` to a different file
- remove `run_once` from `Statement`
- add helper functions to demonstrate explicit blocking behaviour
- we were not checking for naming conflicts correctly in create view and
create table, so Claude and I fixed that as well
- made some adjustments to `print_query_result` in the cli
`run_once` does not even have the most up-to-date error handling that we
have in `Program::abort`, so we really should not be using it.
I guess these recent changes uncovered some subtle bugs that had masked
the error. It also did not help that most tcl tests expect any error
rather than a specific error pattern, so maybe we were erroring before
for some weird reason in the cli?
Closes #2388
## Motivation and context
We should avoid having an API that actually drives the IO in Statement,
but as we have many Sync IO hacks, this is currently not possible. Due
to our embedded nature, IO should be driven by the user of the library,
so we should be pushing IO stepping away from Core and towards the
bindings, in my opinion. I think we can accomplish this by providing
blocking/busy-looping and async Statements in Core. We already almost
have this differentiation with `step_with_waker`. Such an API would
also make it easier for us to use async in core in the future.
This refactor of hiding the IO stepping behind helpers should hopefully
mitigate future refactors with async, and fix the many places that do
not run statements to completion.
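One shape such a blocking helper could take (a toy sketch, not the actual Turso `Statement` API):

```rust
/// Illustrative step results; the real API surfaces errors too.
enum StepResult {
    Row(i64),
    Io, // caller must drive the IO loop once, then step again
    Done,
}

/// Toy statement that needs one round of IO before each row.
struct Statement {
    rows: Vec<i64>,
    io_pending: bool,
}

impl Statement {
    fn step(&mut self) -> StepResult {
        if self.io_pending {
            StepResult::Io
        } else if let Some(row) = self.rows.pop() {
            self.io_pending = true; // next row needs IO again
            StepResult::Row(row)
        } else {
            StepResult::Done
        }
    }

    fn run_io(&mut self) {
        self.io_pending = false;
    }
}

/// Blocking helper: hides the IO stepping from the caller by busy
/// looping until the statement yields a row or completes, so callers
/// never see StepResult::Io and statements always run to completion.
fn step_blocking(stmt: &mut Statement) -> Option<i64> {
    loop {
        match stmt.step() {
            StepResult::Row(v) => return Some(v),
            StepResult::Done => return None,
            StepResult::Io => stmt.run_io(),
        }
    }
}
```

An async variant would expose the same loop with `StepResult::Io` awaited instead of busy-looped, which is the differentiation `step_with_waker` already hints at.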
## Description of AI Usage
None. Didn't want to risk AI doing something wrong, so I did the
refactor by hand.
**EDIT:** asked it to help me slightly with `print_query_result`
Reviewed-by: Pere Diaz Bou <pere-altea@homail.com>
Closes #4290