Commit graph

308 commits

Author SHA1 Message Date
Pekka Enberg
90c1e3fc06 Switch Connection to use Arc instead of Rc
Connection needs to be Arc so that bindings can wrap it with `Mutex` for
multi-threading.
2025-06-16 10:43:19 +03:00
PThorpe92
e134bd19da Remove close from Drop impl 2025-06-13 11:11:30 +03:00
PThorpe92
9f966910bc Add manual wal sync before checkpoint in connection Drop 2025-06-13 11:11:30 +03:00
Anton Harniakou
74d4726b0c Use expect to get a better error message if accessing unavailable column 2025-06-09 10:40:04 +03:00
meteorgan
a242bac340 Fix: ensure PRAGMA cache_size changes persist only for current session 2025-06-05 16:55:41 +08:00
Pekka Enberg
c6ef19396d Merge 'Add support for pragma table-valued functions' from Piotr Rżysko
This PR adds support for table-valued functions for PRAGMAs (see the
[PRAGMA functions section](https://www.sqlite.org/pragma.html)).
Additionally, it introduces built-in table-valued functions. I
considered using extensions for this, but there are several reasons in
favor of a dedicated mechanism:
* It simplifies the use of internal functions, structs, etc. For
example, when implementing `json_each` and `json_tree`, direct access to
internals was necessary:
https://github.com/tursodatabase/limbo/pull/1088
* It avoids FFI overhead. [Benchmarks](https://github.com/piotrrzysko/li
mbo/blob/pragma_vtabs_bench/core/benches/pragma_benchmarks.rs) on my
hardware show that `pragma_table_info()` implemented as an extension is
2.5× slower than the built-in version.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1642
2025-06-04 09:08:10 +03:00
Jussi Saurio
ea301de726 Merge 'Pass input string to translate function' from Pedro Muniz
In preparation for `CREATE VIEW`, we need to have the original sql query
that was used to create the view. I'm using the scanner's offset to
slice into the original input, trimming the newlines, and passing it to
the translate function.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1621
2025-06-02 17:43:11 +03:00
Piotr Rzysko
d1d8ead475 Add support for pragma table-valued functions 2025-06-01 10:25:42 +02:00
Piotr Rzysko
149375b2b4 Extract VirtualTable to a separate module 2025-06-01 07:45:57 +02:00
pedrocarlo
bc563266b3 add instrumentation to more functions for debugging + adjust how cursors are opened 2025-05-30 20:35:50 -03:00
pedrocarlo
b73200de86 pass input string to translate function 2025-05-30 11:20:36 -03:00
Pere Diaz Bou
da4190a23e Convert u64 rowid to i64
Rowids can be negative, therefore let's swap to i64
2025-05-30 13:07:31 +02:00
Jussi Saurio
cc405dea7e Use new TableReferences struct everywhere 2025-05-29 11:44:56 +03:00
Pere Diaz Bou
28bd24b7d4 clear page cache on transaction failure
This is the first step towards rollback, since we still don't spill
pages with WAL, we can simply invalidate page cache in case of failure.
2025-05-28 15:54:28 +02:00
Pekka Enberg
8d7f20b7d2 Merge 'Add libsql_wal_get_frame() API' from Pekka Enberg
This pull request implements the `libsql_wal_get_frame()` API. To do
that, we also introduce a `wait_for_completion()` API in I/O dispatcher.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1533
2025-05-27 18:17:32 +03:00
Pekka Enberg
3250560eb8 sqlite3: Add libsql_wal_get_frame() API 2025-05-27 13:47:40 +03:00
Pekka Enberg
eca9a5b703 core/io: Switch to Arc<Completion> 2025-05-27 11:28:49 +03:00
Jussi Saurio
b72b99c973 Merge 'feature: INSERT INTO <table> SELECT' from Pedro Muniz
Closes #1528 .
- Modified `translate_select` so that the caller can define if the
statement is top-level statement or a subquery.
- Refactored `translate_insert` to offload the translation of multi-row
VALUES and SELECT statements to `translate_select`
- I did not try to change much of `populate_column_registers` as I did
not want to break `translate_virtual_table_insert`. Ideally, I would
want to unite this remaining logic folding `populate_column_registers`
into `populate_columns_multiple_rows` and the
`translate_virtual_table_insert` into `translate_insert`. But, I think
this may be best suited for a separate PR.
## TODO
- ~Tests~ - *Done*
- ~Need to emit a temp table when we are selecting and inserting into
the Same Table -
https://github.com/sqlite/sqlite/blob/master/src/insert.c#L1369~ -
*Done*
- Optimization when table have the exact same schema - open an Issue
about it
- Virtual Tables do not benefit yet from this feature - open an Issue
about it

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1566
2025-05-27 10:50:26 +03:00
Jussi Saurio
3ba9f2ab97 Small cleanups to pager/wal/vdbe - mostly naming
- Instead of using a confusing CheckpointStatus for many different things,
  introduce the following statuses:
    * PagerCacheflushStatus - cacheflush can result in either:
      - the WAL being written to disk and fsynced
      - but also a checkpoint to the main BD file, and fsyncing the main DB file

      Reflect this in the type.
    * WalFsyncStatus - previously CheckpointStatus was also used for this, even
      though fsyncing the WAL doesn't checkpoint.
    * CheckpointStatus/CheckpointResult is now used only for actual checkpointing.

- Rename HaltState to CommitState (program.halt_state -> program.commit_state)
- Make WAL a non-optional property in Pager
  * This gets rid of a lot of if let Some(...) boilerplate
  * For ephemeral indexes, provide a DummyWAL implementation that does nothing.
- Rename program.halt() to program.commit_txn()
- Add some documentation comments to structs and functions
2025-05-26 10:37:34 +03:00
pedrocarlo
bb7da39c72 remove assumption that translate_select is always called from a top-level context + adjust insert to use translate_select when needed 2025-05-25 19:12:30 -03:00
Jussi Saurio
7c07c09300 Add stable internal_id property to TableReference
Currently our "table id"/"table no"/"table idx" references always
use the direct index of the `TableReference` in the plan, e.g. in
`SelectPlan::table_references`. For example:

```rust
Expr::Column { table: 0, column: 3, .. }
```

refers to the 0'th table in the `table_references` list.

This is a fragile approach because it assumes the table_references
list is stable for the lifetime of the query processing. This has so
far been the case, but there exist certain query transformations,
e.g. subquery unnesting, that may fold new table references from
a subquery (which has its own table ref list) into the table reference
list of the parent.

If such a transformation is made, then potentially all of the Expr::Column
references to tables will become invalid. Consider this example:

```sql
-- Assume tables: users(id, age), orders(user_id, amount)

-- Get total amount spent per user on orders over $100
SELECT u.id, sub.total
FROM users u JOIN
     (SELECT user_id, SUM(amount) as total
      FROM orders o
      WHERE o.amount > 100
      GROUP BY o.user_id) sub
WHERE u.id = sub.user_id

-- Before subquery unnesting:
-- Main query table_references: [users, sub]
-- u.id refers to table 0, column 0
-- sub.total refers to table 1, column 1
--
-- Subquery table_references: [orders]
-- o.user_id refers to table 0, column 0
-- o.amount refers to table 0, column 1
--
-- After unnesting and folding subquery tables into main query,
-- the query might look like this:

SELECT u.id, SUM(o.amount) as total
FROM users u JOIN orders o ON u.id = o.user_id
WHERE o.amount > 100
GROUP BY u.id;

-- Main query table_references: [users, orders]
-- u.id refers to table index 0 (correct)
-- o.amount refers to table index 0 (incorrect, should be 1)
-- o.user_id refers to table index 0 (incorrect, should be 1)
```

We could ofc traverse every expression in the subquery and rewrite
the table indexes to be correct, but if we instead use stable identifiers
for each table reference, then all the column references will continue
to be correct.

Hence, this PR introduces a `TableInternalId` used in `TableReference`
as well as `Expr::Column` and `Expr::Rowid` so that this kind of query
transformations can happen with less pain.
2025-05-25 20:26:17 +03:00
Jussi Saurio
f388bc571e Merge 'xConnect for virtual tables to query core db connection' from Preston Thorpe
Re-Opening #1076 because it had bit-rotted to a point of no return.
However it has improved. Now with Weak references and no incrementing Rc
strong counts.
This also includes a better test extension that returns info about the
other tables in the schema.
![image](https://github.com/user-
attachments/assets/4292dc9c-121e-4ba2-8a51-4533bbcf2afd)
(theme doesn't show rows column)

Closes #1366
2025-05-25 14:37:38 +03:00
PThorpe92
cf163f2dc0
Prevent double free in ext connection 2025-05-24 16:49:52 -04:00
PThorpe92
d63f9d8cff
Make sure all resources are cleaned up properly in xconnect 2025-05-24 16:38:33 -04:00
PThorpe92
d11ef6b9c5
Add execute method to xConnect db interface for vtables 2025-05-24 14:49:58 -04:00
PThorpe92
c2ec6caae1
Finish integrating xConnect into vtable open api 2025-05-24 14:49:58 -04:00
Pere Diaz Bou
54b1647148 set non-shared cache by default
Shared cache requires more locking mechasnisms. We still have multi
threading issues not related to shared cache so it is wise to first fix
those and then once they are fixed, we can incrementally add shared
cache back with locking in place.
2025-05-24 11:59:54 +02:00
Jussi Saurio
8bec75d804 Merge 'Initial Support for Nested Translation' from Pedro Muniz
This PR introduces some modifications to the Program Builder to allow us
to use nested parsing. By focusing the emission of Init and the last
Goto (prologue and epilogue), inside the ProgramBuilder, we can just not
emit them if we are parsing/translating in a nested context. For this
PR, I only migrated insert to use these functions as I need them to
support Insert statements that use `SELECT FROM` syntax. Nested parsing
overall enables code reuse for us and arguably is one of the only ways
to parse deeply nested queries without a lot of code duplication.
#1528

Closes #1543
2025-05-22 10:52:00 +03:00
Jussi Saurio
c7f984c5c8 Merge 'Page cache fixes' from Pere Diaz Bou
This PR builds on top of
https://github.com/tursodatabase/limbo/pull/1368 and adds few things
like allowing inserting pages with the same page key, fix fuzz tests by
adding transactions and some minor improvements to cacheflush.

Closes #1523
2025-05-22 10:12:56 +03:00
Jussi Saurio
fc150b12c9 Merge 'CSV virtual table extension' from Piotr Rżysko
This PR adds a port of [SQLite's CSV virtual table
extension](https://www.sqlite.org/csv.html).
Planned follow-ups:
* Pass detailed error messages from `VTabModule::create`, not just
`ResultCode`s.
* Address the TODO in `VTabModuleImpl::create_schema`.

Reviewed-by: Jussi Saurio <jussi.saurio@gmail.com>

Closes #1544
2025-05-22 09:48:53 +03:00
pedrocarlo
d21229d4a3 create inner translate function to enable calling it from a nested context 2025-05-21 14:08:02 -03:00
pedrocarlo
f5d6d11d16 extract prologue and epilogue to program builder 2025-05-21 12:47:51 -03:00
pedrocarlo
517c7c81cd refactor to include optional program builder argument 2025-05-21 12:47:51 -03:00
Pere Diaz Bou
591c674e86 Introduce PageRef wrapper BTreePage.
One problem we have with PageRef, is that this Page reference can be
unloaded, this means if we read the page again instead of loading the
page onto the same reference, we will have split brain of references.

To solve this we wrap PageRef in `BTreePage` so that if a page is seen
as unloaded, we will replace BTreePage::page with the newest version of
the page.
2025-05-21 14:19:41 +02:00
Pere Diaz Bou
35f7317724 add default page cache 2025-05-21 14:11:21 +02:00
Pekka Enberg
580d55f255 Merge 'bindings/rust: Add pragma methods' from Diego Reis
I tried to be the most similar to rusqlite as possible. The only thing
that's bothering me is `Vec<Vec<Value>>` which I think can be improved
but not so sure how, any inputs on this are welcomed.

Closes #1536
2025-05-21 12:38:34 +03:00
Piotr Rzysko
9c1dca72db Introduce VTable
This allows storing table arguments parsed in the VTabModule::create
method.
2025-05-21 08:33:17 +02:00
Diego Reis
44541cb0d5 wip: Add more pragma methods 2025-05-20 09:50:05 -03:00
Diego Reis
4766c9c286 bind/rust: Fix lifetime issue with pragma_query
Shallow cloning in Row ended up invalidating the pointer
to value
2025-05-19 21:29:07 -03:00
Diego Reis
ed0e3b1ba2 bind/rust: Implement pragma_query 2025-05-19 14:04:59 -03:00
Jussi Saurio
416de9dd9c Extract page cache size constant and bump to 2k 2025-05-16 15:40:19 +03:00
Jussi Saurio
b16044f34b pager: bump default page cache size from 10 to 1000 pages
```
Gnuplot not found, using plotters backend
Execute `SELECT count() FROM users`/limbo_execute_select_count
                        time:   [12.867 µs 12.958 µs 13.104 µs]
                        change: [-91.233% -91.178% -91.120%] (p = 0.00 < 0.05)
                        Performance has improved.
```
2025-05-16 09:23:42 +03:00
Pekka Enberg
524a523036 sqlite3: Add libsql_wal_frame_count() API 2025-05-15 11:43:44 +03:00
Pekka Enberg
e3f71259d8 Rename OwnedValue -> Value
We have not had enough merge conflicts for a while so let's do a
tree-wide rename.
2025-05-15 09:59:46 +03:00
Diego Reis
07bfeadd56 core: Simplify error handling of malformed strings for prepared statements 2025-05-12 13:25:11 -03:00
Diego Reis
f7ab8b11d6 cargo fmt 2025-05-12 10:56:53 -03:00
Diego Reis
c4e7be04f8 core: Handles prepared statement with empty SQL 2025-05-12 10:38:58 -03:00
Piotr Rzysko
977b6b331a Fix memory leak caused by unclosed virtual table cursors
The following code reproduces the leak, with memory usage increasing
over time:

```
#[tokio::main]
async fn main() {
    let db = Builder::new_local(":memory:").build().await.unwrap();
    let conn = db.connect().unwrap();

    conn.execute("SELECT load_extension('./target/debug/liblimbo_series');", ())
        .await
        .unwrap();

    loop {
        conn.execute("SELECT * FROM generate_series(1,10,2);", ())
            .await
            .unwrap();
    }
}
```

After switching to the system allocator, the leak becomes detectable
with Valgrind:

```
32,000 bytes in 1,000 blocks are definitely lost in loss record 24 of 24
   at 0x538580F: malloc (vg_replace_malloc.c:446)
   by 0x62E15FA: alloc::alloc::alloc (alloc.rs:99)
   by 0x62E172C: alloc::alloc::Global::alloc_impl (alloc.rs:192)
   by 0x62E1530: allocate (alloc.rs:254)
   by 0x62E1530: alloc::alloc::exchange_malloc (alloc.rs:349)
   by 0x62E0271: new<limbo_series::GenerateSeriesCursor> (boxed.rs:257)
   by 0x62E0271: open_GenerateSeriesVTab (lib.rs:19)
   by 0x425D8FA: limbo_core::VirtualTable::open (lib.rs:732)
   by 0x4285DDA: limbo_core::vdbe::execute::op_vopen (execute.rs:890)
   by 0x42351E8: limbo_core::vdbe::Program::step (mod.rs:396)
   by 0x425C638: limbo_core::Statement::step (lib.rs:610)
   by 0x40DB238: limbo::Statement::execute::{{closure}} (lib.rs:181)
   by 0x40D9EAF: limbo::Connection::execute::{{closure}} (lib.rs:109)
   by 0x40D54A1: example::main::{{closure}} (example.rs:26)
```

Interestingly, when using mimalloc, neither Valgrind nor mimalloc’s
internal statistics report the leak.
2025-05-05 21:26:23 +02:00
pedrocarlo
0c22382f3c shared lock on file and throw ReadOnly error in transaction 2025-05-02 16:30:48 -03:00
meteorgan
d2dce740f7 fix some issues about page_size 2025-04-28 16:13:07 +08:00