LLKV: Arrow-Native SQL over Key-Value Storage
Work in Progress
llkv is the primary entrypoint crate for the LLKV database toolkit. It provides both a command-line interface and a library that re-exports high-level APIs from the underlying workspace crates.
Arrow column chunks are stored under pager-managed physical keys, so pagers that already offer zero-copy reads—like the SIMD-backed simd-r-drive—can return contiguous buffers that stay SIMD-friendly.
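The zero-copy idea can be sketched with the standard library alone. Everything named here (`SketchPager`, `PhysicalKey`, `put`, `get`) is hypothetical and chosen for illustration; it is not LLKV's actual `Pager` trait, and it assumes chunks are held as immutable, reference-counted byte buffers.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical physical key type; LLKV's real key type may differ.
type PhysicalKey = u64;

/// Hypothetical in-memory pager: maps physical keys to immutable byte buffers.
#[derive(Default)]
struct SketchPager {
    blobs: HashMap<PhysicalKey, Arc<[u8]>>,
}

impl SketchPager {
    /// Store a serialized column chunk under `key`, replacing any previous value.
    fn put(&mut self, key: PhysicalKey, bytes: Vec<u8>) {
        self.blobs.insert(key, Arc::from(bytes));
    }

    /// Return a shared handle to the stored buffer. Cloning the `Arc` bumps a
    /// reference count; the bytes themselves are not copied.
    fn get(&self, key: PhysicalKey) -> Option<Arc<[u8]>> {
        self.blobs.get(&key).cloned()
    }
}
```

Two `get` calls for the same key hand back handles to the same contiguous allocation, which is the property a reader needs in order to run vectorized kernels directly over the stored bytes.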
Design Notes
- Execution stays synchronous by default. Query workloads rely on hot-path parallelism from Rayon and Crossbeam instead of a global async runtime so scheduler costs stay predictable, while still embedding cleanly inside Tokio—the SLT harness uses Tokio to stage concurrent connection simulations.
- Storage is built atop the simd-r-drive project rather than Parquet, prioritizing fast point updates over integration with existing data lake tooling.
- The crate shares Apache DataFusion's SQL parser and Arrow memory model but intentionally keeps Tokio out of the engine's execution path. The goal is to explore how far a DataFusion-like stack can go without a task scheduler while still shipping MVCC transactions and sqllogictest coverage.
- LLKV remains alpha software. DataFusion offers a broader production footprint today, while this project trials new ideas for specialized workloads.
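The first design note, synchronous calls with parallelism confined to the hot path, can be illustrated with a small sketch. It uses `std::thread::scope` as a dependency-free stand-in for Rayon, and `parallel_sum` is a hypothetical function, not part of LLKV.

```rust
use std::thread;

/// Sum every chunk on its own scoped thread, then combine the partial sums.
/// The call blocks until all workers finish, so callers see a plain
/// synchronous function with no async runtime involved.
fn parallel_sum(chunks: &[Vec<i64>]) -> i64 {
    thread::scope(|scope| {
        // Spawn one worker per chunk; `collect` forces all spawns to happen
        // before any join, so the chunks are processed concurrently.
        let handles: Vec<_> = chunks
            .iter()
            .map(|chunk| scope.spawn(move || chunk.iter().sum::<i64>()))
            .collect();

        // Join the workers and fold their partial results together.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker thread panicked"))
            .sum::<i64>()
    })
}
```

Because the function is blocking and self-contained, it can be called from ordinary code or from inside an async task without the engine itself depending on a task scheduler. A real implementation would use a thread pool rather than one thread per chunk.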
Command-Line Interface
The llkv binary supports three modes:
Interactive REPL
Start an interactive session with an in-memory database:
Available commands:
- .help: Show usage information
- .open FILE: Open a persistent database file (planned)
- .exit or .quit: Exit the REPL
- SQL statements are executed directly
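As a rough illustration of how a REPL might tell dot-commands apart from SQL, here is a minimal sketch. `ReplInput` and `classify` are hypothetical names, and the handling of blank lines and unknown commands is an assumption; the actual llkv REPL may behave differently, and `.open` is listed as planned.

```rust
/// Hypothetical classification of one line of REPL input.
#[derive(Debug, PartialEq)]
enum ReplInput<'a> {
    Help,
    Open(&'a str),
    Exit,
    Sql(&'a str),
    Unknown(&'a str),
    Empty,
}

/// Classify a raw input line as a dot-command or a SQL statement.
fn classify(line: &str) -> ReplInput<'_> {
    let trimmed = line.trim();
    if trimmed.is_empty() {
        return ReplInput::Empty;
    }
    // Anything that does not start with '.' is treated as SQL.
    if !trimmed.starts_with('.') {
        return ReplInput::Sql(trimmed);
    }
    // Split the command name from its (optional) argument.
    let mut parts = trimmed.splitn(2, char::is_whitespace);
    let command = parts.next().unwrap_or("");
    let argument = parts.next().map(str::trim).unwrap_or("");
    match (command, argument) {
        (".help", "") => ReplInput::Help,
        (".exit", "") | (".quit", "") => ReplInput::Exit,
        (".open", file) if !file.is_empty() => ReplInput::Open(file),
        _ => ReplInput::Unknown(trimmed),
    }
}
```

Returning borrowed slices keeps the classifier allocation-free; the caller decides what to do with each variant, for example forwarding `Sql` text to the engine.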
Stream Processing
Pipe SQL scripts to stdin for batch execution:
|
SQL Logic Test Runner
Execute SLT test files or directories:
# Run a single test
# Run a test directory
# Use multi-threaded runtime
Library Usage
The llkv crate re-exports the core SQL engine and storage abstractions for programmatic use.
Quick Start
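use std::sync::Arc;
use llkv::{SqlEngine, storage::MemPager};

// Create an in-memory SQL engine
// (MemPager::default() is assumed here as the pager constructor)
let engine = SqlEngine::new(Arc::new(MemPager::default()));

// Execute SQL statements (example statements; substitute your own schema)
let results = engine.execute("CREATE TABLE users (id INT, name TEXT)").unwrap();
let results = engine.execute("INSERT INTO users VALUES (1, 'Alice')").unwrap();

// Query data (example query)
let batches = engine.sql("SELECT * FROM users").unwrap();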
Re-exported APIs
The list below covers some of the APIs this crate re-exports; it is not exhaustive. Enable the simd-r-drive-support feature flag when you need durable storage via the SIMD R-Drive pager.
- SqlEngine: Main SQL execution engine from llkv-sql
- storage::MemPager: In-memory pager for transient databases
- storage::Pager: Trait for custom storage backends
- storage::SimdRDrivePager: Durable pager backed by simd-r-drive (requires the simd-r-drive-support feature)
- Error / Result: Error handling types
Using Persistent Storage
For persistent databases, enable the simd-r-drive-support feature and use a file-backed pager. Replace the version tag with the latest release published on crates.io when you upgrade:
[dependencies]
llkv = { version = "0.8.5-alpha", features = ["simd-r-drive-support"] }
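use std::sync::Arc;
use llkv::{SqlEngine, storage::SimdRDrivePager};

// Open a file-backed pager (the path below is a placeholder)
let pager = SimdRDrivePager::open("path/to/database.llkv").unwrap();
let engine = SqlEngine::new(Arc::new(pager));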
Architecture
LLKV is organized as a layered workspace with each crate focused on a specific responsibility:
- SQL Interface (llkv-sql): Parses SQL and manages the SqlEngine API
- Planning (llkv-plan, llkv-expr): Logical plans and expression ASTs
- Runtime (llkv-runtime, llkv-transaction): Transaction orchestration and MVCC
- Execution (llkv-executor, llkv-aggregate, llkv-join): Query evaluation and streaming results
- Storage (llkv-table, llkv-column-map, llkv-storage): Columnar storage and pager abstractions
See the workspace root README and DeepWiki documentation for detailed architecture information.
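For readers unfamiliar with MVCC, which the runtime layer is responsible for, the following is a generic sketch of snapshot visibility. It is a textbook simplification and not a description of how llkv-transaction works: `TxnId`, `RowVersion`, `is_visible`, and `visible_values` are hypothetical, and the sketch assumes every transaction has committed and that larger ids commit later.

```rust
/// Hypothetical transaction id; larger ids are assumed to commit later.
type TxnId = u64;

/// One version of a row, tagged with the transactions that created and
/// (optionally) deleted it.
struct RowVersion {
    value: &'static str,
    created_by: TxnId,
    deleted_by: Option<TxnId>,
}

/// A version is visible to a snapshot if it was created at or before the
/// snapshot and was not deleted at or before it.
fn is_visible(row: &RowVersion, snapshot: TxnId) -> bool {
    row.created_by <= snapshot && row.deleted_by.map_or(true, |d| d > snapshot)
}

/// Collect the values a reader at `snapshot` would see, in storage order.
fn visible_values(rows: &[RowVersion], snapshot: TxnId) -> Vec<&'static str> {
    rows.iter()
        .filter(|row| is_visible(row, snapshot))
        .map(|row| row.value)
        .collect()
}
```

Readers at different snapshots see different subsets of the same stored versions, which is what lets reads proceed without blocking writers.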
License
Licensed under the Apache-2.0 License.