pub struct Database { /* private fields */ }
The SochDB Database Kernel
This is the shared core used by both embedded (SochConnection) and
server (sochdb-server) modes. It owns all storage, catalog, and
indexing components.
§Thread Safety
The Database is fully thread-safe via internal synchronization:
- Multiple readers can operate concurrently (MVCC snapshots)
- Writers coordinate through WAL and group commit
- All state is behind Arc/RwLock for shared access
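The sharing pattern described above can be sketched with the standard library alone. This is a minimal illustration, not SochDB's implementation: `SharedState` and `concurrent_reads` are hypothetical names, and a plain `RwLock` stands in for the MVCC machinery the real kernel uses.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for shared database state (not the real `Database`).
#[derive(Default)]
struct SharedState {
    data: RwLock<HashMap<String, String>>,
}

/// Spawn `readers` threads that all read the same key through a cloned `Arc`.
/// Returns how many of them observed the expected value.
fn concurrent_reads(readers: usize) -> usize {
    let state = Arc::new(SharedState::default());

    // Single writer: take the write lock briefly, then release it.
    state
        .data
        .write()
        .unwrap()
        .insert("user:1:name".to_string(), "Alice".to_string());

    let handles: Vec<_> = (0..readers)
        .map(|_| {
            let state = Arc::clone(&state);
            thread::spawn(move || {
                // Many readers may hold the read lock at the same time.
                let guard = state.data.read().unwrap();
                let value = guard.get("user:1:name").cloned();
                value
            })
        })
        .collect();

    handles
        .into_iter()
        .filter_map(|h| h.join().unwrap())
        .filter(|v| v.as_str() == "Alice")
        .count()
}
```

Calling `concurrent_reads(4)` spawns four reader threads over one shared value, which is the same shape as cloning the `Arc<Database>` returned by `open` into each worker.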
§Concurrency Modes
§Standard Mode (Single Process)
- Uses exclusive file lock (flock(LOCK_EX))
- Best for: Scripts, notebooks, CLI tools
- Open with: Database::open(path)
§Concurrent Mode (Multi-Process/Web Apps)
- Uses lock-free MVCC for reads, single-writer coordination for writes
- Best for: Web servers, Flask/FastAPI apps, hot reloading
- Open with: Database::open_concurrent(path)
§Example
// Standard mode (single process)
let db = Database::open("./my_data")?;
// Concurrent mode (multi-reader, single-writer)
let db = Database::open_concurrent("./my_data")?;
// Begin a transaction
let txn = db.begin_transaction()?;
// Write data
db.put(txn, b"user:1:name", b"Alice")?;
// Commit
db.commit(txn)?;
Implementations§
impl Database
pub const MIN_SCAN_PREFIX_LEN: usize = 2
Minimum prefix length for scan operations. Prevents expensive full-table scans by requiring a meaningful prefix.
pub fn open<P: AsRef<Path>>(path: P) -> Result<Arc<Self>>
Open or create a database at the given path.
This is the primary entry point, similar to sqlite3_open().
If the database exists, it will be opened and WAL recovery performed.
If it doesn’t exist, a new database will be created.
§Arguments
- path - Directory path for the database files
§Returns
An Arc<Database> that can be shared across threads and connections.
pub fn open_with_config<P: AsRef<Path>>(
    path: P,
    config: DatabaseConfig,
) -> Result<Arc<Self>>
Open with custom configuration
pub fn open_concurrent<P: AsRef<Path>>(path: P) -> Result<Arc<Self>>
Open database in concurrent mode (multi-reader, single-writer)
This mode allows multiple processes to access the database simultaneously:
- Readers: Lock-free, concurrent access via MVCC snapshots
- Writers: Single-writer coordination through atomic locks
§Use Cases
- Web applications (Flask, FastAPI, Django)
- Hot reloading development servers
- Multi-process worker pools
- Any scenario with concurrent read access
§Performance
- Read latency: ~100ns (lock-free atomic operations)
- Write latency: ~60μs amortized (with group commit)
- Concurrent readers: Up to 1024 (configurable)
§Example
// Multiple processes can open the same database
let db = Database::open_concurrent("./my_data")?;
// Reads are lock-free
let value = db.get(b"key")?;
// Writes coordinate automatically
let txn = db.begin_transaction()?;
db.put(txn, b"key", b"value")?;
db.commit(txn)?;
pub fn open_concurrent_with_config<P: AsRef<Path>>(
    path: P,
    config: DatabaseConfig,
) -> Result<Arc<Self>>
Open database in concurrent mode with custom configuration
pub fn is_concurrent(&self) -> bool
Check if database is in concurrent mode
pub fn begin_transaction(&self) -> Result<TxnHandle>
Begin a new transaction
pub fn begin_read_only(&self) -> Result<TxnHandle>
Begin a read-only transaction (optimized: no SSI tracking)
Read-only transactions skip SSI read tracking, reducing overhead from ~82ns to ~32ns per read (2.6x faster).
Use this for:
- SELECT queries that don’t modify data
- Analytics and reporting queries
- Snapshot reads for backup
pub fn begin_read_only_fast(&self) -> TxnHandle
Begin a lightweight read-only transaction (no WAL overhead).
Eliminates WAL mutex acquisitions entirely for read operations. The txn_id is allocated atomically and MVCC snapshot state is created, but NO WAL records are written (no TxnBegin, no TxnAbort).
~5-10x faster per-operation than begin_read_only() because it avoids:
- 2 WAL mutex lock/unlock cycles per transaction
- 2 WAL BufWriter serializations per transaction
Callers MUST use abort_read_only_fast() to clean up — NOT commit()
or abort().
pub fn abort_read_only_fast(&self, txn: TxnHandle)
Abort a fast read-only transaction — O(1), no WAL, no memtable scan.
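Because a fast read-only transaction must be closed with abort_read_only_fast() and nothing else, one way to make that pairing hard to forget is an RAII guard. The sketch below is only an illustration: `MockDb`, `FastReadGuard`, and the `TxnHandle` stand-in are hypothetical, and it assumes `TxnHandle` is `Copy` (the examples on this page pass it by value repeatedly, which suggests so).

```rust
use std::cell::Cell;

/// Hypothetical stand-in for `TxnHandle`; assumed to be `Copy`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TxnHandle(u64);

/// Hypothetical mock exposing only the two methods documented above.
#[derive(Default)]
struct MockDb {
    next_id: Cell<u64>,
    open_fast_txns: Cell<usize>,
}

impl MockDb {
    fn begin_read_only_fast(&self) -> TxnHandle {
        let id = self.next_id.get() + 1;
        self.next_id.set(id);
        self.open_fast_txns.set(self.open_fast_txns.get() + 1);
        TxnHandle(id)
    }

    fn abort_read_only_fast(&self, _txn: TxnHandle) {
        self.open_fast_txns.set(self.open_fast_txns.get() - 1);
    }
}

/// Guard that pairs every fast begin with exactly one fast abort.
struct FastReadGuard<'a> {
    db: &'a MockDb,
    txn: TxnHandle,
}

impl<'a> FastReadGuard<'a> {
    fn new(db: &'a MockDb) -> Self {
        let txn = db.begin_read_only_fast();
        FastReadGuard { db, txn }
    }

    /// Handle to pass to read calls while the guard is alive.
    fn handle(&self) -> TxnHandle {
        self.txn
    }
}

impl Drop for FastReadGuard<'_> {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns via `?`.
        self.db.abort_read_only_fast(self.txn);
    }
}
```

With a guard like this, a caller never reaches for commit() or abort() on a fast handle, because cleanup happens when the guard goes out of scope.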
pub fn get_raw_read(&self, key: &[u8]) -> Option<Vec<u8>>
Read a key WITHOUT any MVCC transaction tracking.
Uses the current global timestamp to see all committed writes. Bypasses: begin/abort, active_txns DashMap, record_read, stats. Only safe for single-threaded access with no concurrent writers.
pub fn scan_raw(&self, prefix: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)>
Scan by prefix WITHOUT any MVCC transaction tracking.
Uses the current global timestamp. Only safe for single-threaded access.
pub fn begin_write_only(&self) -> Result<TxnHandle>
Begin a write-only transaction (optimized: no read tracking)
Write-only transactions skip read tracking, improving insert throughput for bulk loading scenarios.
Use this for:
- Bulk data imports
- Append-only logging tables
- ETL pipelines
pub fn commit(&self, txn: TxnHandle) -> Result<u64>
Commit a transaction
In concurrent mode, acquires the shared writer lock to ensure WAL writes are serialized across processes, and forces a flush+sync so that subsequent processes see the committed data.
pub fn set_table_index_policy(&self, table: &str, policy: IndexPolicy)
Configure index policy for a table
This allows fine-grained control over write/scan trade-offs per table:
| Policy | Insert Cost | Scan Cost | Use Case |
|---|---|---|---|
| WriteOptimized | O(1) | O(N) | High-write, rare scan |
| Balanced | O(1) amortized | O(output + log K) | Mixed OLTP |
| ScanOptimized | O(log N) | O(log N + K) | Analytics, range query |
| AppendOnly | O(1) | O(N) | Time-series logs |
§Example
// Fast inserts for logs table (no ordered index overhead)
db.set_table_index_policy("logs", IndexPolicy::WriteOptimized);
// Efficient range scans for analytics table
db.set_table_index_policy("analytics", IndexPolicy::ScanOptimized);
// Balanced for OLTP tables
db.set_table_index_policy("users", IndexPolicy::Balanced);
pub fn get_table_index_policy(&self, table: &str) -> IndexPolicy
Get the index policy for a table
pub fn index_registry(&self) -> &Arc<TableIndexRegistry>
Get the index registry for advanced configuration
pub fn put(&self, txn: TxnHandle, key: &[u8], value: &[u8]) -> Result<()>
Put a key-value pair
In concurrent mode, acquires the shared writer lock to ensure WAL writes are serialized across processes.
pub fn put_batch(&self, txn: TxnHandle, writes: &[(&[u8], &[u8])]) -> Result<()>
Batch put multiple key-value pairs with reduced overhead
This amortizes per-operation costs over the entire batch:
- Single DashMap lookup
- Batch MVCC tracking
- Batch memtable writes
For 100+ entries, this is 2-3x faster than individual puts.
§Example
let writes: Vec<(&[u8], &[u8])> = vec![
(b"key1", b"value1"),
(b"key2", b"value2"),
(b"key3", b"value3"),
];
db.put_batch(txn, &writes)?;
pub fn scan(
    &self,
    txn: TxnHandle,
    prefix: &[u8],
) -> Result<Vec<(Vec<u8>, Vec<u8>)>>
Scan keys with a prefix (enforces minimum prefix length for safety).
§Prefix Safety
To prevent accidental full-table scans, this method requires a minimum
prefix length of 2 bytes. Use scan_unchecked for internal operations
that need empty/short prefixes.
§Errors
Returns SochDBError::InvalidInput if prefix is too short.
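The relationship between the checked and unchecked scans can be sketched over an ordered in-memory map. This is a toy model under stated assumptions: the `BTreeMap` store, the `ScanError` type (a stand-in for `SochDBError::InvalidInput`), and `sample_store` are hypothetical; only the constant value of 2 comes from this page.

```rust
use std::collections::BTreeMap;

/// Mirrors the documented constant; everything else here is hypothetical.
const MIN_SCAN_PREFIX_LEN: usize = 2;

/// Hypothetical stand-in for `SochDBError::InvalidInput`.
#[derive(Debug, PartialEq)]
enum ScanError {
    PrefixTooShort { len: usize, min: usize },
}

type Pairs = Vec<(Vec<u8>, Vec<u8>)>;

/// Prefix scan with no length validation (the `scan_unchecked` role).
/// An empty prefix walks the whole map.
fn scan_unchecked(store: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Pairs {
    store
        .range(prefix.to_vec()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

/// Checked prefix scan (the `scan` role): reject short prefixes first.
fn scan(store: &BTreeMap<Vec<u8>, Vec<u8>>, prefix: &[u8]) -> Result<Pairs, ScanError> {
    if prefix.len() < MIN_SCAN_PREFIX_LEN {
        return Err(ScanError::PrefixTooShort {
            len: prefix.len(),
            min: MIN_SCAN_PREFIX_LEN,
        });
    }
    Ok(scan_unchecked(store, prefix))
}

/// Small fixture used to exercise both functions.
fn sample_store() -> BTreeMap<Vec<u8>, Vec<u8>> {
    let mut store = BTreeMap::new();
    store.insert(b"user:1:name".to_vec(), b"Alice".to_vec());
    store.insert(b"user:2:name".to_vec(), b"Bob".to_vec());
    store.insert(b"order:1:total".to_vec(), b"42".to_vec());
    store
}
```

In this model a one-byte prefix such as `b"u"` is rejected before any data is touched, while `scan_unchecked` with an empty prefix returns every entry.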
pub fn scan_unchecked(
    &self,
    txn: TxnHandle,
    prefix: &[u8],
) -> Result<Vec<(Vec<u8>, Vec<u8>)>>
Scan keys with a prefix without length validation.
§Warning
This method allows empty/short prefixes which can cause expensive
full-table scans. Use scan() unless you specifically need unrestricted
prefix access for internal operations.
pub fn scan_range(
    &self,
    txn: TxnHandle,
    start: &[u8],
    end: &[u8],
) -> Result<Vec<(Vec<u8>, Vec<u8>)>>
Scan keys in range
pub fn scan_range_iter<'a>(
    &'a self,
    txn: TxnHandle,
    start: &'a [u8],
    end: &'a [u8],
) -> impl Iterator<Item = Result<(Vec<u8>, Vec<u8>)>> + 'a
Streaming scan for very large result sets
Returns an iterator that yields (key, value) pairs without materializing the entire result set. Use this for large scans where memory efficiency is important.
§Performance
- Memory: O(1) per iteration vs O(N) for scan_range
- Latency: First result available immediately vs waiting for all results
- Throughput: Slightly lower due to per-item overhead
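The memory and latency trade-off listed above can be illustrated with a toy ordered map. This sketch makes several assumptions that are not confirmed by this page: the store is a `BTreeMap`, the range is half-open (`start` inclusive, `end` exclusive), and items are yielded as bare pairs rather than `Result`s. `numbered_store` is a hypothetical fixture.

```rust
use std::collections::BTreeMap;

type Store = BTreeMap<Vec<u8>, Vec<u8>>;

/// Streaming scan: clones one pair per `next()` call, so a consumer that
/// stops early never pays for the rest of the range.
/// Note: `BTreeMap::range` panics if `start > end`.
fn scan_range_iter<'a>(
    store: &'a Store,
    start: &'a [u8],
    end: &'a [u8],
) -> impl Iterator<Item = (Vec<u8>, Vec<u8>)> + 'a {
    store
        .range(start.to_vec()..end.to_vec())
        .map(|(k, v)| (k.clone(), v.clone()))
}

/// Materializing scan: builds the full result vector before returning,
/// which costs O(N) memory for N matching entries.
fn scan_range(store: &Store, start: &[u8], end: &[u8]) -> Vec<(Vec<u8>, Vec<u8>)> {
    scan_range_iter(store, start, end).collect()
}

/// Fixture: keys "k0000", "k0001", ... with the index as the value.
fn numbered_store(n: usize) -> Store {
    (0..n)
        .map(|i| (format!("k{:04}", i).into_bytes(), i.to_string().into_bytes()))
        .collect()
}
```

Taking only the first two items from `scan_range_iter` touches two entries, whereas `scan_range` over the same bounds would clone every entry in the range first.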
§Usage
for result in db.scan_range_iter(txn, b"start", b"end") {
let (key, value) = result?;
// Process immediately - no need to wait for all results
}
pub fn storage_stats(&self) -> StorageStats
Get storage statistics
pub fn put_path(&self, txn: TxnHandle, path: &str, value: &[u8]) -> Result<()>
Put a value at a path
Path format: “collection/doc_id/field” or “table.row_id.column”.
Resolution is O(|path|), not O(log N) like a B-tree.
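One structure with O(|path|) lookups is a trie keyed by path segment, where cost depends on the path's length rather than on how many entries are stored. The sketch below shows that idea only; whether SochDB resolves paths this way is not stated here. `PathNode` and `segments` are hypothetical, and treating "/" and "." as interchangeable separators is an assumption.

```rust
use std::collections::HashMap;

/// Hypothetical trie node: one level per path segment.
#[derive(Default)]
struct PathNode {
    children: HashMap<String, PathNode>,
    value: Option<Vec<u8>>,
}

/// Split on either documented separator ("/" or "."), skipping empty parts.
fn segments(path: &str) -> impl Iterator<Item = &str> {
    path.split(|c: char| c == '/' || c == '.')
        .filter(|s| !s.is_empty())
}

impl PathNode {
    /// Walk (and create) one node per segment, then store the value.
    fn put(&mut self, path: &str, value: &[u8]) {
        let mut node = self;
        for seg in segments(path) {
            node = node.children.entry(seg.to_string()).or_default();
        }
        node.value = Some(value.to_vec());
    }

    /// Walk one node per segment; work is proportional to the path length,
    /// independent of the total number of stored paths.
    fn get(&self, path: &str) -> Option<&[u8]> {
        let mut node = self;
        for seg in segments(path) {
            node = node.children.get(seg)?;
        }
        node.value.as_deref()
    }
}
```

In this toy, "users/123/name" and "users.123.name" resolve to the same node, and a lookup for a missing document returns `None` as soon as a segment is not found.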
pub fn get_path(&self, txn: TxnHandle, path: &str) -> Result<Option<Vec<u8>>>
Get a value at a path
pub fn scan_path(
    &self,
    txn: TxnHandle,
    prefix: &str,
) -> Result<Vec<(String, Vec<u8>)>>
Scan a path prefix
Returns all key-value pairs whose key starts with prefix. For example, “users/123/” returns all fields of user 123.
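Since matching is by string prefix, the trailing separator matters: without it, a prefix can also match sibling identifiers that merely begin with the same characters. A small sketch over a sorted map shows the difference. The `BTreeMap` store and `sample_paths` fixture are hypothetical and say nothing about how SochDB stores paths.

```rust
use std::collections::BTreeMap;

/// Toy path-prefix scan over a sorted map (hypothetical stand-in store).
fn scan_path(store: &BTreeMap<String, Vec<u8>>, prefix: &str) -> Vec<(String, Vec<u8>)> {
    store
        .range(prefix.to_string()..)
        .take_while(|(k, _)| k.starts_with(prefix))
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

/// Fixture with two users whose ids share a leading "123".
fn sample_paths() -> BTreeMap<String, Vec<u8>> {
    let mut store = BTreeMap::new();
    for (path, value) in [
        ("users/123/name", "Alice"),
        ("users/123/email", "a@x.io"),
        ("users/1234/name", "Bob"),
    ] {
        store.insert(path.to_string(), value.as_bytes().to_vec());
    }
    store
}
```

Here "users/123/" returns only the two fields of user 123, while "users/123" also picks up the entry for user 1234.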
pub fn query(&self, txn: TxnHandle, path_prefix: &str) -> QueryBuilder<'_>
Execute a path query and return results
This is the main query interface for LLM context retrieval. Supports:
- Path prefix matching
- Column projection (for I/O reduction)
- Limit/offset
pub fn register_table(&self, schema: TableSchema) -> Result<()>
Register a table schema
pub fn get_table_schema(&self, name: &str) -> Option<TableSchema>
Get table schema
pub fn update_table_schema(
    &self,
    old_name: &str,
    schema: TableSchema,
) -> Result<()>
Update the schema for an existing table (used by ALTER TABLE).
Replaces the schema in both the tables DashMap and the packed schema
cache atomically (per-key). The caller is responsible for validating
the new schema.
pub fn list_tables(&self) -> Vec<String>
List all tables
pub fn enable_cdc(&mut self, config: CdcConfig) -> Arc<CdcLog>
Enable CDC on this database, returning the CDC log handle.
Subsequent mutations emitted via the SQL execution layer will be recorded in the CDC log for subscriber consumption.
pub fn insert_row(
    &self,
    txn: TxnHandle,
    table: &str,
    row_id: u64,
    values: &HashMap<String, SochValue>,
) -> Result<()>
Insert a row into a table
Uses packed row format: stores entire row as single key-value pair. This reduces write amplification from 4× to 1× for a 4-column table.
§Performance
- Before: 4 columns × (WAL entry + MVCC version) = 4 writes
- After: 1 packed row = 1 write
- Improvement: ~4× fewer WAL entries, ~48% less I/O overhead
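The write-count difference between the two layouts can be made concrete with a toy encoder. Everything in this sketch is an assumption made for illustration: the key shapes ("table/row_id/column" and "table/row_id"), the length-prefixed encoding, and the function names are hypothetical and are not SochDB's actual packed row format.

```rust
/// Per-column layout: one key-value pair (and so one log entry) per column.
fn per_column_writes(
    table: &str,
    row_id: u64,
    cols: &[(&str, &[u8])],
) -> Vec<(Vec<u8>, Vec<u8>)> {
    cols.iter()
        .map(|(name, value)| {
            let key = format!("{}/{}/{}", table, row_id, name).into_bytes();
            (key, value.to_vec())
        })
        .collect()
}

/// Packed layout: one key-value pair per row. Each column value is written
/// as a little-endian u32 length followed by its bytes, in schema order.
fn packed_write(table: &str, row_id: u64, cols: &[(&str, &[u8])]) -> (Vec<u8>, Vec<u8>) {
    let mut packed = Vec::new();
    for (_, value) in cols {
        packed.extend_from_slice(&(value.len() as u32).to_le_bytes());
        packed.extend_from_slice(value);
    }
    (format!("{}/{}", table, row_id).into_bytes(), packed)
}

/// Decode a packed row back into column values in schema order.
fn unpack(packed: &[u8]) -> Vec<Vec<u8>> {
    let mut out = Vec::new();
    let mut pos = 0;
    while pos + 4 <= packed.len() {
        let mut len_bytes = [0u8; 4];
        len_bytes.copy_from_slice(&packed[pos..pos + 4]);
        let len = u32::from_le_bytes(len_bytes) as usize;
        pos += 4;
        out.push(packed[pos..pos + len].to_vec());
        pos += len;
    }
    out
}

/// Fixture: a 4-column row.
fn sample_row() -> Vec<(&'static str, &'static [u8])> {
    vec![
        ("id", &b"1"[..]),
        ("name", &b"Alice"[..]),
        ("email", &b"a@x.io"[..]),
        ("age", &b"30"[..]),
    ]
}
```

For the 4-column fixture, the per-column layout produces four pairs to log and version, while the packed layout produces one.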
pub fn read_row(
    &self,
    txn: TxnHandle,
    table: &str,
    row_id: u64,
    columns: Option<&[&str]>,
) -> Result<Option<HashMap<String, SochValue>>>
Read a row from a table
Reads packed row and extracts requested columns in O(k) time. Column projection happens in memory, not storage - all columns are fetched.
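In-memory projection can be pictured as a filter over an already decoded row: the whole row is available, and only the requested columns are copied into the result. The sketch below assumes that `None` means "all columns" and that unknown column names are skipped silently; neither behavior is confirmed by this page. `Value` is a hypothetical stand-in for `SochValue`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `SochValue`.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// Project `columns` out of a fully decoded row.
/// Work is proportional to the number of requested columns (k).
fn project(row: &HashMap<String, Value>, columns: Option<&[&str]>) -> HashMap<String, Value> {
    match columns {
        // Assumption: no projection list means every column is returned.
        None => row.clone(),
        Some(names) => names
            .iter()
            .filter_map(|name| row.get(*name).map(|v| (name.to_string(), v.clone())))
            .collect(),
    }
}
```

Because the full row has already been fetched at this point, narrowing the column list saves copying and allocation but not storage I/O.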
pub fn insert_rows_batch(
    &self,
    txn: TxnHandle,
    table: &str,
    rows: &[(u64, HashMap<String, SochValue>)],
) -> Result<usize>
Insert multiple rows efficiently in a batch
This method accumulates all rows and writes them with fewer WAL syncs. Ideal for bulk loading scenarios.
§Performance
- Uses group commit to batch fsync operations
- Expected throughput: 500K-1M rows/sec depending on row size
pub fn put_raw(&self, txn: TxnHandle, key: &[u8], value: &[u8]) -> Result<()>
Ultra-fast raw put - bypasses all validation
Use when you’ve already validated the data and just need speed. This is ~10× faster than insert_row() for bulk inserts.
pub fn insert_row_slice(
    &self,
    txn: TxnHandle,
    table: &str,
    row_id: u64,
    values: &[Option<&SochValue>],
) -> Result<()>
Zero-allocation insert - fastest path for bulk inserts
Takes values as a slice in schema column order, avoiding HashMap overhead.
§Arguments
- txn - Transaction handle
- table - Table name
- row_id - Row identifier
- values - Values in schema column order (None = NULL)
§Performance
- Eliminates ~6 allocations per row vs insert_row()
- Expected: 1.2M-1.5M inserts/sec
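Callers that already hold name-keyed rows need to arrange them in schema column order before using the slice-based path. A minimal sketch of that conversion follows; `Value` is a hypothetical stand-in for `SochValue`, `to_schema_order` is an illustrative helper rather than part of the API, and the schema is passed as a plain list of column names.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `SochValue`.
#[derive(Clone, Debug, PartialEq)]
enum Value {
    Int(i64),
    Text(String),
}

/// Arrange a name-keyed row into schema column order.
/// Columns absent from the row become `None`, which the slice API treats as NULL.
fn to_schema_order<'a>(
    schema: &[&str],
    row: &'a HashMap<String, Value>,
) -> Vec<Option<&'a Value>> {
    schema.iter().map(|col| row.get(*col)).collect()
}
```

The resulting `Vec<Option<&Value>>` borrows from the row and can be passed as a slice. Note that this helper allocates one vector per row, so a bulk loader that wants the full benefit would build values in schema order from the start.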
§Example
let values: &[Option<&SochValue>] = &[
Some(&SochValue::Int(1)),
Some(&SochValue::Text("Alice".into())),
None, // NULL
];
db.insert_row_slice(txn, "users", 1, values)?;
pub fn checkpoint(&self) -> Result<u64>
Create a checkpoint
pub fn truncate_wal(&self) -> Result<()>
Truncate the WAL file after a checkpoint.
See DurableStorage::truncate_wal for safety notes.
Trait Implementations§
Auto Trait Implementations§
impl !Freeze for Database
impl !RefUnwindSafe for Database
impl !UnwindSafe for Database
impl Send for Database
impl Sync for Database
impl Unpin for Database
impl UnsafeUnpin for Database
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.