# storage
`src/features/storage/api.rs`
The LSM storage engine. All reads and writes to graph data, edge deltas, vector deltas, and bitmap indexes go through this module.
**SSTable format version:** `IRSTBL02`. Each entry includes a trailing 4-byte CRC32C checksum covering `key | version | kind | value_len | value`. Files written by earlier versions (`IRSTBL01`) will be rejected with `CorruptData("invalid magic")` on open.
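The per-entry checksum can be illustrated with a minimal sketch. The field widths, the little-endian byte order, and the names `encode_entry` and `verify_entry` are assumptions made for illustration; only the covered field order and the trailing 4-byte CRC32C come from the description above.

```rust
/// Bitwise CRC32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All-ones mask when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical layout: key | version | kind | value_len | value | crc.
/// Field widths and little-endian order are assumptions, not the real format.
fn encode_entry(key: u64, version: u64, kind: u8, value: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(8 + 8 + 1 + 4 + value.len() + 4);
    buf.extend_from_slice(&key.to_le_bytes());
    buf.extend_from_slice(&version.to_le_bytes());
    buf.push(kind);
    buf.extend_from_slice(&(value.len() as u32).to_le_bytes());
    buf.extend_from_slice(value);
    // The checksum covers everything written so far.
    let crc = crc32c(&buf);
    buf.extend_from_slice(&crc.to_le_bytes());
    buf
}

/// Returns true when the trailing 4 bytes match the CRC32C of the rest.
fn verify_entry(entry: &[u8]) -> bool {
    if entry.len() < 4 {
        return false;
    }
    let (body, tail) = entry.split_at(entry.len() - 4);
    let stored = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    crc32c(body) == stored
}
```

Flipping any single bit of an encoded entry makes `verify_entry` return `false`, which is the property a reader relies on to detect torn or corrupted entries.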
---
## Opening a Store
```rust
pub fn open_store(config: StorageConfig) -> Result<StorageHandle>
```
Opens (or creates) a store at the paths specified in `config`. Acquires a data-directory lock, so only one `StorageHandle` may be open per data directory at a time.
```rust
pub fn open_store_with_reactor(
    config: StorageConfig,
    reactor: Arc<dyn Reactor + Send + Sync>,
) -> Result<StorageHandle>
```
Same as `open_store` but injects a custom `Reactor`. Use with `DeterministicReactor` in tests.
```rust
pub fn open_store_for_request(
    config: StorageConfig,
    request: &ThreadCoreRequest,
    lanes: &ThreadCoreLaneConfig,
) -> Result<StorageHandle>
```
Opens a per-core partitioned store. When `lanes.partition_wal` or `lanes.partition_sstable` is true, the WAL and SSTable directories are scoped to `core-{shard:04}` subdirectories. Used for thread-per-core deployments.
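As a rough illustration of the directory scoping, the sketch below derives a per-core path using the `core-{shard:04}` pattern. The helper name `scoped_dir` is hypothetical, and leaving the base directory unchanged when partitioning is disabled is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: appends a zero-padded `core-NNNN` component when
/// partitioning is enabled, otherwise returns the base directory unchanged.
fn scoped_dir(base: &Path, shard: u16, partition: bool) -> PathBuf {
    if partition {
        base.join(format!("core-{:04}", shard))
    } else {
        base.to_path_buf()
    }
}
```

Under these assumptions, shard 7 with `partition_wal = true` and a WAL base of `data/wal` would map to `data/wal/core-0007`.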
---
## Configuration
```rust
pub struct StorageConfig {
    pub buffer_pool_pages: usize,
    pub wal_dir: PathBuf,
    pub wal_segment_max_bytes: u64,
    pub manifest_path: PathBuf,
    pub sstable_dir: PathBuf,
}
```
```rust
pub struct ThreadCoreLaneConfig {
    pub partition_wal: bool,     // default: true
    pub partition_sstable: bool, // default: true
}
```
---
## StorageHandle
The central mutable state of the engine. Not `Clone`; pass it by reference: `&mut` for writes and for reads that update caches (such as `get_logical_node`), `&` for the read-only functions.
```rust
pub struct StorageHandle {
    pub buffer_pool: BufferPool,
    pub wal: Wal,
    pub manifest: Manifest,
    pub bitmap_store: BitmapStore,
    pub memtable: MemTable,
    pub l0_runs: Vec<PathBuf>,
    pub sstable_cache: HashMap<PathBuf, Sstable>,
    pub sstable_dir: PathBuf,
    pub metrics: AmpMetrics,
    pub reactor: Arc<dyn Reactor + Send + Sync>,
    pub compaction_policy: CompactionPolicy,
    pub hnsw_scheduler: HnswMaintenanceScheduler,
    pub hnsw_graph: HnswGraph, // in-memory ANN index
    pub hnsw_total_vectors: u64,
    pub hnsw_updated_vectors: u64,
    pub last_hnsw_rebuild_reason: Option<String>,
    pub pending_deltas_per_node: HashMap<u64, u32>,
    // ... (internal cache fields)
}
```
---
## Read Operations
```rust
pub fn get_logical_node(handle: &mut StorageHandle, node_id: u64) -> Result<LogicalNode>
```
Returns the merged view of a node: its latest `FullNode` entry (if any) plus all accumulated `EdgeDelta` entries. Checks the logical node cache first, then MemTable, then L0 SSTables in recency order.
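The merge can be pictured with a small standalone model. All types here (`Kind`, `Entry`, `MergedNode`) are simplified stand-ins, and two behaviors are assumptions: "latest" is taken to mean the highest `version`, and every delta for the node is kept regardless of its version.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Kind {
    FullNode,
    EdgeDelta,
}

#[derive(Clone, Debug, PartialEq)]
struct Entry {
    node_id: u64,
    version: u64,
    kind: Kind,
    value: Vec<u8>,
}

#[derive(Debug, PartialEq)]
struct MergedNode {
    node_id: u64,
    full: Option<Entry>,
    deltas: Vec<Entry>,
}

/// `tiers` models the lookup order: MemTable first, then L0 runs from
/// newest to oldest. The merge keeps the highest-version FullNode and
/// collects every EdgeDelta for the node, sorted by version.
fn merge_node(node_id: u64, tiers: &[Vec<Entry>]) -> MergedNode {
    let mut full: Option<Entry> = None;
    let mut deltas = Vec::new();
    for entry in tiers.iter().flatten().filter(|e| e.node_id == node_id) {
        match entry.kind {
            Kind::FullNode => {
                if full.as_ref().map_or(true, |f| entry.version > f.version) {
                    full = Some(entry.clone());
                }
            }
            Kind::EdgeDelta => deltas.push(entry.clone()),
        }
    }
    deltas.sort_by_key(|d| d.version);
    MergedNode { node_id, full, deltas }
}
```

A node that has only deltas yields `full: None`, which matches the "if any" wording above.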
```rust
pub fn get_logical_node_for_request(
    handle: &mut StorageHandle,
    node_id: u64,
    request: &ThreadCoreRequest,
) -> Result<LogicalNode>
```
Same as `get_logical_node` but asserts that `request` owns `node_id` (debug builds only).
```rust
pub fn get_node_row_summary(handle: &StorageHandle, node_id: u64) -> Result<NodeRowSummary>
```
Returns lightweight presence metadata for a node without decoding full payloads.
```rust
pub fn get_node_row_summary_for_request(
    handle: &StorageHandle,
    node_id: u64,
    request: &ThreadCoreRequest,
) -> Result<NodeRowSummary>
```
Request-scoped variant of `get_node_row_summary`.
---
## Write Operations
```rust
pub fn put_full_node(
    handle: &mut StorageHandle,
    node_id: u64,
    version: u64,
    adjacency: &[u64],
) -> Result<()>
```
Writes a `FullNode` entry (node ID + adjacency list). Appends to WAL, then inserts into MemTable. `version` must be > 0.
```rust
pub fn put_edge_delta(handle: &mut StorageHandle, delta: &[u8]) -> Result<()>
```
Writes a single pre-encoded `EdgeDelta` entry.
```rust
pub fn put_edge_deltas_batch(handle: &mut StorageHandle, deltas: &[Vec<u8>]) -> Result<()>
```
Writes a batch of pre-encoded `EdgeDelta` entries in a single WAL append. Prefer this over repeated `put_edge_delta` calls.
```rust
pub fn put_vector_delta(handle: &mut StorageHandle, delta: &[u8]) -> Result<()>
```
Writes a `VectorDelta` entry.
Current contract:
- Canonical write payloads should be produced with:
```rust
pub fn encode_vector_payload_f32(
    space_id: u32,
    metric: VectorMetric,
    values: &[f32],
    normalized: bool,
) -> Vec<u8>
```
- Structured `quantized_i8` payloads can be produced with:
```rust
pub fn encode_vector_payload_quantized_i8(
    space_id: u32,
    metric: VectorMetric,
    values: &[f32],
    normalized: bool,
) -> Result<Vec<u8>, String>
```
- Structured payloads carry `space_id`, `dimension`, `encoding`, `metric`, `normalized`, and `norm`, followed by packed vector values.
- `quantized_i8` bodies store a per-vector `f32` scale followed by signed `i8` values; runtime decode/dequantize happens in Rust before scoring.
- Legacy raw packed-`f32` payloads remain read-compatible only while manifest compatibility is enabled, and should not be used for new writes.
- On write, structured payload descriptors are registered/validated against manifest vector-space metadata.
- Cosine payloads from ANN-eligible registered spaces are inserted into that space's in-memory HNSW graph.
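
The `quantized_i8` body described in the contract above (a per-vector `f32` scale followed by signed `i8` values) can be sketched as follows. This is not the module's encoder: the scale choice (`max_abs / 127`), the little-endian scale bytes, and the function names are assumptions, and the structured descriptor fields are omitted.

```rust
/// Quantizes `values` into a body of [scale: f32 LE][i8 values...].
/// The scale maps the largest magnitude to 127 (an assumed scheme).
fn quantize_i8(values: &[f32]) -> Vec<u8> {
    let max_abs = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // Avoid dividing by zero for an all-zero vector.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let mut body = Vec::with_capacity(4 + values.len());
    body.extend_from_slice(&scale.to_le_bytes());
    for &v in values {
        let q = (v / scale).round().clamp(-127.0, 127.0) as i8;
        body.push(q as u8); // store the two's-complement byte
    }
    body
}

/// Reverses `quantize_i8`; returns None if the body is too short to
/// contain the 4-byte scale.
fn dequantize_i8(body: &[u8]) -> Option<Vec<f32>> {
    if body.len() < 4 {
        return None;
    }
    let scale = f32::from_le_bytes([body[0], body[1], body[2], body[3]]);
    Some(body[4..].iter().map(|&b| f32::from(b as i8) * scale).collect())
}
```

With this scheme each component is recovered to within roughly half a quantization step (`scale / 2`), which is why dequantization happens before scoring rather than scoring on raw `i8` values.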
```rust
pub fn encode_delta(node_id: u64, version: u64, payload: &[u8]) -> Vec<u8>
```
Encodes a delta payload into the wire format expected by `put_edge_delta` / `put_vector_delta`.
```rust
pub fn encode_adjacency(adjacency: &[u64]) -> Vec<u8>
```
Encodes an adjacency list into the wire format expected by `put_full_node`.
---
## Bitmap Index Operations
```rust
pub fn create_bitmap_index(handle: &mut StorageHandle, index_name: &str) -> Result<()>
```
Creates a named roaring bitmap index. No-op if the index already exists.
```rust
pub fn bitmap_add_posting(
    handle: &mut StorageHandle,
    index_name: &str,
    value_key: &str,
    node_id: u64,
) -> Result<()>
```
Adds `node_id` to the posting list for `value_key` within `index_name`.
```rust
pub fn bitmap_postings(
    handle: &StorageHandle,
    index_name: &str,
    value_key: &str,
) -> Result<Vec<u64>>
```
Returns all node IDs in the posting list for `(index_name, value_key)`.
```rust
pub fn bitmap_postings_in_range_limit(
    handle: &StorageHandle,
    index_name: &str,
    value_key: &str,
    min_node_id: u64,
    max_node_id_exclusive: u64,
    limit: usize,
) -> Result<Vec<u64>>
```
Returns up to `limit` node IDs in `[min_node_id, max_node_id_exclusive)`.
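The half-open range and limit semantics can be modeled with a `BTreeSet` standing in for a roaring bitmap posting list. The function below is a hypothetical stand-in; returning IDs in ascending order and returning an empty result for an inverted range are assumptions.

```rust
use std::collections::BTreeSet;

/// Returns up to `limit` IDs from `postings` within [min, max_exclusive),
/// in ascending order.
fn postings_in_range_limit(
    postings: &BTreeSet<u64>,
    min: u64,
    max_exclusive: u64,
    limit: usize,
) -> Vec<u64> {
    // `BTreeSet::range` panics when start > end, so guard empty and
    // inverted ranges up front.
    if min >= max_exclusive {
        return Vec::new();
    }
    postings
        .range(min..max_exclusive)
        .take(limit)
        .copied()
        .collect()
}
```

Paging through a large posting list then amounts to calling the function again with `min` set to one past the last ID returned.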
```rust
pub fn bitmap_postings_in_range_limit_for_request(
    handle: &StorageHandle,
    index_name: &str,
    value_key: &str,
    min_node_id: u64,
    max_node_id_exclusive: u64,
    limit: usize,
    request: &ThreadCoreRequest,
) -> Result<Vec<u64>>
```
Request-scoped variant of `bitmap_postings_in_range_limit`.
```rust
pub fn list_bitmap_indexes(handle: &StorageHandle) -> Result<Vec<String>>
```
Returns the names of all registered bitmap indexes.
---
## HNSW Vector Index
The in-memory HNSW index is maintained per ANN-eligible vector space. Compatible cosine `VectorDelta` SSTable entries are rebuilt into the matching space graph on `open_store`, and live writes update that same per-space graph.
```rust
pub fn hnsw_search(handle: &StorageHandle, query: &[f32], k: usize) -> Vec<(u64, f64)>
```
Returns the top-k approximate nearest neighbors to `query` using the compatibility/default HNSW view. Result pairs are `(node_id, cosine_similarity)` sorted by similarity descending. Returns an empty vec if the compatibility view is empty.
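To make the result shape concrete, the sketch below computes exact top-k by brute force. It is a deliberate stand-in for the HNSW traversal, not the index itself: it only shows `(node_id, cosine_similarity)` pairs sorted by similarity descending. Treating a zero-norm vector as similarity `0.0` is an assumption.

```rust
/// Cosine similarity accumulated in f64; zero-norm inputs score 0.0.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (x, y) in a.iter().zip(b) {
        let (x, y) = (f64::from(*x), f64::from(*y));
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a.sqrt() * norm_b.sqrt())
    }
}

/// Exact top-k by exhaustive scan (brute force, not HNSW).
fn brute_force_top_k(vectors: &[(u64, Vec<f32>)], query: &[f32], k: usize) -> Vec<(u64, f64)> {
    let mut scored: Vec<(u64, f64)> = vectors
        .iter()
        .map(|(id, v)| (*id, cosine_similarity(v, query)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A scan like this is also a convenient oracle for checking ANN recall in tests, since an approximate search may miss some of the exact neighbors.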
```rust
pub fn hnsw_search_in_space(
    handle: &StorageHandle,
    space_id: u32,
    query: &[f32],
    k: usize,
) -> Vec<(u64, f64)>
```
Returns the top-k approximate nearest neighbors within one explicit vector space. If that space has no ANN graph, returns an empty vec.
```rust
pub fn hnsw_insert(handle: &mut StorageHandle, node_id: u64, vector: Vec<f32>)
```
Inserts a vector into the compatibility/default HNSW view and updates maintenance counters.
```rust
pub fn hnsw_insert_for_space(
    handle: &mut StorageHandle,
    space_id: u32,
    node_id: u64,
    vector: Vec<f32>,
)
```
Inserts a vector into one explicit space graph and updates HNSW maintenance counters. Normally called indirectly via `put_vector_delta` or WAL recovery.
```rust
pub fn ann_space_for_query(
    handle: &StorageHandle,
    metric: VectorMetric,
    requested_dim: Option<usize>,
) -> Option<u32>
```
Returns the unique ANN-eligible space for a query when the metric and dimension constraints identify exactly one compatible space. Otherwise returns `None`, and the runtime falls back to scan/rerank.
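The "exactly one compatible space" rule can be sketched against a plain list of space descriptors. `Metric`, `SpaceMeta`, and `unique_ann_space` are hypothetical stand-ins (the real `VectorMetric` variants and manifest metadata are not shown in this document), and treating `requested_dim: None` as "any dimension" is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Metric {
    Cosine,
    L2,
}

struct SpaceMeta {
    space_id: u32,
    metric: Metric,
    dimension: usize,
    ann_eligible: bool,
}

/// Returns Some(space_id) only when exactly one ANN-eligible space matches
/// the metric and, if given, the requested dimension.
fn unique_ann_space(
    spaces: &[SpaceMeta],
    metric: Metric,
    requested_dim: Option<usize>,
) -> Option<u32> {
    let mut matches = spaces.iter().filter(|s| {
        s.ann_eligible
            && s.metric == metric
            && requested_dim.map_or(true, |d| d == s.dimension)
    });
    let first = matches.next()?;
    // A second match makes the choice ambiguous.
    if matches.next().is_some() {
        None
    } else {
        Some(first.space_id)
    }
}
```

In this model, a caller that gets `None` would take the scan/rerank path instead of searching an ANN graph.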
Graph parameters (fixed at open time): `m=16`, `m0=32`, `ef_construction=200`, cosine distance.
---
## Durability
```rust
pub fn recover_from_wal(handle: &mut StorageHandle) -> Result<()>
```
Replays WAL records into the MemTable. Call on startup after `open_store` if crash recovery is needed.
```rust
pub fn sync(handle: &mut StorageHandle) -> Result<()>
```
Flushes the MemTable to an L0 SSTable and syncs the WAL to disk.
```rust
pub fn flush(handle: &mut StorageHandle) -> Result<()>
```
Flushes the MemTable to an L0 SSTable without an explicit WAL sync.
---
## Compaction
```rust
pub fn compact(handle: &mut StorageHandle) -> Result<()>
```
Runs one round of compaction according to the handle's `CompactionPolicy`. Merges L0 runs and promotes data to higher levels. A no-op if no compaction is needed.
```rust
pub fn compaction_job_status(handle: &StorageHandle) -> Result<JobStatus>
```
Returns the status of the most recently submitted compaction job.
```rust
pub fn compaction_jobs_snapshot(handle: &StorageHandle) -> Vec<BackgroundJobRecord>
```
Returns all tracked compaction job records (queued, running, and terminal).
```rust
pub fn latest_compaction_job_id(handle: &StorageHandle) -> Option<u64>
```
Returns the job ID of the last submitted compaction job.
---
## Metrics
```rust
pub fn report_metrics(handle: &StorageHandle) -> AmpReport
```
Returns write amplification, read amplification, and space amplification ratios computed from `handle.metrics`.
```rust
pub struct AmpMetrics {
    pub logical_bytes_written: u64,
    pub wal_bytes_written: u64,
    pub sstable_bytes_written: u64,
    pub sstable_bytes_read: u64,
    pub logical_bytes_read: u64,
}

pub struct AmpReport {
    pub write_amp: Option<f64>, // (wal_written + sstable_written) / logical_written
    pub read_amp: Option<f64>,  // sstable_read / logical_read
    pub space_amp: Option<f64>, // sstable_written / logical_written
}
```
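The ratios in the comments above can be worked through with a standalone sketch. The `Counters` and `Report` types mirror the documented fields but are local stand-ins, and returning `None` when a denominator is zero is an assumption about why the report fields are `Option<f64>`.

```rust
struct Counters {
    logical_bytes_written: u64,
    wal_bytes_written: u64,
    sstable_bytes_written: u64,
    sstable_bytes_read: u64,
    logical_bytes_read: u64,
}

#[derive(Debug, PartialEq)]
struct Report {
    write_amp: Option<f64>,
    read_amp: Option<f64>,
    space_amp: Option<f64>,
}

/// Ratio that is undefined (None) when the denominator is zero.
fn ratio(numerator: u64, denominator: u64) -> Option<f64> {
    if denominator == 0 {
        None
    } else {
        Some(numerator as f64 / denominator as f64)
    }
}

fn report(c: &Counters) -> Report {
    Report {
        write_amp: ratio(c.wal_bytes_written + c.sstable_bytes_written, c.logical_bytes_written),
        read_amp: ratio(c.sstable_bytes_read, c.logical_bytes_read),
        space_amp: ratio(c.sstable_bytes_written, c.logical_bytes_written),
    }
}
```

For example, 100 logical bytes written that produce 100 WAL bytes and 150 SSTable bytes give a write amplification of 2.5 and a space amplification of 1.5.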
---
## Shard Routing (re-exports from topology)
```rust
pub type CoreId = u16;
pub type ShardId = u16;
pub type ThreadCoreRequest = topology::ThreadCoreRequest;
pub fn shard_for_node(node_id: u64, shard_count: u16) -> ShardId
pub fn request_owns_node(request: &ThreadCoreRequest, node_id: u64) -> bool
```
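The routing function maps a node ID to a shard, and the ownership check compares that shard with the request's shard. The sketch below uses plain modulo placement, which is purely an assumption; the actual `topology` module may hash or route differently. `RequestSketch` and its fields are hypothetical, since the layout of `ThreadCoreRequest` is not shown here.

```rust
/// Hypothetical modulo placement; the real `shard_for_node` may differ.
fn shard_for_node_modulo(node_id: u64, shard_count: u16) -> u16 {
    assert!(shard_count > 0, "shard_count must be non-zero");
    // The remainder is always below shard_count, so it fits in u16.
    (node_id % u64::from(shard_count)) as u16
}

/// Hypothetical stand-in for the fields an ownership check would need.
struct RequestSketch {
    shard: u16,
    shard_count: u16,
}

/// True when the node routes to the shard the request is bound to.
fn owns_node(request: &RequestSketch, node_id: u64) -> bool {
    shard_for_node_modulo(node_id, request.shard_count) == request.shard
}
```

Whatever the real placement function is, the property that matters is determinism: the same `(node_id, shard_count)` pair must always yield the same shard, or the `*_for_request` ownership assertions would fire spuriously.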
---
## Types
```rust
pub struct LogicalNode {
    pub node_id: u64,
    pub full: Option<sstable::Entry>, // Latest FullNode entry, if any
    pub deltas: Vec<sstable::Entry>,  // Accumulated EdgeDelta entries
}

impl LogicalNode {
    pub fn adjacency(&self) -> Vec<u64> // Decoded adjacency list from full entry
}

pub struct NodeRowSummary {
    pub has_full: bool,
    pub delta_count: usize,
    pub adjacency_degree: usize,
}
```
---
## Errors
```rust
pub enum StorageError {
    Io(std::io::Error),
    InvalidInput(String),
    CorruptData(String),
    Sstable(String),
}

pub type Result<T> = std::result::Result<T, StorageError>;
```
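As an illustration of how `CorruptData("invalid magic")` might arise and be handled, the sketch below redefines the error enum locally and checks a header against the `IRSTBL02` magic. Placing the magic in the first 8 bytes of the file, and the helpers `check_magic` and `describe`, are assumptions for illustration only.

```rust
#[derive(Debug)]
enum StorageError {
    Io(std::io::Error),
    InvalidInput(String),
    CorruptData(String),
    Sstable(String),
}

const MAGIC: &[u8; 8] = b"IRSTBL02";

/// Hypothetical header check: assumes the magic occupies the first 8 bytes.
fn check_magic(header: &[u8]) -> std::result::Result<(), StorageError> {
    if header.starts_with(MAGIC) {
        Ok(())
    } else {
        Err(StorageError::CorruptData("invalid magic".to_string()))
    }
}

/// Maps each variant to a short operator-facing message.
fn describe(err: &StorageError) -> String {
    match err {
        StorageError::Io(e) => format!("I/O failure: {e}"),
        StorageError::InvalidInput(msg) => format!("rejected input: {msg}"),
        StorageError::CorruptData(msg) => format!("corrupt data: {msg}"),
        StorageError::Sstable(msg) => format!("sstable error: {msg}"),
    }
}
```

Matching on the variant rather than on the message string keeps callers stable if the wording of an error changes.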