iridium-db 0.2.0

A high-performance vector-graph hybrid storage and indexing engine
# storage

`src/features/storage/api.rs`

The LSM storage engine. All reads and writes to graph data, edge deltas, vector deltas, and bitmap indexes go through this module.

**SSTable format version:** `IRSTBL02`. Each entry includes a trailing 4-byte CRC32C checksum covering `key | version | kind | value_len | value`. Files written by earlier versions (`IRSTBL01`) will be rejected with `CorruptData("invalid magic")` on open.
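
To make the per-entry checksum concrete, here is a minimal, self-contained sketch of framing an entry and verifying its trailing CRC32C. The field widths, byte order, and the helper names `frame_entry` and `verify_entry` are assumptions for illustration only; the engine's actual on-disk encoding may differ.

```rust
/// Bitwise CRC32C (Castagnoli) using the reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Hypothetical framing: key | version | kind | value_len | value | crc32c.
/// Little-endian integers and a u32 length are assumptions of this sketch.
fn frame_entry(key: &[u8], version: u64, kind: u8, value: &[u8]) -> Vec<u8> {
    let mut buf = Vec::new();
    buf.extend_from_slice(key);
    buf.extend_from_slice(&version.to_le_bytes());
    buf.push(kind);
    buf.extend_from_slice(&(value.len() as u32).to_le_bytes());
    buf.extend_from_slice(value);
    // The checksum covers everything written so far.
    let checksum = crc32c(&buf);
    buf.extend_from_slice(&checksum.to_le_bytes());
    buf
}

/// Recomputes the checksum over the body and compares it with the trailer.
fn verify_entry(framed: &[u8]) -> bool {
    if framed.len() < 4 {
        return false;
    }
    let (body, tail) = framed.split_at(framed.len() - 4);
    let stored = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    crc32c(body) == stored
}
```

Flipping any single bit in a framed entry makes `verify_entry` return `false`, which is the kind of corruption a per-entry checksum is meant to catch.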

---

## Opening a Store

```rust
pub fn open_store(config: StorageConfig) -> Result<StorageHandle>
```
Opens (or creates) a store at the paths specified in `config`. Acquires a data-directory lock, so only one `StorageHandle` may be open per data directory at a time.

```rust
pub fn open_store_with_reactor(
    config: StorageConfig,
    reactor: Arc<dyn Reactor + Send + Sync>,
) -> Result<StorageHandle>
```
Same as `open_store` but injects a custom `Reactor`. Use with `DeterministicReactor` in tests.

```rust
pub fn open_store_for_request(
    config: StorageConfig,
    request: &ThreadCoreRequest,
    lanes: &ThreadCoreLaneConfig,
) -> Result<StorageHandle>
```
Opens a per-core partitioned store. When `lanes.partition_wal` or `lanes.partition_sstable` is true, the WAL and SSTable directories are scoped to `core-{shard:04}` subdirectories. Used for thread-per-core deployments.
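
The directory scoping can be pictured with a small standalone sketch. The helper `scoped_dir` is hypothetical and not part of the crate; it only illustrates how a `core-{shard:04}` suffix would be derived from a base path when a partition flag is set.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: returns the directory a shard would use.
/// With partitioning on, the shard gets a zero-padded `core-NNNN` subdirectory;
/// otherwise the base directory is shared.
fn scoped_dir(base: &Path, shard: u16, partition: bool) -> PathBuf {
    if partition {
        base.join(format!("core-{:04}", shard))
    } else {
        base.to_path_buf()
    }
}
```

For example, `scoped_dir(Path::new("data/wal"), 7, true)` yields `data/wal/core-0007`, while passing `false` leaves the path at `data/wal`.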

---

## Configuration

```rust
pub struct StorageConfig {
    pub buffer_pool_pages: usize,
    pub wal_dir: PathBuf,
    pub wal_segment_max_bytes: u64,
    pub manifest_path: PathBuf,
    pub sstable_dir: PathBuf,
}
```

```rust
pub struct ThreadCoreLaneConfig {
    pub partition_wal: bool,      // default: true
    pub partition_sstable: bool,  // default: true
}
```

---

## StorageHandle

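The central mutable state of the engine. Not `Clone`. Functions that mutate state (writes, durability, compaction, and the `get_logical_node` reads) take `&mut StorageHandle`; the remaining read-only functions take `&StorageHandle`.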

```rust
pub struct StorageHandle {
    pub buffer_pool: BufferPool,
    pub wal: Wal,
    pub manifest: Manifest,
    pub bitmap_store: BitmapStore,
    pub memtable: MemTable,
    pub l0_runs: Vec<PathBuf>,
    pub sstable_cache: HashMap<PathBuf, Sstable>,
    pub sstable_dir: PathBuf,
    pub metrics: AmpMetrics,
    pub reactor: Arc<dyn Reactor + Send + Sync>,
    pub compaction_policy: CompactionPolicy,
    pub hnsw_scheduler: HnswMaintenanceScheduler,
    pub hnsw_graph: HnswGraph,          // in-memory ANN index
    pub hnsw_total_vectors: u64,
    pub hnsw_updated_vectors: u64,
    pub last_hnsw_rebuild_reason: Option<String>,
    pub pending_deltas_per_node: HashMap<u64, u32>,
    // ... (internal cache fields)
}
```

---

## Read Operations

```rust
pub fn get_logical_node(handle: &mut StorageHandle, node_id: u64) -> Result<LogicalNode>
```
Returns the merged view of a node: its latest `FullNode` entry (if any) plus all accumulated `EdgeDelta` entries. Checks the logical node cache first, then MemTable, then L0 SSTables in recency order.

```rust
pub fn get_logical_node_for_request(
    handle: &mut StorageHandle,
    node_id: u64,
    request: &ThreadCoreRequest,
) -> Result<LogicalNode>
```
Same as `get_logical_node` but asserts that `request` owns `node_id` (debug builds only).

```rust
pub fn get_node_row_summary(handle: &StorageHandle, node_id: u64) -> Result<NodeRowSummary>
```
Returns lightweight presence metadata for a node without decoding full payloads.

```rust
pub fn get_node_row_summary_for_request(
    handle: &StorageHandle,
    node_id: u64,
    request: &ThreadCoreRequest,
) -> Result<NodeRowSummary>
```
Request-scoped variant of `get_node_row_summary`.

---

## Write Operations

```rust
pub fn put_full_node(
    handle: &mut StorageHandle,
    node_id: u64,
    version: u64,
    adjacency: &[u64],
) -> Result<()>
```
Writes a `FullNode` entry (node ID + adjacency list). Appends to WAL, then inserts into MemTable. `version` must be > 0.
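
The ordering described above (validate, append to the log, then update the in-memory table) can be sketched with toy data structures. `SketchStore` and `put_full_node_sketch` are illustrative stand-ins, not the crate's types, and the record layout is an assumption.

```rust
use std::collections::BTreeMap;

/// Toy stand-in for the engine state; not the real `StorageHandle`.
#[derive(Default)]
struct SketchStore {
    /// Append-only log records, oldest first.
    wal: Vec<Vec<u8>>,
    /// (node_id, version) -> adjacency list.
    memtable: BTreeMap<(u64, u64), Vec<u64>>,
}

fn put_full_node_sketch(
    store: &mut SketchStore,
    node_id: u64,
    version: u64,
    adjacency: &[u64],
) -> Result<(), String> {
    // Reject invalid input before touching any state.
    if version == 0 {
        return Err("version must be > 0".to_string());
    }
    // Step 1: append a record to the log (assumed little-endian layout).
    let mut record = Vec::with_capacity(16 + adjacency.len() * 8);
    record.extend_from_slice(&node_id.to_le_bytes());
    record.extend_from_slice(&version.to_le_bytes());
    for neighbor in adjacency {
        record.extend_from_slice(&neighbor.to_le_bytes());
    }
    store.wal.push(record);
    // Step 2: only after logging, make the write visible in memory.
    store.memtable.insert((node_id, version), adjacency.to_vec());
    Ok(())
}
```

Logging before the in-memory insert is the usual write-ahead discipline: a write that is visible in the MemTable always has a log record that recovery can replay.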

```rust
pub fn put_edge_delta(handle: &mut StorageHandle, delta: &[u8]) -> Result<()>
```
Writes a single pre-encoded `EdgeDelta` entry.

```rust
pub fn put_edge_deltas_batch(handle: &mut StorageHandle, deltas: &[Vec<u8>]) -> Result<()>
```
Writes a batch of pre-encoded `EdgeDelta` entries in a single WAL append. Prefer this over repeated `put_edge_delta` calls.

```rust
pub fn put_vector_delta(handle: &mut StorageHandle, delta: &[u8]) -> Result<()>
```
Writes a `VectorDelta` entry.

Current contract:
- Canonical write payloads should be produced with:

```rust
pub fn encode_vector_payload_f32(
    space_id: u32,
    metric: VectorMetric,
    values: &[f32],
    normalized: bool,
) -> Vec<u8>
```

- Structured `quantized_i8` payloads can be produced with:

```rust
pub fn encode_vector_payload_quantized_i8(
    space_id: u32,
    metric: VectorMetric,
    values: &[f32],
    normalized: bool,
) -> Result<Vec<u8>, String>
```

- Structured payloads carry `space_id`, `dimension`, `encoding`, `metric`, `normalized`, and `norm`, followed by packed vector values.
- `quantized_i8` bodies store a per-vector `f32` scale followed by signed `i8` values; runtime decode/dequantize happens in Rust before scoring.
- Legacy raw packed-`f32` payloads remain read-compatible only while manifest compatibility is enabled, and should not be used for new writes.
- On write, structured payload descriptors are registered/validated against manifest vector-space metadata.
- Cosine payloads from ANN-eligible registered spaces are inserted into that space's in-memory HNSW graph.
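
As an illustration of the `quantized_i8` idea (one `f32` scale per vector followed by signed `i8` values), the following standalone sketch quantizes and dequantizes a vector. The scale rule `max|v| / 127` and the rejection of non-finite inputs are assumptions of this sketch; the crate's encoder may choose differently.

```rust
/// Symmetric per-vector quantization sketch.
/// Returns the scale and the quantized values, or an error for non-finite input.
fn quantize_i8(values: &[f32]) -> Result<(f32, Vec<i8>), String> {
    if values.iter().any(|v| !v.is_finite()) {
        return Err("non-finite value in vector".to_string());
    }
    let max_abs = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // Assumed rule: map the largest magnitude to 127; use 1.0 for all-zero vectors.
    let scale = if max_abs == 0.0 { 1.0 } else { max_abs / 127.0 };
    let quantized = values
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    Ok((scale, quantized))
}

/// Reverses the mapping; each component is recovered to within about half a scale step.
fn dequantize_i8(scale: f32, quantized: &[i8]) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * scale).collect()
}
```

The round trip is lossy: each component is recovered only to within roughly half a quantization step, which is why dequantized values are typically used for scoring rather than as a source of truth.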

```rust
pub fn encode_delta(node_id: u64, version: u64, payload: &[u8]) -> Vec<u8>
```
Encodes a delta payload into the wire format expected by `put_edge_delta` / `put_vector_delta`.

```rust
pub fn encode_adjacency(adjacency: &[u64]) -> Vec<u8>
```
Encodes an adjacency list into the wire format expected by `put_full_node`.

---

## Bitmap Index Operations

```rust
pub fn create_bitmap_index(handle: &mut StorageHandle, index_name: &str) -> Result<()>
```
Creates a named roaring bitmap index. No-op if the index already exists.

```rust
pub fn bitmap_add_posting(
    handle: &mut StorageHandle,
    index_name: &str,
    value_key: &str,
    node_id: u64,
) -> Result<()>
```
Adds `node_id` to the posting list for `value_key` within `index_name`.

```rust
pub fn bitmap_postings(
    handle: &StorageHandle,
    index_name: &str,
    value_key: &str,
) -> Result<Vec<u64>>
```
Returns all node IDs in the posting list for `(index_name, value_key)`.

```rust
pub fn bitmap_postings_in_range_limit(
    handle: &StorageHandle,
    index_name: &str,
    value_key: &str,
    min_node_id: u64,
    max_node_id_exclusive: u64,
    limit: usize,
) -> Result<Vec<u64>>
```
Returns up to `limit` node IDs in `[min_node_id, max_node_id_exclusive)`.
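
The half-open range and limit semantics can be modelled with a sorted set. This sketch uses `BTreeSet` as a stand-in for a posting list; the function name and the ascending result order are assumptions, since the ordering of the real result is not specified here.

```rust
use std::collections::BTreeSet;

/// Returns up to `limit` IDs from `postings` that fall in `[min, max_exclusive)`.
/// Ascending order is a property of this sketch, not a documented guarantee.
fn postings_in_range_limit(
    postings: &BTreeSet<u64>,
    min: u64,
    max_exclusive: u64,
    limit: usize,
) -> Vec<u64> {
    // An empty or inverted range matches nothing.
    if min >= max_exclusive {
        return Vec::new();
    }
    postings
        .range(min..max_exclusive)
        .take(limit)
        .copied()
        .collect()
}
```

With postings `{1, 5, 9, 12, 20}`, the range `[5, 20)` and `limit = 2` give `[5, 9]`; the upper bound `20` itself is excluded even when the limit is large.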

```rust
pub fn bitmap_postings_in_range_limit_for_request(
    handle: &StorageHandle,
    index_name: &str,
    value_key: &str,
    min_node_id: u64,
    max_node_id_exclusive: u64,
    limit: usize,
    request: &ThreadCoreRequest,
) -> Result<Vec<u64>>
```
Request-scoped variant of `bitmap_postings_in_range_limit`.

```rust
pub fn list_bitmap_indexes(handle: &StorageHandle) -> Result<Vec<String>>
```
Returns the names of all registered bitmap indexes.

---

## HNSW Vector Index

The in-memory HNSW index is maintained per ANN-eligible vector space. Compatible cosine `VectorDelta` SSTable entries are rebuilt into the matching space graph on `open_store`, and live writes update that same per-space graph.

```rust
pub fn hnsw_search(handle: &StorageHandle, query: &[f32], k: usize) -> Vec<(u64, f64)>
```
Returns the top-k approximate nearest neighbors to `query` using the compatibility/default HNSW view. Result pairs are `(node_id, cosine_similarity)` sorted by similarity descending. Returns an empty vec if the compatibility view is empty.
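
To clarify the result shape, here is an exact (brute-force) baseline that produces the same kind of output: `(node_id, cosine_similarity)` pairs sorted by similarity descending. It is not HNSW and not the crate's implementation; the function names and the choice to skip vectors with a mismatched dimension are assumptions of this sketch. A baseline like this is also handy for checking ANN recall in tests.

```rust
/// Cosine similarity computed in f64; returns 0.0 if either vector has zero norm.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    let (mut dot, mut norm_a, mut norm_b) = (0.0f64, 0.0f64, 0.0f64);
    for (x, y) in a.iter().zip(b) {
        let (x, y) = (*x as f64, *y as f64);
        dot += x * y;
        norm_a += x * x;
        norm_b += y * y;
    }
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a.sqrt() * norm_b.sqrt())
}

/// Exact top-k by cosine similarity, highest first.
fn exact_top_k(vectors: &[(u64, Vec<f32>)], query: &[f32], k: usize) -> Vec<(u64, f64)> {
    let mut scored: Vec<(u64, f64)> = vectors
        .iter()
        // Skip vectors whose dimension does not match the query (sketch assumption).
        .filter(|(_, v)| v.len() == query.len())
        .map(|(id, v)| (*id, cosine_similarity(v, query)))
        .collect();
    // Sort descending by similarity.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```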

```rust
pub fn hnsw_search_in_space(
    handle: &StorageHandle,
    space_id: u32,
    query: &[f32],
    k: usize,
) -> Vec<(u64, f64)>
```
Returns the top-k approximate nearest neighbors within one explicit vector space. If that space has no ANN graph, returns an empty vec.

```rust
pub fn hnsw_insert(handle: &mut StorageHandle, node_id: u64, vector: Vec<f32>)
```
Inserts a vector into the compatibility/default HNSW view and updates maintenance counters.

```rust
pub fn hnsw_insert_for_space(
    handle: &mut StorageHandle,
    space_id: u32,
    node_id: u64,
    vector: Vec<f32>,
)
```
Inserts a vector into one explicit space graph and updates HNSW maintenance counters. Normally called indirectly via `put_vector_delta` or WAL recovery.

```rust
pub fn ann_space_for_query(
    handle: &StorageHandle,
    metric: VectorMetric,
    requested_dim: Option<usize>,
) -> Option<u32>
```
Returns the unique ANN-eligible space for a query when the metric and dimension constraints identify exactly one compatible space. Otherwise returns `None`, and the runtime falls back to scan/rerank.

Graph parameters (fixed at open time): `m=16`, `m0=32`, `ef_construction=200`, cosine distance.

---

## Durability

```rust
pub fn recover_from_wal(handle: &mut StorageHandle) -> Result<()>
```
Replays WAL records into the MemTable. Call on startup after `open_store` if crash recovery is needed.

```rust
pub fn sync(handle: &mut StorageHandle) -> Result<()>
```
Flushes the MemTable to an L0 SSTable and syncs the WAL to disk.

```rust
pub fn flush(handle: &mut StorageHandle) -> Result<()>
```
Flushes the MemTable to an L0 SSTable without an explicit WAL sync.

---

## Compaction

```rust
pub fn compact(handle: &mut StorageHandle) -> Result<()>
```
Runs one round of compaction according to the handle's `CompactionPolicy`. Merges L0 runs and promotes data to higher levels. A no-op if no compaction is needed.

```rust
pub fn compaction_job_status(handle: &StorageHandle) -> Result<JobStatus>
```
Returns the status of the most-recently-submitted compaction job.

```rust
pub fn compaction_jobs_snapshot(handle: &StorageHandle) -> Vec<BackgroundJobRecord>
```
Returns all tracked compaction job records (queued, running, and terminal).

```rust
pub fn latest_compaction_job_id(handle: &StorageHandle) -> Option<u64>
```
Returns the job ID of the most recently submitted compaction job, or `None` if no job has been submitted.

---

## Metrics

```rust
pub fn report_metrics(handle: &StorageHandle) -> AmpReport
```
Returns write amplification, read amplification, and space amplification ratios computed from `handle.metrics`.

```rust
pub struct AmpMetrics {
    pub logical_bytes_written: u64,
    pub wal_bytes_written: u64,
    pub sstable_bytes_written: u64,
    pub sstable_bytes_read: u64,
    pub logical_bytes_read: u64,
}

pub struct AmpReport {
    pub write_amp: Option<f64>,   // (wal_written + sstable_written) / logical_written
    pub read_amp: Option<f64>,    // sstable_read / logical_read
    pub space_amp: Option<f64>,   // sstable_written / logical_written
}
```
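
The ratios in the comments above can be worked through with a standalone sketch. The structs below are local copies so the example compiles on its own, and `report_sketch` is a hypothetical name; returning `None` when a denominator is zero is an assumption about why the fields are `Option<f64>`.

```rust
/// Local copy of the documented counters, for illustration only.
struct AmpMetrics {
    logical_bytes_written: u64,
    wal_bytes_written: u64,
    sstable_bytes_written: u64,
    sstable_bytes_read: u64,
    logical_bytes_read: u64,
}

/// Local copy of the documented report shape.
struct AmpReport {
    write_amp: Option<f64>,
    read_amp: Option<f64>,
    space_amp: Option<f64>,
}

/// Assumed rule: a ratio is undefined (None) when its denominator is zero.
fn ratio(numerator: u64, denominator: u64) -> Option<f64> {
    if denominator == 0 {
        None
    } else {
        Some(numerator as f64 / denominator as f64)
    }
}

fn report_sketch(m: &AmpMetrics) -> AmpReport {
    AmpReport {
        write_amp: ratio(
            m.wal_bytes_written + m.sstable_bytes_written,
            m.logical_bytes_written,
        ),
        read_amp: ratio(m.sstable_bytes_read, m.logical_bytes_read),
        space_amp: ratio(m.sstable_bytes_written, m.logical_bytes_written),
    }
}
```

For instance, 100 logical bytes written with 100 WAL bytes and 150 SSTable bytes gives a write amplification of (100 + 150) / 100 = 2.5 and a space amplification of 1.5.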

---

## Shard Routing (re-exports from topology)

```rust
pub type CoreId = u16;
pub type ShardId = u16;
pub type ThreadCoreRequest = topology::ThreadCoreRequest;

pub fn shard_for_node(node_id: u64, shard_count: u16) -> ShardId
pub fn request_owns_node(request: &ThreadCoreRequest, node_id: u64) -> bool
```

---

## Types

```rust
pub struct LogicalNode {
    pub node_id: u64,
    pub full: Option<sstable::Entry>,   // Latest FullNode entry, if any
    pub deltas: Vec<sstable::Entry>,    // Accumulated EdgeDelta entries
}

impl LogicalNode {
    pub fn adjacency(&self) -> Vec<u64>  // Decoded adjacency list from full entry
}

pub struct NodeRowSummary {
    pub has_full: bool,
    pub delta_count: usize,
    pub adjacency_degree: usize,
}
```

---

## Errors

```rust
pub enum StorageError {
    Io(std::io::Error),
    InvalidInput(String),
    CorruptData(String),
    Sstable(String),
}

pub type Result<T> = std::result::Result<T, StorageError>;
```