sombra 0.3.3

High-performance graph database with ACID transactions, single-file storage, and bindings for Rust, TypeScript, and Python
# Operations Guide

This guide covers operational aspects of running Sombra, including monitoring, maintenance, and troubleshooting.

## Monitoring

### Health Checks

Sombra provides built-in health monitoring:

```rust
use sombra::GraphDB;

let db = GraphDB::open("production.db")?;
let health = db.health_check()?;

println!("Health status: {:?}", health.status);

for check in &health.checks {
    println!("Check: {:?}", check);
}
```

The health check evaluates:
- Cache hit rate
- WAL size
- Corruption errors
- Time since last checkpoint
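
As a rough illustration of how the four signals above could be folded into a single status, here is a minimal sketch. The `HealthInputs` and `Status` types and every threshold are assumptions made for this example, not Sombra's actual types or defaults; only the 90% cache hit rate figure echoes the troubleshooting advice later in this guide.

```rust
/// Hypothetical snapshot of the signals a health check looks at.
/// These names are illustrative and not part of Sombra's API.
#[derive(Debug, Clone, Copy)]
pub struct HealthInputs {
    pub cache_hit_rate: f64, // fraction in [0.0, 1.0]
    pub wal_size_bytes: u64,
    pub corruption_errors: u64,
    pub secs_since_checkpoint: u64,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Healthy,
    Degraded,
    Unhealthy,
}

// Assumed thresholds, chosen only for illustration.
const MIN_HIT_RATE: f64 = 0.90;
const MAX_WAL_BYTES: u64 = 100 * 1024 * 1024;
const MAX_CHECKPOINT_AGE_SECS: u64 = 300;

/// Any corruption error is treated as fatal; every other breached
/// threshold only downgrades the status to `Degraded`.
pub fn evaluate(inputs: HealthInputs) -> Status {
    if inputs.corruption_errors > 0 {
        return Status::Unhealthy;
    }
    let warnings = [
        inputs.cache_hit_rate < MIN_HIT_RATE,
        inputs.wal_size_bytes > MAX_WAL_BYTES,
        inputs.secs_since_checkpoint > MAX_CHECKPOINT_AGE_SECS,
    ];
    if warnings.iter().any(|&w| w) {
        Status::Degraded
    } else {
        Status::Healthy
    }
}
```

Treating corruption as the only hard failure is a design choice of this sketch; an alerting setup may prefer to page on a stale checkpoint as well.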

### Performance Metrics

Access performance metrics:

```rust
use sombra::GraphDB;

let db = GraphDB::open("production.db")?;
let metrics = db.metrics.lock().unwrap();

println!("Cache hit rate: {:.2}%", metrics.cache_hit_rate() * 100.0);
println!("Transactions committed: {}", metrics.transactions_committed);
println!("Node lookups: {}", metrics.node_lookups);
println!("Edge traversals: {}", metrics.edge_traversals);

if let Some(p99) = metrics.p99_commit_latency() {
    println!("P99 commit latency: {}ms", p99);
}
```

Available metrics include:
- **Cache metrics**: hits, misses, hit rate
- **Index metrics**: label index queries, property index hits/misses
- **Transaction metrics**: commits, rollbacks, latencies (P50/P95/P99)
- **WAL metrics**: syncs, bytes written
- **Checkpoint metrics**: checkpoints performed
- **Compaction metrics**: compactions, pages compacted, bytes reclaimed
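
To make the hit rate and latency percentiles concrete, the sketch below shows one common way to derive them from raw counters and samples. It uses the nearest-rank percentile definition, which is an assumption; the guide does not say which method Sombra uses internally, and the function names here are hypothetical.

```rust
/// Hit rate as a fraction in [0.0, 1.0]; returns 0.0 before any lookups.
pub fn cache_hit_rate(hits: u64, misses: u64) -> f64 {
    let total = hits + misses;
    if total == 0 {
        0.0
    } else {
        hits as f64 / total as f64
    }
}

/// Nearest-rank percentile over latency samples (e.g. milliseconds).
/// `pct` is in [0, 100]; returns `None` for empty input or an invalid `pct`.
pub fn percentile(samples: &[u64], pct: f64) -> Option<u64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&pct) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    // Nearest rank: ceil(pct * n / 100), clamped to at least 1.
    let rank = (pct * sorted.len() as f64 / 100.0).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

Returning `Option` mirrors the shape of `p99_commit_latency()` above, which also yields nothing until samples exist.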

### Metrics Export

Export metrics to monitoring systems:

```rust
let metrics = db.metrics.lock().unwrap();

println!("{}", metrics.to_prometheus_format());

let json = metrics.to_json()?;
println!("{}", json);

let statsd_metrics = metrics.to_statsd("sombra");
for metric in statsd_metrics {
    println!("{}", metric);
}
```
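
For readers unfamiliar with the wire formats, the following sketch builds one sample line in the Prometheus text exposition format and in the StatsD line format. The metric names and helper functions are hypothetical; the exact names and labels that Sombra emits are not shown in this guide.

```rust
/// Formats a counter in the Prometheus text exposition format:
/// a HELP line, a TYPE line, then `name value`.
pub fn prometheus_counter(name: &str, help: &str, value: u64) -> String {
    format!(
        "# HELP {n} {h}\n# TYPE {n} counter\n{n} {v}\n",
        n = name,
        h = help,
        v = value
    )
}

/// Formats a StatsD counter line: `prefix.name:value|c`.
pub fn statsd_counter(prefix: &str, name: &str, value: u64) -> String {
    format!("{}.{}:{}|c", prefix, name, value)
}

/// Formats a StatsD gauge line: `prefix.name:value|g`.
pub fn statsd_gauge(prefix: &str, name: &str, value: f64) -> String {
    format!("{}.{}:{}|g", prefix, name, value)
}
```

The `prefix` argument plays the same role as the `"sombra"` string passed to `to_statsd` above.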

### Structured Logging

Configure logging for monitoring:

```rust
use sombra::logging;

logging::init_logging("info")?;
```

## Backup and Restore

Sombra uses a write-ahead log (WAL) for durability. To back up a database:

1. Stop all write operations or use a read-only connection
2. Call `checkpoint()` to flush the WAL to the main database file
3. Copy the `.db` and `.db-wal` files

```bash
# Simple backup script: run after checkpoint(), with writes paused
set -euo pipefail
DATE=$(date +%Y%m%d_%H%M%S)
mkdir -p backups
cp production.db "backups/backup_$DATE.db"
cp production.db-wal "backups/backup_$DATE.db-wal"
```

WAL recovery happens automatically on `GraphDB::open()` if the database was not cleanly shut down.
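
The copy step can also be done from Rust using only the standard library. This is a minimal sketch, not a Sombra API: `backup_files` and its `label` parameter are hypothetical, and it assumes the WAL lives next to the main file with a `-wal` suffix, as in the script above. It should run only after `checkpoint()` has completed and while writes are paused.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Copies `db_path` and, if present, `<db_path>-wal` into `backup_dir`,
/// prefixing each copy with `label` (for example a timestamp).
/// Returns the paths that were written.
pub fn backup_files(db_path: &Path, backup_dir: &Path, label: &str) -> io::Result<Vec<PathBuf>> {
    fs::create_dir_all(backup_dir)?;
    let file_name = db_path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "db path has no file name"))?
        .to_string_lossy()
        .into_owned();

    let mut written = Vec::new();
    let dest = backup_dir.join(format!("{}_{}", label, file_name));
    fs::copy(db_path, &dest)?;
    written.push(dest);

    // Assumption: the WAL file is the database path with "-wal" appended.
    let mut wal_os = db_path.as_os_str().to_owned();
    wal_os.push("-wal");
    let wal_path = PathBuf::from(wal_os);
    if wal_path.exists() {
        let wal_dest = backup_dir.join(format!("{}_{}-wal", label, file_name));
        fs::copy(&wal_path, &wal_dest)?;
        written.push(wal_dest);
    }
    Ok(written)
}
```

Skipping a missing WAL file instead of failing is a choice made for this sketch; a stricter backup job may prefer to treat it as an error.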

## Database Maintenance

### Checkpoint Management

Manually trigger WAL checkpoints:

```rust
let mut db = GraphDB::open("production.db")?;

db.checkpoint()?;
```

Checkpoints flush WAL entries to the main database file. They are triggered automatically based on configuration settings:
- `checkpoint_threshold`: Number of WAL frames before auto-checkpoint (default: 1000)
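
The threshold behaviour can be pictured as a frame counter that resets on each checkpoint. The sketch below is a simplified model, not Sombra's implementation: `WalTracker` is a hypothetical type, and whether the real trigger fires at exactly the threshold or just past it is an assumption.

```rust
/// Simplified model of threshold-based auto-checkpointing.
pub struct WalTracker {
    frames_since_checkpoint: u64,
    checkpoint_threshold: u64,
    checkpoints_performed: u64,
}

impl WalTracker {
    pub fn new(checkpoint_threshold: u64) -> Self {
        WalTracker {
            frames_since_checkpoint: 0,
            checkpoint_threshold,
            checkpoints_performed: 0,
        }
    }

    /// Records `n` newly appended WAL frames.
    /// Returns `true` if this pushed the counter to the threshold
    /// and an automatic checkpoint was performed.
    pub fn record_frames(&mut self, n: u64) -> bool {
        self.frames_since_checkpoint += n;
        if self.frames_since_checkpoint >= self.checkpoint_threshold {
            self.checkpoint();
            true
        } else {
            false
        }
    }

    /// A manual or automatic checkpoint resets the pending frame count.
    pub fn checkpoint(&mut self) {
        self.frames_since_checkpoint = 0;
        self.checkpoints_performed += 1;
    }

    pub fn pending_frames(&self) -> u64 {
        self.frames_since_checkpoint
    }

    pub fn checkpoints_performed(&self) -> u64 {
        self.checkpoints_performed
    }
}
```

With the default of 1000, a lower threshold keeps the WAL smaller at the cost of more frequent flushes to the main file.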

### Database Integrity

Verify database integrity:

```rust
use sombra::{GraphDB, IntegrityOptions};

let db = GraphDB::open("production.db")?;

let options = IntegrityOptions::default();
let report = db.verify_integrity(&options)?;

println!("Checked {} pages", report.checked_pages);
println!("Checksum failures: {}", report.checksum_failures);
println!("Record errors: {}", report.record_errors);
println!("Index errors: {}", report.index_errors);

for error in &report.errors {
    println!("Error: {}", error);
}
```

Integrity checking options:
- `checksum_only`: Only verify page checksums, skip record validation
- `max_errors`: Maximum errors to collect before stopping (default: 16)
- `verify_indexes`: Verify that indexes match actual data (default: true)
- `verify_adjacency`: Verify edge references point to valid nodes (default: true)
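
To show how `checksum_only` style verification and `max_errors` interact, here is a small self-contained model. The `Page` and `Report` types are hypothetical, and FNV-1a stands in for whatever checksum Sombra actually stores per page, which this guide does not specify.

```rust
/// Hypothetical page: payload plus the checksum recorded at write time.
pub struct Page {
    pub data: Vec<u8>,
    pub stored_checksum: u32,
}

/// Hypothetical report, with field names echoing the example above.
pub struct Report {
    pub checked_pages: usize,
    pub checksum_failures: usize,
    pub errors: Vec<String>,
}

/// 32-bit FNV-1a, used here only as a stand-in checksum.
pub fn fnv1a(data: &[u8]) -> u32 {
    let mut hash: u32 = 0x811c_9dc5;
    for &byte in data {
        hash ^= byte as u32;
        hash = hash.wrapping_mul(0x0100_0193);
    }
    hash
}

/// Recomputes each page checksum and stops scanning once
/// `max_errors` mismatches have been collected.
pub fn verify_pages(pages: &[Page], max_errors: usize) -> Report {
    let mut report = Report {
        checked_pages: 0,
        checksum_failures: 0,
        errors: Vec::new(),
    };
    for (idx, page) in pages.iter().enumerate() {
        report.checked_pages += 1;
        if fnv1a(&page.data) != page.stored_checksum {
            report.checksum_failures += 1;
            report.errors.push(format!("page {}: checksum mismatch", idx));
            if report.errors.len() >= max_errors {
                break;
            }
        }
    }
    report
}
```

Because the scan stops early, `checked_pages` can be smaller than the page count when the error limit is hit, so a low `max_errors` gives a faster but incomplete picture.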

### Configuration Tuning

See the [Configuration Guide](configuration.md) for tuning:
- Cache size
- WAL sync mode
- Checkpoint threshold
- Memory-mapped I/O
- Compaction settings

## Troubleshooting

### High Memory Usage

**Symptoms:**
- Process using more memory than expected
- OOM errors

**Diagnosis:**
```rust
let metrics = db.metrics.lock().unwrap();
println!("Cache hit rate: {:.2}%", metrics.cache_hit_rate() * 100.0);
println!("Page evictions: {}", metrics.page_evictions);
```

**Solutions:**
- Reduce `page_cache_size` in config
- Ensure checkpoint is running regularly

### Slow Performance

**Symptoms:**
- High query latency
- Low throughput

**Diagnosis:**
```rust
let metrics = db.metrics.lock().unwrap();
println!("Cache hit rate: {:.2}%", metrics.cache_hit_rate() * 100.0);

if let Some(p99) = metrics.p99_commit_latency() {
    println!("P99 commit latency: {}ms", p99);
}
```

**Solutions:**
- Increase `page_cache_size` if cache hit rate is low (< 90%)
- Use appropriate WAL `sync_mode` for your durability requirements
- Enable `use_mmap` for read-heavy workloads
- Checkpoint regularly to prevent large WAL files

### Database Corruption

**Symptoms:**
- Corruption errors in logs
- Crashes on read/write
- Checksum failures

**Diagnosis:**
```rust
use sombra::{GraphDB, IntegrityOptions};

let db = GraphDB::open("production.db")?;
let report = db.verify_integrity(&IntegrityOptions::default())?;

if report.checksum_failures > 0 || report.record_errors > 0 {
    println!("Corruption detected:");
    for error in &report.errors {
        println!("  {}", error);
    }
}
```

**Solutions:**
1. Restore from backup if available
2. WAL recovery may fix some issues automatically on restart
3. Check hardware (disk errors, memory issues)
4. Review logs for patterns before corruption occurred

## Range Queries and Ordered Access

### Node Range Queries

Sombra provides efficient range queries using the BTreeMap-based node index:

```rust
use sombra::GraphDB;

let db = GraphDB::open("production.db")?;

let node_ids = db.get_nodes_in_range(100, 200);
println!("Found {} nodes between IDs 100 and 200", node_ids.len());

let node_ids = db.get_nodes_from(1000);
println!("Found {} nodes with ID >= 1000", node_ids.len());

let node_ids = db.get_nodes_to(500);
println!("Found {} nodes with ID <= 500", node_ids.len());
```

### Ordered Node Access

Access nodes in sorted order by their IDs:

```rust
if let Some(first_id) = db.get_first_node() {
    let node = db.get_node(first_id)?;
    println!("First node: {:?}", node);
}

if let Some(last_id) = db.get_last_node() {
    let node = db.get_node(last_id)?;
    println!("Last node: {:?}", node);
}

let first_100 = db.get_first_n_nodes(100);
println!("First 100 node IDs: {:?}", first_100);

let last_100 = db.get_last_n_nodes(100);
println!("Last 100 node IDs: {:?}", last_100);

let all_ids = db.get_all_node_ids_ordered();
println!("Total nodes: {}", all_ids.len());
```

### Use Cases for Range Queries

**Pagination:**
```rust
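let page_size = 100;
let page_number = 5;

let all_ids = db.get_all_node_ids_ordered();
// Clamp both bounds so a page past the end yields an empty slice instead of panicking.
let start = (page_number * page_size).min(all_ids.len());
let end = (start + page_size).min(all_ids.len());
let page_ids = &all_ids[start..end];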

for &node_id in page_ids {
    let node = db.get_node(node_id)?;
    println!("{:?}", node);
}
```

**Timeline Views:**
```rust
let recent_ids = db.get_last_n_nodes(50);
for &node_id in &recent_ids {
    let node = db.get_node(node_id)?;
    println!("Recent: {:?}", node);
}
```

**Batch Processing:**
```rust
let chunk_size = 1000;
let all_ids = db.get_all_node_ids_ordered();

for chunk in all_ids.chunks(chunk_size) {
    for &node_id in chunk {
        let node = db.get_node(node_id)?;
        // Process `node` here.
    }

    // Checkpoint between chunks so the WAL does not keep growing.
    db.checkpoint()?;
}
```

### Range Queries in Transactions

Range queries work in transactions:

```rust
use sombra::PropertyValue;

let mut tx = db.begin_transaction()?;

let node_ids = tx.get_nodes_in_range(100, 200);

for &node_id in &node_ids {
    tx.set_node_property(
        node_id,
        "processed".to_string(),
        PropertyValue::Bool(true)
    )?;
}

tx.commit()?;
```

### Performance Characteristics

Range queries leverage the BTreeMap index for optimal performance:

- **Point lookup**: O(log n) - ~440ns for 10K nodes
- **Range scan**: O(log n + k) - where k is result size
- **Full iteration**: O(n) - ~2.6ns per node
- **First/Last N**: O(log n + k) - < 1µs for N=100
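
The complexity figures above follow directly from how an ordered map behaves. The sketch below reproduces the range and first/last-N patterns with `std::collections::BTreeMap`; it is a standalone model, not Sombra's index. The `u64` value standing in for a page pointer, the inclusive range bounds, and the ascending order of `last_n` are assumptions.

```rust
use std::collections::BTreeMap;

pub type NodeId = u64;

/// IDs in `[start, end]`, ascending. Costs O(log n + k) for k results.
pub fn nodes_in_range(index: &BTreeMap<NodeId, u64>, start: NodeId, end: NodeId) -> Vec<NodeId> {
    // `BTreeMap::range` panics when start > end, so guard explicitly.
    if start > end {
        return Vec::new();
    }
    index.range(start..=end).map(|(&id, _)| id).collect()
}

/// The `n` smallest IDs, ascending.
pub fn first_n(index: &BTreeMap<NodeId, u64>, n: usize) -> Vec<NodeId> {
    index.keys().take(n).copied().collect()
}

/// The `n` largest IDs, returned in ascending order.
pub fn last_n(index: &BTreeMap<NodeId, u64>, n: usize) -> Vec<NodeId> {
    let mut ids: Vec<NodeId> = index.keys().rev().take(n).copied().collect();
    ids.reverse();
    ids
}
```

None of these helpers materialise the full key set, which is why a range scan is cheaper than fetching every ID and slicing afterwards.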

## Property Updates

### Updating Node Properties

Modify node properties using `set_node_property`:

```rust
use sombra::{GraphDB, PropertyValue};

let mut db = GraphDB::open("production.db")?;

db.set_node_property(
    node_id,
    "status".to_string(),
    PropertyValue::String("active".to_string())
)?;

db.set_node_property(node_id, "count".to_string(), PropertyValue::Int(42))?;
db.set_node_property(node_id, "verified".to_string(), PropertyValue::Bool(true))?;
```

### Removing Node Properties

Remove properties from nodes:

```rust
db.remove_node_property(node_id, "temporary_flag")?;
```

### Property Updates in Transactions

Property updates can also be performed within a transaction:

```rust
let mut tx = db.begin_transaction()?;

tx.set_node_property(node_id, "counter".to_string(), PropertyValue::Int(42))?;
tx.remove_node_property(node_id, "old_field")?;

tx.commit()?;
```

### Performance Characteristics

Property updates use **update-in-place** optimization when possible:
- **In-place update**: When the new record fits in existing space, only one page write occurs
- **Fallback to reinsert**: When the record grows, the system falls back to delete+reinsert
- **Automatic index updates**: Property indexes are updated atomically with the property change
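
The in-place versus reinsert decision can be modelled as a size check against the space a record already occupies. This is an illustrative sketch only: `Slot`, `UpdatePlan`, and `update_record` are hypothetical names, and Sombra's real page layout and free-space handling are not described in this guide.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum UpdatePlan {
    InPlace,
    Reinsert,
}

/// Hypothetical record slot with a fixed amount of reserved space.
pub struct Slot {
    pub capacity: usize,
    pub data: Vec<u8>,
}

/// Writes `new_data` for the record at `idx` and returns the slot that
/// now holds it, together with the strategy that was used.
/// Panics if `idx` is out of bounds.
pub fn update_record(slots: &mut Vec<Slot>, idx: usize, new_data: &[u8]) -> (usize, UpdatePlan) {
    if new_data.len() <= slots[idx].capacity {
        // Fits in the existing space: overwrite, touching a single slot.
        slots[idx].data = new_data.to_vec();
        (idx, UpdatePlan::InPlace)
    } else {
        // Record grew: clear the old slot and append a new, larger one.
        slots[idx].data.clear();
        slots.push(Slot {
            capacity: new_data.len(),
            data: new_data.to_vec(),
        });
        (slots.len() - 1, UpdatePlan::Reinsert)
    }
}
```

In this model a reinsert changes the record's location, which is one reason growing updates cost more than same-size ones.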

## Monitoring Integration

### Prometheus Metrics Exporter

A minimal exporter loop that prints metrics in Prometheus format to stdout every 60 seconds:

```rust
use sombra::GraphDB;
use std::time::Duration;
use std::thread;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let db = GraphDB::open("production.db")?;
    
    loop {
        let metrics = db.metrics.lock().unwrap();
        println!("{}", metrics.to_prometheus_format());
        
        drop(metrics);
        thread::sleep(Duration::from_secs(60));
    }
}
```

### JSON Metrics API

For custom monitoring dashboards:

```rust
use sombra::GraphDB;
use std::fs::File;
use std::io::Write;

let db = GraphDB::open("production.db")?;
let metrics = db.metrics.lock().unwrap();

let json = metrics.to_json()?;
let mut file = File::create("metrics.json")?;
file.write_all(json.as_bytes())?;
```

## Next Steps

- Read the [Configuration Guide](configuration.md) for performance tuning
- Check the [Getting Started Guide](getting-started.md) for basic usage
- Review the [examples](../examples/) for operational patterns