โก Kira KV Engine
KV-storage engine based on Minimal perfect hash functions with hybrid indexing (+PGM Index) for Rust.
Zero collisions. PureO(1)performance.
๐ง Algorithm
Built on the BDZ algorithm using 3-hypergraph peeling:
- Maps exactly
nkeys tonunique indices - Uses hypergraph construction with 3 vertices per key
- Peels vertices of degree 1 iteratively until fully resolved
- Falls back to rehashing with different salts if cycles occur
๐ Research: Simple and Space-Efficient Minimal Perfect Hash Functions (Botelho, Pagh, Ziviani, 2007)
For numeric data, we layer in PGM (Piecewise Geometric Model) indexing:
- Learns linear segments to approximate key distributions
- Provides
O(log log n)worst-case withO(1)average lookups - Excellent for sorted integer sequences with predictable patterns
๐ Research: The PGM-index: a fully-dynamic compressed learned index (Ferragina, Vinciguerra, 2020)
๐ Quick Start
use Builder;
๐ฏ Hybrid Indexing
For mixed workloads, use the hybrid index that automatically partitions keys:
use HybridBuilder;
๐ Performance Comparison *
Performance benchmarks on 1M keys (Intel Core i7-12700, Windows 11):
| Solution | Build Time | Build Rate | Lookup Time | Lookup Rate | Memory | Use Case |
|---|---|---|---|---|---|---|
| Kira (MPH+PGM) | 395ms โก | 2.5M/s โก | 26ns โก | 37.9M/s โก | 4.2MB โก | Mixed data |
| Kira (MPH) | 977ms | 1.0M/s | 47ns โก | 21.4M/s โก | 5.0MB โก | String keys |
| Redis Hash | 2,100ms | 0.48M/s | 180ns | 5.6M/s | 90MB | General KV |
| LevelDB | 3,400ms | 0.29M/s | 250ns | 4.0M/s | 120MB | Persistent |
| RocksDB | 2,800ms | 0.36M/s | 200ns | 5.0M/s | 110MB | LSM-tree |
| std::HashMap | 850ms | 1.2M/s | 35ns | 28.6M/s | 96MB | In-memory |
| DashMap | 920ms | 1.1M/s | 42ns | 23.8M/s | 102MB | Concurrent |
| BTreeMap | 1,200ms | 0.83M/s | 65ns | 15.4M/s | 72MB | Sorted |
* our benchmarks
๐ Performance Highlights *
Build Speed:
- Kira Hybrid: 5.3ร faster than Redis, 8.6ร faster than LevelDB
- Kira MPH: Competitive with HashMap, 2.1ร faster than Redis
Lookup Speed:
- Kira Hybrid: 6.9ร faster than Redis, 9.6ร faster than LevelDB
- Kira MPH: 3.8ร faster than Redis, 5.3ร faster than LevelDB
Memory Efficiency:
- Kira: 18-50ร smaller than traditional databases
- Kira: 19-24ร smaller than HashMap/concurrent structures
- Zero fragmentation - exact memory requirements known
* our benchmarks
โก Speed Comparison Matrix *
| Metric | vs Redis | vs LevelDB | vs RocksDB | vs HashMap | vs BTreeMap |
|---|---|---|---|---|---|
| Build | ๐ฅ 5.3ร | ๐ฅ 8.6ร | ๐ฅ 7.1ร | ๐ฅ 2.2ร | ๐ฅ 3.0ร |
| Lookup | ๐ฅ 6.9ร | ๐ฅ 9.6ร | ๐ฅ 7.7ร | โก 1.3ร | ๐ฅ 2.5ร |
| Memory | ๐ฅ 21ร | ๐ฅ 29ร | ๐ฅ 26ร | ๐ฅ 23ร | ๐ฅ 17ร |
* our benchmarks
๐งช Benchmarks
Run comprehensive benchmarks across algorithms:
Expected output on modern hardware:
=== kira_kv_engine :: Hybrid MPH+PGM Benchmark ===
๐ฅ TRADITIONAL MPH BENCHMARK
build: 0.977 s (1.0 M keys/s)
lookup: 0.047 s (21.4 M lookups/s)
Memory: 5.0 bytes/key, 24x compression vs HashMap
๐ HYBRID MPH+PGM BENCHMARK
build: 0.395 s (2.5 M keys/s)
lookup: 0.026 s (37.9 M lookups/s)
๐ PGM-ONLY BENCHMARK
build: 0.016 s (62.7 M keys/s)
lookup: 0.022 s (45.5 M lookups/s)
range: 0.781 s (1,281 queries/s)
โ๏ธ Features
Enable optional optimizations:
[]
= { = "0.1", = ["serde", "parallel", "pgm", "simd"] }
serdeโ Serialization support for persistenceparallelโ Multi-threaded construction via rayonpgmโ Learned indexing for numeric keyssimdโ Vectorized operations where available
๐๏ธ Configuration
Fine-tune for your workload:
use ;
/// MPH Only (for strings as a key)
let mph = new
.with_config
.build?;
๐ When to Use
Perfect for:
- Fixed datasets (config files, catalogs, dictionaries)
- Memory-constrained environments
- Predictable
O(1)lookup requirements - Cold start optimization (mmap friendly)
- High-performance lookups in hot paths
Not suitable for:
- Frequently changing key sets
- Unknown dataset sizes
- Write-heavy workloads
๐ Performance Tips
- Use
gamma โ 1.25for datasets > 10M keys - Enable
parallelfeature for build-time speedup - Serialize indices to disk for instant cold starts
- Consider hybrid indexing for mixed numeric/string data
๐ฏ Real-World Use Cases *
Web Servers:
- Route matching: 6.9ร faster than Redis lookups
- Session lookup: 37.9M operations/sec vs Redis 5.6M
Databases:
- Index structures: 29ร less memory than LevelDB
- Cache lookups: Zero allocation after build
Gaming:
- Asset ID lookup: 21.4M lookups/sec
- Player data: Instant cold start from disk
* provided with our performance benchmarks
๐ References
- BDZ Algorithm: Botelho, D., Pagh, R., Ziviani, N. (2007). Simple and Space-Efficient Minimal Perfect Hash Functions. WADS 2007
- PGM Index: Ferragina, P., Vinciguerra, G. (2020). The PGM-index: a fully-dynamic compressed learned index. VLDB 2020
- Learned Indexes: Kraska, T. et al. (2018). The Case for Learned Index Structures. SIGMOD 2018
- 3-Hypergraph Peeling: Peeling Random Planar Maps - Theoretical foundations