Mappy: Space-Efficient Maplet Data Structures
A Rust implementation of maplets - space-efficient data structures for approximate key-value mappings, based on the research paper "Time To Replace Your Filter: How Maplets Simplify System Design".
Overview
Maplets provide the same space benefits as filters while natively supporting key-value associations with one-sided error guarantees. Unlike traditional filters that only support set membership queries, maplets allow you to associate values with keys and retrieve them during queries.
Key Features
- Space Efficiency: Achieves
O(log 1/ε + v)bits per item where ε is the false-positive rate and v is the value size - Value Support: Native key-value associations with configurable merge operators
- One-Sided Errors: Guarantees
M[k] ≺ m[k]for application-specific ordering relations - Deletion Support: Full support for removing key-value pairs
- Merging: Combine maplets using associative/commutative operators
- Resizing: Dynamic growth with efficient rehashing
- Cache Locality: Optimized memory layout for performance
- Concurrency: Thread-safe operations with lock-free reads
- Quotient Filter: Enhanced slot finding with run detection and shifting support
Architecture
The implementation follows a multi-crate workspace structure:
- mappy-core: Core maplet data structure implementation
- mappy-client: Client library for Rust applications
- mappy-python: Python bindings via PyO3
Quick Start
use ;
// Create a maplet for counting with 1% false-positive rate
let mut maplet = new;
// Insert key-value pairs
maplet.insert.unwrap;
maplet.insert.unwrap;
// Query values (may return approximate results)
let count1 = maplet.query; // Some(5) or Some(5 + other_values)
let count2 = maplet.query; // Some(3) or Some(3 + other_values)
// Check if key exists
let exists = maplet.contains; // true
// Get statistics
let stats = maplet.stats;
println!;
println!;
Advanced Features
Quotient Filter
Enable the quotient-filter feature for enhanced slot finding capabilities with run detection, shifting support, and comprehensive benchmarking:
[]
= { = "0.1.0", = ["quotient-filter"] }
use ;
// Create a maplet with quotient filter support
let maplet = new.unwrap;
// Insert some data
maplet.insert.await.unwrap;
// Find the actual slot where a key's fingerprint is stored
// This handles runs, shifting, and all complex quotient filter logic
let slot = maplet.find_slot_for_key.await;
match slot
Quotient Filter Benefits:
- Precise Slot Finding: Locate exact storage slots considering runs and shifting
- Run Detection: Handle quotient filter runs with multiple fingerprints
- Shifting Support: Account for linear probing and slot shifting
- Debugging Support: Inspect internal storage layout for optimization
- Performance Analysis: Understand memory access patterns and cache behavior
- Comprehensive Testing: 62+ test cases covering all edge cases and scenarios
- Performance Benchmarks: Detailed benchmarks showing 10-60M operations/second
- Python Integration: Full Python bindings for quotient filter features
Comprehensive Testing & Benchmarking
The quotient filter implementation includes extensive testing and benchmarking infrastructure:
Test Coverage (62+ Tests)
- Basic Operations: Insert, query, delete with various data types
- False Positive Rate: Validation of probabilistic accuracy
- Multiset Operations: Counter and aggregation operations
- Run Detection: Advanced slot finding with run handling
- Capacity Management: Load factor and resizing behavior
- Concurrency: Thread-safe operations and race condition testing
- Edge Cases: Boundary conditions and error scenarios
- Hash Functions: AHash, TwoX, Fnv performance comparison
- Memory Usage: Space efficiency and memory optimization
- Advanced Features: Slot finding, run detection, shifting support
Performance Benchmarks
- Insert Performance: 10-17 million operations/second
- Query Performance: 16-45 million operations/second
- Delete Performance: 4-8 million operations/second
- Slot Finding: 24-61 million operations/second
- Hash Function Comparison: AHash (fastest), TwoX (medium), Fnv (slowest)
- Memory Usage: Linear scaling with efficient space utilization
- Load Factor Impact: Performance analysis across different load factors
Running Tests and Benchmarks
# Run all tests with quotient filter features
# Run comprehensive test suite
# Run specific benchmarks
# Run complete test suite with Python support
Python Integration
Full Python bindings are available for the quotient filter features:
# Create Maplet with quotient filter features
=
= # Returns slot index
# Create Engine with quotient filter features
=
=
= # Returns slot index
Python Features:
- Slot Finding:
find_slot_for_key()method for both Maplet and Engine - Error Handling: Proper Python exceptions for invalid operations
- Performance: Same high-performance as Rust implementation
- Concurrency: Thread-safe operations in Python
- Memory Management: Automatic cleanup and resource management
Use Cases
1. K-mer Counting (Computational Biology)
use ;
let mut kmer_counter = new;
// Count k-mers in DNA sequences with high accuracy
2. Network Routing Tables
use ;
let mut routing_table = new;
// Map network prefixes to sets of next-hop routers
3. LSM Storage Engine Index
use ;
let mut sstable_index = new;
// Map keys to SSTable identifiers for efficient lookups
Performance Characteristics
Benchmark Results: Mappy vs Redis
Our comprehensive benchmarks show Mappy significantly outperforms Redis for approximate key-value operations:
| Dataset Size | Operation | Redis Performance | Mappy Performance | Mappy Advantage |
|---|---|---|---|---|
| 100 items | SET/INSERT | 13.9-18.0 ms | 41.9-47.7 µs | ~300x faster |
| 1,000 items | SET/INSERT | 107-130 ms | 414-481 µs | ~250x faster |
| 10,000 items | SET/INSERT | 976-1,244 ms | 4.9-5.7 ms | ~200x faster |
Performance Highlights
- Query Throughput: 200-300x faster than Redis for insert operations
- Memory Efficiency: Space-efficient design with configurable false-positive rates
- Error Control: False-positive rate within 1.5x of configured ε
- Cache Performance: Optimized for sequential access patterns
- Concurrent Access: Thread-safe operations with stable performance under load
Running Benchmarks
# Run Redis comparison benchmarks
# Run all benchmarks
# Run specific benchmark categories
For detailed benchmark documentation, see REDIS_BENCHMARKS.md.
Error Guarantees
Maplets provide the strong maplet property:
m[k] = M[k] ⊕ (⊕ᵢ₌₁ˡ M[kᵢ])
Where Pr[ℓ ≥ L] ≤ ε^L, meaning even when wrong, the result is close to correct.
Documentation
- Technical Documentation - Comprehensive technical guide with architecture diagrams, API reference, and implementation details
- Quotient Filter - Complete guide to quotient filter features, testing, and Python integration
- Documentation Index - Comprehensive index of all documentation and resources
- Benchmark Documentation - Detailed benchmark results and performance analysis
- API Documentation - Auto-generated API documentation
- Research Foundation - Original research papers and theoretical background
License
MIT License - see LICENSE file for details.
References
Based on the research paper:
Bender, M. A., Conway, A., Farach-Colton, M., Johnson, R., & Pandey, P. (2025). Time To Replace Your Filter: How Maplets Simplify System Design. arXiv preprint arXiv:2510.05518.