What is the Aleph Filter?
The Aleph Filter is a probabilistic data structure for approximate membership queries, introduced in the VLDB 2024 paper by Dayan, Bercea, and Pagh. It answers a deceptively simple question:
"Is this element in the set?"
Like Bloom and Cuckoo filters, it may produce false positives but never false negatives. Unlike them, it can expand indefinitely without rebuilding, sacrificing one fingerprint bit per expansion to double its capacity, all while maintaining O(1) operations.
Why does this matter?
| Problem | Traditional Filters | Aleph Filter |
|---|---|---|
| Dataset grows beyond capacity | Rebuild from scratch | Expands automatically |
| Need to delete items | ❌ (Bloom) or limited (Cuckoo) | ✅ Full support |
| FPR degrades as load increases | Yes | Stable by design |
| Query cost grows with expansions | O(log N) (InfiniFilter) | O(1) always |
Quick Start
use AlephFilter;
// Create a filter for ~1000 items with 1% false positive rate
let mut filter = new;
// Insert
filter.insert;
filter.insert;
// Query
assert!; // true — definitely in the set
assert!; // false — definitely NOT in the set
// Delete
filter.delete;
assert!; // gone
// The filter expands automatically as you insert more items.
// No need to pre-size or rebuild!
for i in 0..100_000
Performance
Benchmarked on Apple M2 (ARM64), Rust 1.93.0, --release mode. All filters targeting 1% FPR.
Throughput (ns/op, lower is better)
INSERT (N = 10,000)
Aleph ██████████████████████████████████████████ 71.9 ns ← fastest dynamic
Bloom █████████████████████████████████████████████████████████████████████████████████████████████████████ 193.7 ns
Cuckoo ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ 303.7 ns
NEGATIVE LOOKUP (N = 10,000)
Aleph ██████████████████████████ 44.8 ns ← fastest dynamic
Bloom ████████████████████████████████████████████████████████████████████████████████████████ 150.4 ns
Cuckoo ██████████████████████████████████████████████████████████████████████████████████████████████████████ 170.6 ns
DELETE (N = 10,000)
Aleph ██████████████████████████████████████████████████████ 101.6 ns ← 3× faster
Cuckoo █████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████ 287.6 ns
Bloom — (not supported)
False Positive Rate (target = 1.00%)
| N | Aleph | Bloom | Cuckoo | Xor8 |
|---|---|---|---|---|
| 1K | 0.39% ✅ | 1.04% | 2.98% ⚠️ | 0.42% |
| 10K | 0.43% ✅ | 0.99% | 1.98% ⚠️ | 0.39% |
| 100K | 0.62% ✅ | 1.00% | 2.47% ⚠️ | 0.41% |
Full benchmark results & methodology:
docs/chapters/comparison-and-conclusion.tex
Run Benchmarks Yourself
# Quick report (~30s) — console output + JSON
# Full Criterion statistical benchmarks (~15 min) — HTML reports
How It Works
The Aleph Filter is built on a quotient filter foundation with three key innovations:
┌─────────────────────────────────────────────────────────────────────┐
│ Hash Function (xxHash3) │
│ 64-bit hash │
│ ┌──────────┬────────────────┐ │
│ │ quotient │ fingerprint │ │
│ │ (q bits) │ (r bits) │ │
│ └────┬─────┴───────┬────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────┐ ┌──────────────┐ │
│ │ slot │ │ stored in │ │
│ │ index │ │ slot data │ │
│ └─────────┘ └──────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
1. Pivot-Bit Expansion
When the filter exceeds 80% load, it doubles its slot count by:
- Incrementing quotient width by 1 bit (more address bits)
- Sacrificing 1 fingerprint bit per entry (slightly higher FPR)
- Using the sacrificed bit as a "pivot" to distribute entries to new slots
2. Void Entry Duplication
When all fingerprint bits are exhausted (after many expansions), entries become void markers. These are duplicated to both possible canonical slots during expansion, ensuring they're always findable in O(1).
3. Cluster-Aware Deletion
Deletion uses a full cluster-aware compaction algorithm that shifts entries backward to fill gaps, maintaining the quotient filter's structural invariants.
Architecture
src/
├── lib.rs # Public API & module declarations
├── hash.rs # xxHash3 → (quotient, fingerprint) splitting
├── metadata.rs # Per-slot flags: occupied, continuation, shifted
├── slot.rs # 64-bit packed slots: fingerprint + length + markers
└── aleph_filter.rs # Core filter: insert, query, delete, expand
| Module | Responsibility |
|---|---|
hash |
Single xxHash3 call → split into quotient (slot address) and remainder (fingerprint) |
slot |
Pack fingerprint + length into 64 bits, with special void/tombstone markers |
metadata |
3-bit flags per slot tracking cluster/run structure for O(1) lookups |
aleph_filter |
Quotient filter operations, run/cluster management, expansion logic |
Feature Matrix
| Feature | Bloom | Cuckoo | Xor8 | Aleph |
|---|---|---|---|---|
| Dynamic insert | ✅ | ✅ | ❌ | ✅ |
| Delete | ❌ | ✅ | ❌ | ✅ |
| Expand without rebuild | ❌ | ❌ | ❌ | ✅ |
| Stable FPR | ❌ | ❌ | ✅* | ✅ |
| O(1) lookup | O(k) | ✅ | ✅ | ✅ |
| Streaming/unbounded data | ❌ | ❌ | ❌ | ✅ |
* Xor8 is immutable — FPR is fixed at construction time
Documentation
This project includes a comprehensive LaTeX technical report covering:
- Background on probabilistic filters (Bloom, Quotient, Cuckoo)
- Quotient filter expansion and fingerprint splitting
- The Aleph Filter algorithm in detail
- Our Rust implementation design decisions
- Comparative benchmarks with analysis
# Build the PDF (requires LaTeX installation)
&&
Pre-built PDF:
docs/aleph_filter_explained.pdf
Testing
# Run all tests
# Run with output (see FPR measurements, expansion traces)
# Run a specific test
The test suite covers:
- ✅ No false negatives (critical invariant)
- ✅ FPR within target bounds
- ✅ Deletion correctness
- ✅ Expansion survival (all keys findable post-expand)
- ✅ Extreme expansion (50K inserts from 4 slots)
- ✅ Interleaved insert/delete/query
- ✅ Edge cases (empty keys, long keys, binary keys, duplicates)
References
| Resource | Link |
|---|---|
| Paper | Aleph Filter: To Infinity in Constant Time (VLDB 2024) |
| Authors | Niv Dayan, Ioana Bercea, Rasmus Pagh |
| Java Reference | github.com/nivdayan/AlephFilter |
| This Implementation | github.com/NPX2218/aleph-filter |
License
MIT © 2026 Neel Bansal