Atomic Cuckoo Filter
A high-performance, lock-free concurrent cuckoo filter implementation in Rust for efficient set membership testing.
Overview
This crate provides a sophisticated implementation of a cuckoo filter - a probabilistic data structure for fast set membership testing. Unlike traditional implementations, this version uses lock-free atomic operations and is designed for high-concurrency environments.
Key Features
✨ Lock-Free Concurrency: All operations use atomic compare-exchange loops instead of traditional locks
🚀 High Performance: Optimized for multi-threaded environments with minimal blocking
🔍 No False Negatives: Items that were inserted are guaranteed to be found
🎯 Controllable False Positives: Configurable fingerprint size to tune accuracy
📦 Space Efficient: ~20-30% less memory usage than Bloom filters for the same false positive rate
🗑️ Deletion Support: Unlike Bloom filters, items can be safely removed
⏱️ Bounded Lookup Time: Always at most 2 bucket checks maximum
🔧 Highly Configurable: Customizable capacity, fingerprint size, bucket size, and eviction limits
Quick Start
Add this to your Cargo.toml:
[]
= "0.1"
Basic Usage
use CuckooFilter;
// Create a filter with default settings
let filter = new;
// Insert items
filter.insert.unwrap;
filter.insert.unwrap;
filter.insert.unwrap;
// Check membership
assert!;
assert!;
assert!;
// Remove items
assert!;
assert!;
// Count occurrences (not meant to be used as a counting filter, but to detect duplicates or hash collisions)
filter.insert.unwrap;
filter.insert.unwrap;
assert_eq!;
println!;
// Unique Insertions (Atomically check and insert items)
// Returns Ok(true) if inserted, Ok(false) if already present
match filter.insert_unique
Custom Configuration
use CuckooFilter;
let filter = builder
.capacity // Target capacity
.fingerprint_size // Bits per fingerprint (4, 8, 16, or 32)
.bucket_size // Fingerprints per bucket
.max_evictions // Maximum eviction chain length
.build
.unwrap;
Custom Hash Functions
use AHasher;
let filter = default
.capacity
.build
.unwrap;
Concurrent Usage
The filter is designed for high-concurrency scenarios:
use CuckooFilter;
use Arc;
use thread;
let filter = new;
// Spawn multiple threads for concurrent operations
let mut handles = vec!;
// Writer threads
for i in 0..4
// Reader threads
for i in 0..4
// Wait for all threads to complete
for handle in handles
println!;
Configuration Options
| Parameter | Description | Valid Values | Default |
|---|---|---|---|
capacity |
Target number of items | Any positive integer | 1,048,576 |
fingerprint_size |
Bits per fingerprint | 4, 8, 16, or 32 | 16 |
bucket_size |
Fingerprints per bucket | Any positive integer | 4 |
max_evictions |
Max eviction chain length | Any integer ≥ 0 | 500 |
Choosing Parameters
Fingerprint Size: Larger fingerprints = fewer false positives but more memory usage
Bucket Size: Larger buckets = Faster inserts (fewer evictions), but slower lookups, and slightly higher FPR
Max Evictions:
- 0 = No evictions (faster but may fail to insert occasionally)
- Higher values = Better space utilization but slower inserts when load factor is high
Concurrency Model
All operations use atomic compare-exchange loops instead of traditional locks, with optimistic concurrency control for read operations. The only exception is when inserting with evictions, where an atomic-based lock is used to ensure consistency.
Error Handling
The main error type is Error::NotEnoughSpace, returned when the filter cannot accommodate more items:
use ;
let small_filter = builder
.capacity
.max_evictions // Disable evictions
.build
.unwrap;
// Fill the filter
for i in 0..20
Testing
Run the test suite:
# Unit tests
# Benchmarks
Safety and Guarantees
- Thread Safety: All operations are thread-safe and can be called concurrently
- Memory Safety: No unsafe code in the public API (uses
parking_lot_coreinternally)
License
This project is licensed under the MIT License - see the LICENSE file for details.