rustywallet-bloom

Fast and memory-efficient Bloom Filter implementation optimized for cryptocurrency address matching in rustywallet.

Features

Memory efficient: ~1.2 bytes per item at 1% false positive rate
Fast: Uses FNV-1a hash with double hashing technique (only 2 hash computations regardless of k)
No dependencies: Pure Rust implementation with zero external dependencies
Optimized parameters: Automatically calculates optimal bits and hash functions
Thread-safe: Safe for concurrent read operations
Cryptocurrency optimized: Designed for Bitcoin and other cryptocurrency address filtering
Configurable: Adjustable false positive rates from 0.001% to 10%

Installation

Add this to your Cargo.toml:

[dependencies]
rustywallet-bloom = "0.1.0"

Quick Start

use rustywallet_bloom::BloomFilter;

// Create filter for 1 million addresses with 1% false positive rate
let mut bloom = BloomFilter::new(1_000_000, 0.01);

// Insert Bitcoin addresses
bloom.insert("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa");
bloom.insert("bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4");

// Test membership
if bloom.contains("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa") {
    println!("Address might be in the wallet!");
}

Bloom Filter Creation

Basic Creation

use rustywallet_bloom::BloomFilter;

// Create with expected items and false positive rate
let bloom = BloomFilter::new(1_000_000, 0.01); // 1M items, 1% FPR

// Create with custom parameters
let bloom = BloomFilter::with_params(1_000_000, 7, 9_585_059); // items, k, m

Recommended Configurations

// High precision (0.1% FPR) - for critical applications
let precise = BloomFilter::new(1_000_000, 0.001);

// Balanced (1% FPR) - recommended for most use cases
let balanced = BloomFilter::new(1_000_000, 0.01);

// Memory optimized (5% FPR) - for memory-constrained environments
let compact = BloomFilter::new(1_000_000, 0.05);

Membership Testing

let mut bloom = BloomFilter::new(100_000, 0.01);

// Insert addresses
bloom.insert("1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2");
bloom.insert("3J98t1WpEZ73CNmQviecrnyiWrnqRhWNLy");

// Test membership
match bloom.contains("1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2") {
    true => println!("Address MIGHT be in wallet (needs verification)"),
    false => println!("Address is DEFINITELY NOT in wallet"),
}

// Batch testing
let addresses = vec![
    "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa",
    "bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4",
];

for addr in addresses {
    if bloom.contains(addr) {
        println!("Potential match: {}", addr);
    }
}

False Positive Rates

Understanding false positives is crucial for proper Bloom filter usage:

False Positive Rate	Use Case	Memory Overhead
0.001% (0.00001)	Critical applications	~2.4x
0.01% (0.0001)	High precision needed	~2.0x
0.1% (0.001)	Balanced precision	~1.7x
1% (0.01)	Recommended	~1.2x
5% (0.05)	Memory constrained	~0.8x
10% (0.1)	Very memory constrained	~0.6x

// Calculate actual false positive rate
let bloom = BloomFilter::new(1_000_000, 0.01);
println!("Actual FPR: {:.4}%", bloom.false_positive_rate() * 100.0);

Use Cases

Address Matching in Cryptocurrency Wallets

use rustywallet_bloom::BloomFilter;

struct WalletFilter {
    bloom: BloomFilter,
    addresses: Vec<String>, // Backup for verification
}

impl WalletFilter {
    fn new(expected_addresses: usize) -> Self {
        Self {
            bloom: BloomFilter::new(expected_addresses, 0.01),
            addresses: Vec::new(),
        }
    }
    
    fn add_address(&mut self, address: &str) {
        self.bloom.insert(address);
        self.addresses.push(address.to_string());
    }
    
    fn might_own(&self, address: &str) -> bool {
        self.bloom.contains(address)
    }
    
    fn definitely_owns(&self, address: &str) -> bool {
        self.might_own(address) && self.addresses.contains(&address.to_string())
    }
}

// Usage
let mut wallet = WalletFilter::new(10_000);
wallet.add_address("1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa");

// Fast pre-filtering
if wallet.might_own("some_address") {
    // Only do expensive verification if bloom filter says "maybe"
    if wallet.definitely_owns("some_address") {
        println!("Address belongs to wallet!");
    }
}

Transaction Filtering

// Filter relevant transactions from blockchain data
let mut relevant_addresses = BloomFilter::new(50_000, 0.01);

// Add all wallet addresses
for address in wallet_addresses {
    relevant_addresses.insert(&address);
}

// Process blockchain transactions
for transaction in blockchain_transactions {
    let mut is_relevant = false;
    
    // Check outputs
    for output in &transaction.outputs {
        if relevant_addresses.contains(&output.address) {
            is_relevant = true;
            break;
        }
    }
    
    if is_relevant {
        // Process this transaction (with verification)
        process_transaction(transaction);
    }
}

Memory Usage

Items	FPR	Memory	Bits per Item
100K	1%	~120 KB	~9.6
1M	1%	~1.2 MB	~9.6
10M	1%	~12 MB	~9.6
37M	1%	~44 MB	~9.6
100M	1%	~120 MB	~9.6

// Check memory usage
let bloom = BloomFilter::new(1_000_000, 0.01);
println!("Memory usage: {} bytes", bloom.memory_usage());
println!("Memory usage: {:.2} MB", bloom.memory_usage() as f64 / 1_000_000.0);

Performance

Insert: O(k) where k = number of hash functions (typically 7 for 1% FPR)
Lookup: O(k)
Space: O(m) where m = number of bits
Hash computations: Only 2 per operation (uses double hashing)

Benchmarks

On a modern CPU (Intel i7-10700K):

Insert: ~50ns per operation
Lookup: ~45ns per operation
Throughput: ~20M operations/second

API Reference

BloomFilter

impl BloomFilter {
    // Create new bloom filter
    pub fn new(expected_items: usize, false_positive_rate: f64) -> Self
    
    // Create with custom parameters
    pub fn with_params(expected_items: usize, k: usize, m: usize) -> Self
    
    // Insert item
    pub fn insert(&mut self, item: &str)
    
    // Check membership
    pub fn contains(&self, item: &str) -> bool
    
    // Get memory usage in bytes
    pub fn memory_usage(&self) -> usize
    
    // Get actual false positive rate
    pub fn false_positive_rate(&self) -> f64
    
    // Get number of hash functions
    pub fn hash_functions(&self) -> usize
    
    // Get number of bits
    pub fn bit_count(&self) -> usize
    
    // Clear all items
    pub fn clear(&mut self)
    
    // Check if empty
    pub fn is_empty(&self) -> bool
}

Utility Functions

// Calculate optimal parameters
pub fn optimal_params(expected_items: usize, false_positive_rate: f64) -> (usize, usize)

// Calculate memory requirement
pub fn memory_required(expected_items: usize, false_positive_rate: f64) -> usize

Examples

See the examples/ directory for complete examples:

basic_usage.rs - Simple address filtering
wallet_integration.rs - Integration with wallet software
performance_test.rs - Performance benchmarking

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

rustywallet-bloom 0.1.0