Expand description
An implementation of various Approximate Set Membership structures in Rust. Currently included are a standard Bloom Filter, and the simplest kind of Counting Bloom Filter.
§Usage
This crate is on crates.io and
can be used by adding bloom
to the dependencies in your
project’s Cargo.toml
.
[dependencies]
bloom = "0.2.0"
add this to your crate root:
extern crate bloom;
§Bloom Filters
A Bloom Filter is an Approximate Set Membership structure, which means it can track a set of items and check if an item is a member of the set it is tracking. It is able to do this using a much smaller amount of memory than storing the actual items, at the cost of an occasionally indicating that an item is in the set even though it is not. This occurence is called a “False Positive”. A traditional Bloom Filter will never have a “False Negative” however, which would be indicating that an item is not in the set, when in fact it is. The frequency of false positives can be preciecly bounded by setting the size of the filter, and is called the False Positive Rate. Their small memory footprint and absence of false negatives makes BloomFilters suitable for many applications.
§Example Usage
use bloom::{ASMS,BloomFilter};
let expected_num_items = 1000;
// out of 100 items that are not inserted, expect 1 to return true for contain
let false_positive_rate = 0.01;
let mut filter = BloomFilter::with_rate(false_positive_rate,expected_num_items);
filter.insert(&1);
filter.contains(&1); /* true */
filter.contains(&2); /* probably false */
§Counting Bloom Filters
Counting filters allow removal from a Bloom filter without recreating the filter afresh. A counting filter uses an n-bit counter where a standard Bloom Filter uses a single bit. Counting filters can also provide an upper bound on the number of times a particular element has been inserted into the filter. In general 4 bits per element is considered a good size. This will cause the filter to use 4 times as much memory as compared to standard Bloom Filter.
§Example Usage
use bloom::{ASMS,CountingBloomFilter};
// Create a counting filter that uses 4 bits per element and has a false positive rate
// of 0.01 when 100 items have been inserted
let mut cbf:CountingBloomFilter = CountingBloomFilter::with_rate(4,0.01,100);
cbf.insert(&1);
cbf.insert(&2);
assert_eq!(cbf.estimate_count(&1),1);
assert_eq!(cbf.estimate_count(&2),1);
assert_eq!(cbf.insert_get_count(&1),1);
assert_eq!(cbf.estimate_count(&1),2);
assert_eq!(cbf.remove(&1),2);
assert_eq!(cbf.estimate_count(&1),1);
Re-exports§
pub use bloom::BloomFilter;
pub use bloom::optimal_num_hashes;
pub use bloom::needed_bits;
pub use counting::CountingBloomFilter;
pub use valuevec::ValueVec;
Modules§
Traits§
- ASMS
- Stanard filter functions
- Combineable
- Filters than are Combineable can be unioned and intersected
- Intersectable
- Filters that implement this trait can be intersected with filters of the same type to produce a filter that contains the items that have been inserted into both filters.
- Unionable
- Filters that implement this trait can be unioned with filters of the same type to produce a filter that contains the items that have been inserted into either filter.