bitcoinleveldb-bloom
LevelDB-compatible Bloom filter implementation and test harness used by the bitcoin-rs LevelDB port. This crate focuses on byte-level compatibility with the original C++ LevelDB BuiltinBloomFilter2 policy while providing a safe Rust façade.
Overview
This crate implements a Bloom filter policy and supporting utilities mirroring LevelDB's built-in Bloom filter (leveldb.BuiltinBloomFilter2). It is intended to be plugged into a LevelDB-compatible storage engine within the bitcoin-rs project, but it can also be used as a standalone Bloom filter implementation over arbitrary byte slices.
Key properties:
- LevelDB-compatible format: Filters produced here are intended to be interoperable with LevelDB implementations using `BuiltinBloomFilter2` (same hashing scheme, layout, and trailing `k` marker).
- Kirsch–Mitzenmacher double hashing: Efficient computation of multiple hash probes from a single 32-bit hash, as in the original LevelDB design.
- Configurable bits-per-key: Construct policies tuned for a target false positive rate / memory footprint trade-off.
- Low-level slice interop: Can construct filters from raw `Slice` pointers (for FFI / LevelDB core) or from standard `&[u8]` slices.
- Instrumented with `tracing` for detailed diagnostics in performance and correctness testing.
The crate exposes two main abstractions:
- `BloomFilterPolicy`: a `FilterPolicy` / `CreateFilter` / `KeyMayMatch` implementation compatible with the LevelDB APIs provided elsewhere in `bitcoin-rs`.
- `BloomTest`: a stateful helper to accumulate keys, build a filter, and empirically measure false positive rates.
Crate Integration
This crate lives inside the bitcoin-rs monorepo:
- Repository: https://github.com/klebs6/bitcoin-rs
It assumes the existence of several traits and types from the surrounding LevelDB port, particularly:
- `Slice`: a lightweight view over `*const u8` + length (mirroring LevelDB's `Slice`).
- `FilterPolicy`, `CreateFilter`, `KeyMayMatch`, `Named`: traits defining the filter behavior and metadata.
If you are using this crate outside the monorepo, you will typically consume it indirectly via the LevelDB layer that defines these traits and the Slice abstraction.
Installation
In your `Cargo.toml`:

```toml
[dependencies]
bitcoinleveldb-bloom = "0.1.19"
```
This crate uses Rust 2021 edition and is licensed under MIT.
Core Types
BloomFilterPolicy
`BloomFilterPolicy` encapsulates the Bloom filter configuration:
- `bits_per_key`: approximate number of bits in the filter per inserted key.
- `k`: number of hash probes per key (automatically derived from `bits_per_key`).
Construction behavior:
- Negative `bits_per_key` is clamped to `0`, then coerced to at least `1` bit per key.
- `k` is computed as `k = floor(bits_per_key * ln(2)) ≈ floor(bits_per_key * 0.69)`, which approximates the theoretically optimal number of probes and is then clamped to `1 ≤ k ≤ 30`.
This matches the LevelDB design: choosing k = ln(2) * m/n where m is the number of bits and n is the number of keys minimizes false positives for a given filter size.
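To make the derivation concrete, here is a small self-contained sketch of the clamping and the `k` formula described above (illustrative only, not the crate's actual code):

```rust
/// Reproduces the parameter derivation described above (illustrative only).
fn derive_params(bits_per_key: i64) -> (i64, u32) {
    // Negative values are clamped to 0, then coerced to at least 1 bit per key.
    let bits_per_key = bits_per_key.max(0).max(1);
    // k = floor(bits_per_key * ln 2) ≈ floor(bits_per_key * 0.69), clamped to [1, 30].
    let k = ((bits_per_key as f64) * 0.69) as u32;
    (bits_per_key, k.clamp(1, 30))
}

fn main() {
    assert_eq!(derive_params(10), (10, 6));    // the default used by BloomTest
    assert_eq!(derive_params(-5), (1, 1));     // negative input degrades gracefully
    assert_eq!(derive_params(100), (100, 30)); // k is capped at 30
}
```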
Interfaces implemented (from the LevelDB API layer):
- `Named` — returns `"leveldb.BuiltinBloomFilter2"` for wire compatibility.
- `FilterPolicy` — marker trait for LevelDB filter policies.
- `CreateFilter` — builds filters from `*const Slice`.
- `KeyMayMatch` — checks potential membership using a `Slice` key and filter.
Byte-Slice Interface
The policy exposes a fully byte-based API for independent use:
`create_filter_from_bytes`:
- Input: `keys: &[&[u8]]` — array of byte-slice keys.
- Output: appends a filter to `dst`:
  - A fresh region of `bytes` zero-initialized bytes for the bitset.
  - A single trailing byte encoding `k`.
Filter layout:
- Let `num_keys = keys.len()`.
- Bits per key: `bpk = self.bits_per_key`.
- Raw bits: `bits = max(64, num_keys * bpk)` — small filters are padded to at least 64 bits to avoid pathological false positive rates.
- Bytes: `bytes = (bits + 7) / 8`, and `bits` is then rounded up to a multiple of 8.
- `dst` is extended by `bytes` zero bytes, then a trailing byte `k` is pushed.
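The size arithmetic can be checked with a short self-contained sketch (illustrative only, not the crate's code):

```rust
/// Computes the bitset size for `num_keys` keys at `bits_per_key`, as described above.
fn filter_layout(num_keys: usize, bits_per_key: usize) -> (usize /* bits */, usize /* bytes */) {
    let bits = (num_keys * bits_per_key).max(64); // pad small filters to at least 64 bits
    let bytes = (bits + 7) / 8;                   // round up to whole bytes
    (bytes * 8, bytes)                            // bits is then a multiple of 8
}

fn main() {
    assert_eq!(filter_layout(1, 10), (64, 8));      // tiny filter padded to 64 bits
    assert_eq!(filter_layout(100, 10), (1000, 125));
    // The data appended to `dst` is `bytes` zeroed bytes plus 1 trailing byte holding k.
}
```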
Hashing (Kirsch–Mitzenmacher double hashing):
- Base hash: `h = leveldb_hash(key, seed = 0xbc9f1d34)`.
- Delta: `delta = (h >> 17) | (h << 15)` (rotate right by 17 bits).
- For each `probe` from `0` to `k-1`:
  - `bitpos = h % bits`.
  - `array[bitpos / 8] |= 1 << (bitpos % 8)`.
  - `h = h.wrapping_add(delta)`.
This is exactly the double-hashing construction described by Kirsch and Mitzenmacher, reducing the cost of k independent hashes to a single base hash plus linear updates.
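As a self-contained illustration of this probe loop (the base hash `h` is passed in here; in the crate it comes from `leveldb_hash(key, 0xbc9f1d34)`):

```rust
/// Sets the k probe bits for one key, following the double-hashing scheme above.
fn add_key_bits(mut h: u32, k: u8, array: &mut [u8]) {
    let bits = (array.len() * 8) as u32;
    let delta = (h >> 17) | (h << 15); // rotate right by 17 bits
    for _ in 0..k {
        let bitpos = h % bits;
        array[(bitpos / 8) as usize] |= 1 << (bitpos % 8);
        h = h.wrapping_add(delta);
    }
}

fn main() {
    let mut bitset = vec![0u8; 8]; // 64-bit minimum-size filter
    add_key_bits(0xdead_beef, 6, &mut bitset);
    println!("{:02x?}", bitset);
}
```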
`key_may_match_bytes`:
- Ensures the filter has at least length 2 and decodes:
  - `bits = (len(filter) - 1) * 8`.
  - `k = filter[len(filter) - 1]`.
- If `k > 30`, the function treats the filter as a match for compatibility with potential future encodings (matching LevelDB's semantics).
- Otherwise performs the same double hashing scheme and returns `false` on the first missing bit, `true` if all `k` bits are present.
This provides a canonical, constant-time-per-probe membership test with standard Bloom filter semantics.
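The membership check mirrors the construction; here is a self-contained sketch of the decoding and probing described above, with the base hash supplied by the caller rather than computed via `leveldb_hash`:

```rust
/// Decodes a filter (bitset bytes plus a trailing k byte) and checks one key's
/// probe bits, following the scheme described above. `h` is the key's base hash.
fn key_may_match(mut h: u32, filter: &[u8]) -> bool {
    if filter.len() < 2 {
        return false;
    }
    let bits = ((filter.len() - 1) * 8) as u32;
    let k = filter[filter.len() - 1];
    if k > 30 {
        // Reserved for potential new encodings; treat as a match, like LevelDB.
        return true;
    }
    let delta = (h >> 17) | (h << 15); // rotate right by 17 bits
    for _ in 0..k {
        let bitpos = h % bits;
        if (filter[(bitpos / 8) as usize] & (1 << (bitpos % 8))) == 0 {
            return false; // one missing bit is enough to rule the key out
        }
        h = h.wrapping_add(delta);
    }
    true
}

fn main() {
    // A filter with every bit set reports a (possible) match for any hash.
    let filter = [0xffu8, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 6 /* k */];
    assert!(key_may_match(0x1234_5678, &filter));
}
```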
BloomTest
`BloomTest` is a harness around `BloomFilterPolicy` used to:
- Accumulate keys.
- Build a filter from them.
- Probe membership.
- Empirically estimate false positive rate.
Key methods and behavior:
- `Default` builds a policy with `bits_per_key = 10` (roughly 1% false positive rate in classical Bloom filter analysis).
- `add_*` methods stage keys in `self.keys`.
- `build` converts all staged keys into a single Bloom filter (using `create_filter_from_bytes`), stores it in `self.filter`, and clears `keys`.
- `matches_*` lazily builds the filter if `keys` is not empty before checking membership.
- `false_positive_rate` tests 10,000 new keys, each encoded via `encode_fixed32_into` into a 4-byte buffer and offset by `1_000_000_000` to avoid overlap with typical test data, counts how many yield `true` under `matches_slice`, and returns the observed fraction.
- `dump_filter` logs a textual visualization of the filter bits, ignoring the trailing `k` byte, with bits rendered as `1` (set) or `.` (unset) and space-separated per byte.
Low-Level Utilities
Hashing and Encoding
- The Bloom hash helper computes the LevelDB Bloom hash for a `Slice` using `leveldb_hash` with seed `0xbc9f1d34`.
- It is used as the foundation of `BloomFilterPolicy`'s hashing scheme.
- `encode_fixed32_to_bytes` / `encode_fixed32_into` encode `u32` values into a 4-byte little-endian representation, mirroring LevelDB's fixed-width integer encoding.
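The fixed-width layout is the standard little-endian one; a minimal equivalent sketch:

```rust
/// 4-byte little-endian encoding of a u32 (what LevelDB-style fixed32 helpers produce).
fn encode_fixed32(value: u32, dst: &mut [u8; 4]) {
    *dst = value.to_le_bytes(); // least significant byte first
}

fn main() {
    let mut buf = [0u8; 4];
    encode_fixed32(0x0403_0201, &mut buf);
    assert_eq!(buf, [0x01, 0x02, 0x03, 0x04]);
}
```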
Synthetic Keys and Length Progression
- The synthetic key helper encodes `i` with `encode_fixed32_to_bytes` into `buffer` and returns a `Slice` over it.
- If `buffer` is null, it returns an empty `Slice` built from a null pointer and zero length (for defensive usage patterns).
- Generates a simple sequence for test sizes: increments by 1 below 10, by 10 below 100, by 100 below 1000, then by 1000.
- Useful for iterating over key counts in scalability benchmarks.
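A sketch of that progression (the crate's helper may differ in name and signature):

```rust
/// Next test size: +1 below 10, +10 below 100, +100 below 1000, then +1000.
fn next_length(len: usize) -> usize {
    if len < 10 {
        len + 1
    } else if len < 100 {
        len + 10
    } else if len < 1000 {
        len + 100
    } else {
        len + 1000
    }
}

fn main() {
    // Prints 1..9, 10, 20, ..., 90, 100, 200, ..., 900, 1000, 2000, ..., 10000.
    let mut n = 1;
    while n <= 10_000 {
        print!("{n} ");
        n = next_length(n);
    }
    println!();
}
```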
Usage Examples
Basic: Build a Bloom Filter from Byte Slices
This example exercises the byte-slice API (`create_filter_from_bytes` / `key_may_match_bytes`) on `BloomFilterPolicy` directly.
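A minimal sketch, assuming the crate root exports `BloomFilterPolicy` with a `new(bits_per_key)`-style constructor and the byte-slice method signatures implied above; exact paths and signatures may differ:

```rust
use bitcoinleveldb_bloom::BloomFilterPolicy; // module path assumed from the package name

fn main() {
    // Constructor name assumed; ~10 bits per key derives k = 6 probes internally.
    let policy = BloomFilterPolicy::new(10);

    // Build a filter over two keys via the byte-slice API described above.
    let keys: Vec<&[u8]> = vec![&b"hello"[..], &b"world"[..]];
    let mut filter = Vec::new();
    policy.create_filter_from_bytes(&keys, &mut filter);

    // Inserted keys always match; absent keys may rarely match (false positives).
    assert!(policy.key_may_match_bytes(b"hello", &filter));
    assert!(policy.key_may_match_bytes(b"world", &filter));
    let _maybe = policy.key_may_match_bytes(b"missing", &filter);
}
```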
Using BloomTest to Empirically Check False Positives
This example drives `BloomTest` to stage keys, build a filter, and measure the observed false positive rate.
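A sketch of the harness flow; `BloomTest` implements `Default` as noted above, but the exact `add_*` / `matches_*` method names used here are placeholders:

```rust
use bitcoinleveldb_bloom::BloomTest; // module path assumed from the package name

fn main() {
    // Default uses bits_per_key = 10, i.e. a theoretical ~1% false positive rate.
    let mut test = BloomTest::default();

    // Stage some keys; `add_bytes` is a placeholder for one of the `add_*` methods.
    for i in 0u32..1_000 {
        test.add_bytes(&i.to_le_bytes());
    }
    test.build(); // builds the filter via create_filter_from_bytes and clears staged keys

    // `matches_bytes` is a placeholder for one of the `matches_*` methods.
    assert!(test.matches_bytes(&42u32.to_le_bytes()));

    // Probe 10,000 fresh keys and report the observed false positive fraction.
    let rate = test.false_positive_rate();
    println!("observed false positive rate: {:.3}%", rate * 100.0);
}
```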
Integrating via FilterPolicy Trait Objects
This example constructs a policy via `new_bloom_filter_policy` for use behind the `FilterPolicy` trait objects expected by the bitcoin-rs LevelDB layer (e.g. its `Options` / `DB` types).
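A sketch showing only the policy construction; wiring it into `Options` / `DB` depends on the bitcoin-rs LevelDB layer's API, which is not part of this crate:

```rust
use bitcoinleveldb_bloom::new_bloom_filter_policy; // module path assumed from the package name
// use crate::leveldb::{FilterPolicy, Options, DB}; // from the bitcoin-rs LevelDB layer

fn main() {
    // Signature assumed: a free function taking bits_per_key and returning an
    // owned policy implementing the FilterPolicy family of traits.
    let policy = new_bloom_filter_policy(10);

    // In a full integration the policy is installed into the LevelDB Options of
    // the surrounding storage engine so that table builders and readers generate
    // and consult Bloom filter blocks; that wiring lives in the LevelDB layer,
    // not in this crate.
    let _ = policy;
}
```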
Mathematical and Algorithmic Notes
The Bloom filter implemented here follows the classical analysis:
- A Bloom filter is a bit vector of length `m` with `k` hash functions.
- After inserting `n` keys, the probability of a false positive is approximately:

\[
p \approx \left(1 - e^{-kn/m}\right)^k.
\]
Choosing m = bits_per_key * n and k ≈ ln(2) * m/n minimizes p for fixed m, and yields:
\[
p_{\min} \approx (0.5)^{\text{bits\_per\_key} \cdot \ln 2} \approx (0.6185)^{\text{bits\_per\_key}}.
\]
Thus:
- `bits_per_key = 10` gives `p ≈ 0.8%`.
- Smaller `bits_per_key` trades memory for increased false positives.
The Kirsch–Mitzenmacher double hashing used in this crate preserves asymptotic false positive characteristics while reducing hash computation to a single base hash plus linear updates.
The filter construction additionally enforces bits ≥ 64 for small n, preventing degenerate configurations where m becomes so small that the asymptotic approximation no longer yields reasonable behavior.
Tracing and Diagnostics
All core operations (construction, hashing, filter building, membership testing) are instrumented with the tracing crate, using log levels:
- `info!` for high-level events (policy construction, false positive statistics),
- `debug!` for structural details (filter sizes, reset actions),
- `trace!` for per-key and per-bit actions (probing, bit-setting),
- `warn!` and `error!` for anomalous conditions (invalid parameters, null pointers where unexpected).
Enable appropriate tracing subscribers in your application to inspect filter behavior, verify interoperability, or benchmark performance.
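For example, with the `tracing-subscriber` crate (one possible subscriber, built with its `env-filter` feature; any `tracing`-compatible subscriber works), something like:

```rust
use tracing_subscriber::EnvFilter;

fn main() {
    // Route this crate's info!/debug!/trace! events to stderr. Set e.g.
    // RUST_LOG=bitcoinleveldb_bloom=trace to see per-key and per-bit activity.
    tracing_subscriber::fmt()
        .with_env_filter(EnvFilter::from_default_env())
        .with_writer(std::io::stderr)
        .init();

    // ... construct BloomFilterPolicy instances / run BloomTest here ...
}
```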
Safety and FFI Considerations
The crate assumes a Slice type akin to LevelDB's, consisting of a raw pointer and length. The unsafe blocks are localized and guarded:
- Null pointers from `Slice::data()` with zero length are handled gracefully and treated as empty keys.
- `create_filter` converts `*const Slice` + `n` into a slice with `from_raw_parts` only when `n > 0` and `keys` is non-null.
- Byte slices are always reconstructed with lengths derived from the `Slice` metadata.
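The guard can be pictured with a stand-in `Slice` type (not the crate's actual definition):

```rust
use std::slice;

/// Stand-in for a LevelDB-style Slice: raw pointer plus length.
#[repr(C)]
pub struct Slice {
    pub data: *const u8,
    pub size: usize,
}

/// Rebuilds &[Slice] from an FFI-style (pointer, count) pair, dereferencing only
/// when the pointer is non-null and the count is positive, as described above.
/// The caller must guarantee the pointer is valid for `n` elements for 'a.
unsafe fn keys_from_raw<'a>(keys: *const Slice, n: usize) -> &'a [Slice] {
    if keys.is_null() || n == 0 {
        &[] // degrade to "no keys" instead of dereferencing a bad pointer
    } else {
        slice::from_raw_parts(keys, n)
    }
}

fn main() {
    // A null pointer with zero length is treated as an empty key set.
    let empty: &[Slice] = unsafe { keys_from_raw(std::ptr::null(), 0) };
    assert!(empty.is_empty());
}
```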
When integrating with C or C++:
- Ensure `Slice` lifetime guarantees match those expected by the Rust side.
- Ensure that any `*mut u8` passed into `key` points to at least 4 bytes of valid writable memory.
License
This crate is distributed under the MIT License. See the repository for full licensing information.