bitcoinleveldb-filter
A low-level implementation of LevelDB-compatible filter blocks (Bloom filters) for bitcoin-rs, written in Rust.
This crate models the filter subsystem used by LevelDB to accelerate point lookups in SSTables (sorted string tables). It provides the primitives necessary to build and query filter blocks, while delegating the concrete filter policy (e.g., Bloom filter configuration) to user-defined implementations.
Overview
LevelDB uses filter blocks (typically Bloom filters) to quickly determine whether a key cannot be present in a particular data block, substantially reducing I/O for negative lookups. Each SSTable has:
- A filter block, consisting of one filter region per data block range.
- A small trailer describing where the filter block lives and how to interpret it.
This crate focuses on the filter block part:
- `FilterBlockBuilder`: constructs an in-memory filter block given a user-supplied filter policy and a stream of keys grouped by block offsets.
- `FilterBlockReader`: parses a filter block and uses the policy to answer `key_may_match` queries for a given block offset.
The design is intentionally low-level and close to the original C++ LevelDB layout, so that filter blocks are bitwise-compatible with LevelDB and can be used in interoperable storage formats (such as Bitcoin Core’s LevelDB-backed chainstate and index databases).
You are expected to plug in a concrete filter policy (generally a Bloom filter) via the FilterPolicy, CreateFilter, and KeyMayMatch traits.
Core Traits
FilterPolicy
FilterPolicy is a marker trait representing a particular filter strategy.
In practice you will implement additional traits (CreateFilter, KeyMayMatch) for the same type, so that a single policy type both creates filters and queries them. The crate’s FilterBlockBuilder and FilterBlockReader store a Box<dyn FilterPolicy>; your type must be downcast or enriched to also implement the necessary behavior.
CreateFilter
This trait encapsulates the construction of a single filter region:
- `keys`: pointer to an array of `Slice` values (each slice is a key).
- `n`: number of keys in the array.
- `dst`: output buffer where the concrete filter representation is appended.
A typical Bloom filter implementation will:
- Hash each key multiple times.
- Set bits in a bitset sized according to `bits_per_key * n`.
- Serialize the bitset (and possibly some metadata, such as the number of probes) into `dst`.
KeyMayMatch
This trait encapsulates querying a single filter region:
- `key`: the key to test.
- `filter`: the serialized filter region for some key range.
Return semantics:
- `false`: the key is definitely not present in this region.
- `true`: the key may be present (false positives are allowed; false negatives are not, for a correct Bloom filter implementation).
In Bloom-filter terms, key_may_match checks that all probe bits corresponding to the key are set in the underlying bitset.
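Putting the three traits together, a plausible shape looks like the sketch below. All signatures are inferred from the descriptions above, not taken from the crate (which may, for example, pass raw pointers plus a count instead of slices):

```rust
/// Sketch of the trait shapes inferred from this README; the crate's
/// real definitions may differ (e.g., `keys` as a raw pointer + count).
pub struct Slice {
    // Lightweight pointer + length into caller-owned bytes.
    pub data: *const u8,
    pub len: usize,
}

/// Marker trait identifying a filter strategy.
pub trait FilterPolicy {}

/// Builds one serialized filter region from a batch of keys.
pub trait CreateFilter {
    fn create_filter(&self, keys: &[Slice], dst: &mut Vec<u8>);
}

/// Queries one serialized filter region.
/// `false` = definitely absent; `true` = possibly present.
pub trait KeyMayMatch {
    fn key_may_match(&self, key: &Slice, filter: &Slice) -> bool;
}
```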
Filter Block Construction
FilterBlockBuilder
The builder aggregates keys per logical data block and finally emits a single contiguous filter block buffer.
Layout
The produced filter block has the same high-level layout as in LevelDB:
[filter_0][filter_1]...[filter_(n-1)][offset_array][array_offset][base_lg]
- `filter_i`: opaque bytes produced by the `FilterPolicy` for range `i`.
- `offset_array`: `n` little-endian `u32` values; `offset_array[i]` is the start offset of `filter_i` within the block.
- `array_offset`: a little-endian `u32` giving the byte offset where `offset_array` begins.
- `base_lg`: a `u8` holding log2 of the filter base (block alignment in bytes), used to compute region indices.
Because the contents of each filter_i region are policy-specific, the crate does not prescribe their inner layout; it merely provides the surrounding indexing structure.
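As a worked example (sizes invented for illustration): with two filter regions of 100 and 68 bytes, filter_0 occupies bytes 0..100 and filter_1 bytes 100..168, so `offset_array` is `[0, 100]` and occupies bytes 168..176; `array_offset` is 168, stored at bytes 176..180; and the trailing `base_lg` byte at offset 180 brings the total block size to 181 bytes.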
Constructor
Creates a new builder bound to a specific filter policy.
start_block
Indicates that keys subsequently added belong to the data block whose file-level offset is block_offset.
- `FILTER_BASE` (a constant external to this snippet) defines the granularity at which filters are grouped. The filter index is computed as `let filter_index = (block_offset / FILTER_BASE) as usize;`.
- If `filter_index` skips ahead relative to the current number of filters, empty filters are generated for the intermediate indices via `generate_filter()`.
This mirrors LevelDB’s logic where multiple neighboring data blocks may share the same filter region.
add_key
Adds a key to the current, not-yet-flushed filter batch.
- The raw key bytes are appended to `self.keys`.
- `self.start` records the start index of each key within that buffer.
Slice is assumed to be a lightweight pointer+length structure (from the surrounding codebase), and here it is reinterpreted as a byte slice using from_raw_parts.
finish
Finalizes all filters and returns a Slice view over the resulting buffer.
Process:
- If there are unflushed keys (tracked via `self.start`), the builder calls `generate_filter()` once more to produce the last `filter_i` region.
- Append all `filter_offsets` as little-endian `u32` values.
- Append `array_offset` (the index where `filter_offsets` begins) as a `u32`.
- Append `FILTER_BASE_LG` as a `u8`.
The returned Slice points into self.result. It is your responsibility to ensure self.result outlives this slice.
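The last three subsections can be condensed into the following sketch. Every name below (the fields, `generate_filter`, the byte-slice `CreateFilter` bound) is an assumption based on this README and on stock LevelDB (whose default `kFilterBaseLg` is 11), not the crate's verified API; keys are simplified to `&[u8]` instead of `Slice`:

```rust
const FILTER_BASE_LG: u8 = 11;              // 2 KiB granularity, as in LevelDB
const FILTER_BASE: u64 = 1u64 << FILTER_BASE_LG;

/// Byte-slice analogue of the `CreateFilter` trait sketched earlier.
trait CreateFilter {
    fn create_filter(&self, keys: &[&[u8]], dst: &mut Vec<u8>);
}

struct FilterBlockBuilder {
    policy: Box<dyn CreateFilter>,
    keys: Vec<u8>,            // flattened key bytes
    start: Vec<usize>,        // start index of each key within `keys`
    result: Vec<u8>,          // filter block under construction
    filter_offsets: Vec<u32>, // start offset of each filter_i in `result`
}

impl FilterBlockBuilder {
    fn start_block(&mut self, block_offset: u64) {
        let filter_index = (block_offset / FILTER_BASE) as usize;
        // Several neighboring data blocks may share one region; emit
        // (possibly empty) filters until the index catches up.
        while filter_index > self.filter_offsets.len() {
            self.generate_filter();
        }
    }

    fn add_key(&mut self, key: &[u8]) {
        self.start.push(self.keys.len());
        self.keys.extend_from_slice(key);
    }

    fn generate_filter(&mut self) {
        self.filter_offsets.push(self.result.len() as u32);
        if self.start.is_empty() {
            return; // empty region: its start == limit in the offsets array
        }
        // Rebuild the individual key slices from the flattened buffer.
        let mut bounds = self.start.clone();
        bounds.push(self.keys.len());
        let key_bytes = &self.keys;
        let keys: Vec<&[u8]> =
            bounds.windows(2).map(|w| &key_bytes[w[0]..w[1]]).collect();
        self.policy.create_filter(&keys, &mut self.result);
        self.keys.clear();
        self.start.clear();
    }

    fn finish(&mut self) -> &[u8] {
        if !self.start.is_empty() {
            self.generate_filter(); // flush the final batch of keys
        }
        let array_offset = self.result.len() as u32;
        for &off in &self.filter_offsets {
            self.result.extend_from_slice(&off.to_le_bytes()); // put_fixed32
        }
        self.result.extend_from_slice(&array_offset.to_le_bytes());
        self.result.push(FILTER_BASE_LG);
        &self.result
    }
}
```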
Low-level Utilities
put_fixed32
Appends a 32-bit little-endian integer to dst. Used to encode offsets and array_offset.
decode_fixed32
Decodes a 32-bit little-endian integer from the first 4 bytes of src. Caller must ensure at least 4 bytes are available.
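Equivalent behavior takes only a few lines of safe Rust. This is a sketch; the crate's versions may differ, for example by using unchecked access:

```rust
/// Append `value` to `dst` in little-endian byte order.
fn put_fixed32(dst: &mut Vec<u8>, value: u32) {
    dst.extend_from_slice(&value.to_le_bytes());
}

/// Decode a little-endian u32 from the first 4 bytes of `src`.
/// This safe sketch panics if `src.len() < 4` instead of invoking UB.
fn decode_fixed32(src: &[u8]) -> u32 {
    u32::from_le_bytes(src[..4].try_into().expect("need at least 4 bytes"))
}
```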
Filter Block Querying
FilterBlockReader
The reader parses a serialized filter block and exposes key_may_match per block offset.
- `policy`: the same policy implementation used at build time (or at least one compatible with the serialized format).
- `data`: reference-counted backing storage for the filter block bytes.
- `offset`: byte offset of the offsets array.
- `num`: number of filter regions.
- `base_lg`: log2 of the filter base (i.e., `FILTER_BASE = 1 << base_lg`).
- `valid`: whether parsing was successful.
Constructor
Parsing algorithm:
- Copy `contents` into an `Arc<[u8]>`.
- If the total length `n < 5`, mark invalid (insufficient space for `array_offset + base_lg`).
- Read `base_lg` as the last byte (`data[n - 1]`).
- Read `last_word = decode_fixed32(&data[n - 5..n - 1])` as the start offset of the offsets array.
- Validate `last_word <= n - 5`; otherwise mark invalid.
- Let `offset = last_word` and `num = (n - 5 - last_word) / 4`.
- Mark the reader as valid.
This reconstructs the same metadata the builder appended during finish.
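A sketch of this parsing step, reusing the `decode_fixed32` sketch above. The struct fields follow the reader description, with the policy reduced to a byte-slice `KeyMayMatch` trait; none of these names are verified against the crate:

```rust
use std::sync::Arc;

/// Byte-slice analogue of the `KeyMayMatch` trait sketched earlier.
trait KeyMayMatch {
    fn key_may_match(&self, key: &[u8], filter: &[u8]) -> bool;
}

struct FilterBlockReader {
    policy: Box<dyn KeyMayMatch>,
    data: Arc<[u8]>, // copied filter block bytes
    offset: usize,   // where the offsets array begins
    num: usize,      // number of filter regions
    base_lg: u8,     // FILTER_BASE = 1 << base_lg
    valid: bool,
}

impl FilterBlockReader {
    fn new(policy: Box<dyn KeyMayMatch>, contents: &[u8]) -> Self {
        let data: Arc<[u8]> = Arc::from(contents);
        let n = data.len();
        let mut reader =
            Self { policy, data, offset: 0, num: 0, base_lg: 0, valid: false };
        // Need at least array_offset (4 bytes) + base_lg (1 byte).
        if n < 5 {
            return reader;
        }
        reader.base_lg = reader.data[n - 1];
        let last_word = decode_fixed32(&reader.data[n - 5..n - 1]) as usize;
        if last_word > n - 5 {
            return reader; // offsets array cannot start past its own word
        }
        reader.offset = last_word;
        reader.num = (n - 5 - last_word) / 4;
        reader.valid = true;
        reader
    }
}
```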
key_may_match
Evaluates whether key may be present in the filter associated with the given block_offset.
Process:
- If `self.valid` is `false`, conservatively return `true` (membership cannot safely be refuted).
- Compute the filter index: `let index = (block_offset >> self.base_lg) as usize;`.
- If `index >= self.num`, return `true` (out of range; treat as a potential match).
- Read the start and limit offsets for this index from the offsets array: `let idx1 = self.offset + index * 4;`, then decode `start` and `limit` with `decode_fixed32` from the two `u32` words at `idx1` and `idx1 + 4`.
- If `start == limit`, the region is empty, so return `false`.
- If `start <= limit && limit <= self.offset`, the filter region is `data[start..limit]`.
- Wrap this region in a `Slice` and delegate to `self.policy.key_may_match(key, &filter_slice)`.
The contract is: reader handles indexing and bounds; the policy handles the semantics of the filter.
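Continuing the same sketch, the query path could look like this (again under assumed names; the real method presumably takes a `Slice` rather than `&[u8]`):

```rust
impl FilterBlockReader {
    /// Returns `false` only when `key` is provably absent from the region
    /// covering the data block at `block_offset`.
    fn key_may_match(&self, block_offset: u64, key: &[u8]) -> bool {
        if !self.valid {
            return true; // no trustworthy metadata: cannot refute membership
        }
        let index = (block_offset >> self.base_lg) as usize;
        if index >= self.num {
            return true; // out of range: treat as a potential match
        }
        let idx1 = self.offset + index * 4;
        let start = decode_fixed32(&self.data[idx1..idx1 + 4]) as usize;
        // For the last region, the "limit" word read here is array_offset
        // itself, which equals self.offset, so the bound below still holds.
        let limit = decode_fixed32(&self.data[idx1 + 4..idx1 + 8]) as usize;
        if start == limit {
            return false; // empty region: no keys were added for it
        }
        if start <= limit && limit <= self.offset {
            let filter = &self.data[start..limit];
            return self.policy.key_may_match(key, filter);
        }
        true // corrupt offsets: be conservative
    }
}
```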
Bloom Filter Policy Stub
The `new_bloom_filter_policy` function is a placeholder intended to produce a concrete Bloom filter policy implementation. In canonical LevelDB, the Bloom filter policy:
- Uses `k ≈ ln(2) * bits_per_key` hash functions.
- Achieves a false positive probability of approximately

  \[ p \approx \left(1 - e^{-kn/m}\right)^{k} \]

  where \(n\) is the number of keys and \(m = \text{bits\_per\_key} \times n\) is the total number of bits.
To make this function functional, you should:

- Define a type, e.g. `struct BloomFilterPolicy { bits_per_key: i32, k: u8, ... }`.
- Implement `FilterPolicy` for it (the marker trait).
- Implement `CreateFilter` and `KeyMayMatch` using a robust hashing scheme (e.g., MurmurHash or a suitable 64-bit hash folded into multiple probes).
- Pack the per-region Bloom filter bits along with any parameters needed to decode them.
- Return `Box::new(BloomFilterPolicy { ... })` from `new_bloom_filter_policy`.
You must ensure that the encoding/decoding conventions of your policy are self-consistent between builder and reader.
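As a concrete illustration, here is a self-contained sketch modeled on stock LevelDB's bloom.cc. The hash (FNV-1a) is a stand-in assumption: LevelDB uses its own Murmur-style hash, and whatever you choose must match exactly between writer and reader:

```rust
/// Sketch of a Bloom policy; wire these methods up as your
/// `CreateFilter` / `KeyMayMatch` implementations.
struct BloomFilterPolicy {
    bits_per_key: usize,
    k: u8, // number of probes, clamped to [1, 30] as in LevelDB
}

/// Stand-in hash (FNV-1a); LevelDB uses its own Murmur-style hash.
fn fnv1a(data: &[u8]) -> u32 {
    let mut h: u32 = 0x811c_9dc5;
    for &b in data {
        h ^= b as u32;
        h = h.wrapping_mul(0x0100_0193);
    }
    h
}

impl BloomFilterPolicy {
    fn new(bits_per_key: usize) -> Self {
        // k ≈ ln(2) * bits_per_key minimizes the false positive rate.
        let k = ((bits_per_key as f64) * 0.69) as u8;
        Self { bits_per_key, k: k.clamp(1, 30) }
    }

    fn create_filter(&self, keys: &[&[u8]], dst: &mut Vec<u8>) {
        // Size the bitset; enforce a floor so tiny filters stay useful.
        let mut bits = keys.len() * self.bits_per_key;
        if bits < 64 {
            bits = 64;
        }
        let bytes = (bits + 7) / 8;
        let bits = bytes * 8;

        let base = dst.len();
        dst.resize(base + bytes, 0);
        dst.push(self.k); // remember the probe count for the reader

        for key in keys {
            // Double hashing: derive k probes from one 32-bit hash.
            let mut h = fnv1a(key);
            let delta = h.rotate_right(17);
            for _ in 0..self.k {
                let bitpos = (h as usize) % bits;
                dst[base + bitpos / 8] |= 1 << (bitpos % 8);
                h = h.wrapping_add(delta);
            }
        }
    }

    fn key_may_match(&self, key: &[u8], filter: &[u8]) -> bool {
        if filter.len() < 2 {
            return false;
        }
        let bits = (filter.len() - 1) * 8;
        let k = filter[filter.len() - 1];
        if k > 30 {
            return true; // reserved for future encodings; be conservative
        }
        let mut h = fnv1a(key);
        let delta = h.rotate_right(17);
        for _ in 0..k {
            let bitpos = (h as usize) % bits;
            if filter[bitpos / 8] & (1 << (bitpos % 8)) == 0 {
                return false; // a probe bit is clear: definitely absent
            }
            h = h.wrapping_add(delta);
        }
        true
    }
}
```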
Example Usage
The following is conceptual and assumes you supply a concrete `BloomFilterPolicy` type implementing `FilterPolicy + CreateFilter + KeyMayMatch` and exposing a constructor used by `new_bloom_filter_policy`.
Building a Filter Block
Assume imports of `FilterBlockBuilder` and `new_bloom_filter_policy` from this crate, plus `Slice` from the surrounding bitcoin-rs/leveldb implementation.
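A possible build sequence, with every call shape assumed from the descriptions above (`Slice::from` is a hypothetical constructor, and `10` is an arbitrary bits-per-key choice):

```rust
// Conceptual sketch: constructor names and Slice handling are assumptions,
// not verified against the crate's API.
let policy = new_bloom_filter_policy(10); // ~10 bits per key
let mut builder = FilterBlockBuilder::new(policy);

builder.start_block(0); // data block at file offset 0
builder.add_key(&Slice::from("key1")); // Slice::from is hypothetical
builder.add_key(&Slice::from("key2"));

builder.start_block(4096); // next data block
builder.add_key(&Slice::from("key3"));

let block = builder.finish(); // Slice into builder.result
// Persist the block's bytes alongside the table's data and index blocks,
// keeping the builder alive (finish borrows from self.result).
```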
Querying a Filter Block
Assume the same imports as above, with `FilterBlockReader` in place of the builder.
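And the corresponding read side, under the same assumptions (`block_bytes` stands for the serialized filter block loaded from the table):

```rust
// Conceptual sketch: the policy must be wire-compatible with the writer's.
let policy = new_bloom_filter_policy(10);
let reader = FilterBlockReader::new(policy, &block_bytes);

// Fast pre-check before reading the data block at file offset 4096.
if reader.key_may_match(4096, &Slice::from("key3")) {
    // Possibly present: go on to search the data block.
} else {
    // Definitely absent: skip the disk I/O entirely.
}
```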
In an integrated LevelDB-like system, you will:
- Build filter blocks while constructing SSTables.
- Persist filter blocks alongside data and index blocks.
- Reload the filter block with `FilterBlockReader` when opening the table.
- Use `key_may_match` as a fast pre-check before binary searching or performing disk I/O for the data block.
Safety and Invariants
- `add_key` uses `unsafe` to reinterpret a `Slice` as `&[u8]`. The correctness of that operation depends on the `Slice` containing a valid pointer and length for the lifetime of the builder.
- `decode_fixed32` assumes that `src.len() >= 4`; misuse is undefined behavior.
- `finish` returns a `Slice` borrowing from `self.result`; do not drop or mutate the `FilterBlockBuilder` in ways that invalidate `self.result` while that `Slice` is in use.
- `FilterBlockReader::new` copies the input bytes into an `Arc<[u8]>`, so the `Slice` passed to it does not need to outlive the reader.
Integration Context
This crate is part of bitcoin-rs:
- Repository: https://github.com/klebs6/bitcoin-rs
- Likely used by the LevelDB reimplementation or bindings inside that repository to support Bitcoin-compatible storage layouts.
When integrating, keep the following aligned:
- `FILTER_BASE` and `FILTER_BASE_LG` must match between the writer and reader.
- The same `FilterPolicy` implementation (or a wire-compatible version) must be used to create and query filter regions.
- Any on-disk format expectations (e.g., Bitcoin Core's) must be reflected in the policy's encoding strategy.
License and Metadata
- License: MIT
- Edition: Rust 2021
- Authors: klebs <none>
This design emphasizes low-level control and compatibility with existing LevelDB-based ecosystems while remaining idiomatic enough to compose nicely with the rest of bitcoin-rs.