Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
bitcoinleveldb-histogram
Fast, deterministic histogram statistics for Bitcoin-leveldb style latency and value distributions.
This crate provides a faithful Rust implementation of the histogram used inside LevelDB, adapted for the bitcoinleveldb subsystem of bitcoin-rs. It is designed for profiling and characterizing distributions (e.g., operation latencies, sizes, or fees) with negligible runtime overhead and a compact internal representation.
Features
- Fixed-bucket histogram with
HISTOGRAM_NUM_BUCKETSlogarithmically (LevelDB-style) spaced bucket limitsBUCKET_LIMIT. - Online accumulation of statistics:
min,max- sample count
num - linear sum
sum - quadratic sum
sum_squares
- Derived statistics without storing raw samples:
- arithmetic mean (average)
- standard deviation
- median
- arbitrary percentile
- Deterministic merge of histograms without re-processing samples.
- Human-readable ASCII representation suitable for logs.
no_std-friendly core (usescore::fmt::Write), with logging controlled by the caller's logging setup.
The design is suitable for high-throughput systems where allocating or storing all samples is undesirable. The histogram approximates the distribution using bucket counts and pre-defined bucket limits, providing excellent tradeoffs between precision, memory, and CPU cost.
Data model
Semantics:
min,max: Smallest and largest sample ever added. When empty,minis initialized to the largest bucket limit andmaxis0.0.num: Total number of samples (asf64to match LevelDB and simplify arithmetic with large counts).sum: Sum of all values.sum_squares: Sum of squares of all values, enabling variance computation without storing samples.buckets: Per-bucket counts; bucket boundaries are provided byBUCKET_LIMITand follow LevelDB.
This is an online algorithm: each add call updates all sufficient statistics in O(number_of_buckets) worst-case for bucketing (linear scan) but typically small constant time, and O(1) for the other aggregates.
The standard deviation implementation uses the identity:
[ \operatorname{Var}(X) = E[X^2] - (E[X])^2 ]
with
[ E[X] = \frac{\text{sum}}{\text{num}},\quad E[X^2] = \frac{\text{sum_squares}}{\text{num}} ]
and guarded against small negative values from floating-point error.
Basic usage
Add this crate to your Cargo.toml:
[]
= "0.1.19"
Create a histogram and feed samples:
use Histogram;
Example output (shape only, exact values depend on bucket configuration):
Count: 5 Average: 25.2000 StdDev: 38.42
Min: 1.0000 Median: 10.0000 Max: 100.0000
------------------------------------------------------
[ 0, 1 ) 0 0.000% 0.000%
[ 1, 10 ) 2 40.000% 40.000% ########
[ 10, 100 ) 2 40.000% 80.000% ########
[ 100, 1000 ) 1 20.000% 100.000% ####
Operations
Construction and reset
Histogram::default()initializes an empty histogram: internal counters zeroed,minset to the largest bucket limit,maxto0.0.clear()resets the histogram to this empty state, retaining the allocated bucket array.
Adding samples
- Performs a linear search over
BUCKET_LIMITto locate the bucket indexbwhose range containsvalue. - Increments
buckets[b]. - Updates
min,max,num,sum, andsum_squares.
The linear search mirrors LevelDB: the bucket count is small and the branch-predictable loop is faster than a generic binary search for the typical usage patterns.
Merging histograms
- Component-wise aggregation of sufficient statistics:
num += other.numsum += other.sumsum_squares += other.sum_squares- per-bucket addition:
self.buckets[i] += other.buckets[i]
min/maxcombined as the global extrema across both histograms.- If
other.num == 0.0, merge is a no-op.
This makes histogram aggregation trivial in multi-threaded systems: each worker thread can maintain its own Histogram, then a coordinator merges them at the end, approximating the global distribution with no raw data transfer.
Descriptive statistics
Average
- If
num == 0.0, returns0.0. - Otherwise
sum / num.
Standard deviation
- If
num == 0.0, returns0.0. - Uses the
E[X^2] - (E[X])^2identity (see above). - Clamps negative variance from floating-point error to
0.0beforesqrt.
Mathematically, this computes the population standard deviation, not the Bessel-corrected sample standard deviation.
Percentile & median
median()is a shorthand forpercentile(50.0).percentile(p):- Returns
0.0ifnum <= 0.0. - Clamps
pto[0.0, 100.0]. - Finds the first bucket where the cumulative count crosses
threshold = num * (p / 100.0). - Linearly interpolates inside that bucket based on position within its count range.
- Clamps the interpolated value to
[min, max]to avoid leaking outside the observed range. - If no bucket crosses the threshold due to numerical edge cases, falls back to
max.
- Returns
Thus percentiles are approximated from the histogram. When bucket spacing is logarithmic, lower percentiles are more precise for small values; higher percentiles focus resolution where buckets are dense.
String representation
to_string() produces a compact text summary plus an ASCII-art visualization:
- Header line with
Count,Average, andStdDev. - Second line with
Min,Median, andMax. - A row per non-empty bucket:
- Half-open interval
[left, right). - Absolute count in that bucket.
- Bucket-local percentage of total.
- Cumulative percentage up to and including that bucket.
- A bar of
#characters (20 total marks correspond to 100% of the distribution in a single bucket).
- Half-open interval
This representation is intended for logs and interactive diagnostics.
Logging behavior
The crate uses logging macros (e.g., trace!, debug!, warn!) to expose internal events and edge cases:
trace!at entry/exit of methods, and for detailed computation steps.debug!for noteworthy but non-fatal situations (e.g., percentile on empty histogram).warn!if a percentile outside[0,100]is requested.
Exact logging backend configuration is left to the embedding application; if no logger is configured, logs will be ignored as usual.
Integration examples
Profiling latency in a storage engine
use Histogram;
use ;
Aggregating histograms across threads
use Histogram;
use thread;
Relationship to LevelDB and bitcoin-rs
This crate mirrors the histogram design from LevelDB, making it straightforward to:
- Port profiling and diagnostic tooling from C++ LevelDB to Rust.
- Interpret distributions in
bitcoin-rsusing the same conventions as upstream components.
If you are interacting directly with LevelDB or with systems already using its histogram output, this crate should produce comparable behavior and text representations, subject to differences in bucket configuration.
License
This crate is distributed under the MIT license.
For source code, issues, and contributions, see the parent repository:
- Repository: https://github.com/klebs6/bitcoin-rs