metrics-lib 1.0.0

High-performance Rust metrics library: sub-2ns counters, sub-1ns gauges, nanosecond timers, tumbling-window rate meters, async timing, adaptive sampling, and system health. Cross-platform with minimal dependencies.

Latest local Criterion means (cargo bench --bench metrics_bench --all-features, Windows x86_64, Rust stable). Numbers are for the cached-handle hot-path pattern (hold an Arc<Counter> / Arc<Gauge> / … and call .inc() / .set() directly):

  • Counter increment: ~1.5 ns/op
  • Gauge set: ~0.4 ns/op
  • Timer record: ~3 ns/op
  • Histogram observe: ~10 ns/op (depends on bucket count)
  • Memory: 64 bytes per metric (cache-aligned)

Calling metrics().counter("name").inc() per call (global lookup) is slower — it pays for an RwLock::read() + HashMap::get(&str) + Arc::clone(). The cached_vs_global Criterion group reports both numbers side-by-side; cache the Arc in hot loops.
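
The difference between the two call patterns can be sketched with the standard library alone. The `Registry` and `Counter` types below are hypothetical stand-ins written for this illustration, not the library's implementation; they only mirror the cost structure described above (read lock, hash lookup, and `Arc::clone` per call versus one atomic add on a held handle).

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a counter; not the library's type.
#[derive(Default)]
pub struct Counter(AtomicU64);

impl Counter {
    pub fn inc(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical registry: metric name -> shared counter handle.
#[derive(Default)]
pub struct Registry {
    counters: RwLock<HashMap<String, Arc<Counter>>>,
}

impl Registry {
    /// Global-lookup path: read lock + hash lookup + Arc::clone on every call.
    pub fn counter(&self, name: &str) -> Arc<Counter> {
        if let Some(c) = self.counters.read().unwrap().get(name) {
            return Arc::clone(c);
        }
        // Slow path: the first use of a name takes the write lock.
        let mut map = self.counters.write().unwrap();
        map.entry(name.to_string()).or_default().clone()
    }
}

fn main() {
    let registry = Registry::default();

    // Per-call lookup: pays the registry cost on every iteration.
    for _ in 0..1_000 {
        registry.counter("requests").inc();
    }

    // Cached handle: one lookup, then a single atomic add per iteration.
    let requests = registry.counter("requests");
    for _ in 0..1_000 {
        requests.inc();
    }

    println!("requests = {}", requests.get());
}
```

Both loops update the same underlying counter; only the per-iteration overhead differs.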

Features

Core Metrics

  • 🔢 Counters — atomic increment/decrement with overflow-safe try_* variants
  • 📊 Gauges — IEEE 754 atomic floating-point with non-finite guards
  • ⏱️ Timers — nanosecond precision with RAII guards and batch recording
  • 📈 Rate Meters — tumbling-window rates with burst detection and API limiting
  • 📐 Histograms (v0.9.3) — bucketed observations with sum/count + quantile estimation
  • 🏷️ Labels (v0.9.3): LabelSet with bounded cardinality cap (default 10 000)
  • 💾 System Health — background-sampled CPU / memory / load / threads / FDs / health score (v0.9.4: zero-contention reads)
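
As an illustration of the gauge bullet above, an atomic `f64` can be built on `AtomicU64` by storing the IEEE 754 bit pattern, since stable `std` has no atomic floating-point type. This is a minimal sketch under that assumption; the `Gauge` type, its method names, and the `&'static str` error are hypothetical and are not taken from the library.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical gauge: an f64 stored as its IEEE 754 bit pattern.
pub struct Gauge {
    bits: AtomicU64,
}

impl Gauge {
    pub fn new() -> Self {
        Gauge { bits: AtomicU64::new(0.0f64.to_bits()) }
    }

    /// Store a value; reject NaN and infinities instead of recording them.
    pub fn try_set(&self, value: f64) -> Result<(), &'static str> {
        if !value.is_finite() {
            return Err("non-finite value");
        }
        self.bits.store(value.to_bits(), Ordering::Relaxed);
        Ok(())
    }

    /// Add a delta with a compare-and-swap loop: read the bits, compute the
    /// new float, and retry if another thread changed the value meanwhile.
    pub fn add(&self, delta: f64) {
        let mut current = self.bits.load(Ordering::Relaxed);
        loop {
            let next = (f64::from_bits(current) + delta).to_bits();
            match self.bits.compare_exchange_weak(
                current,
                next,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return,
                Err(observed) => current = observed,
            }
        }
    }

    pub fn get(&self) -> f64 {
        f64::from_bits(self.bits.load(Ordering::Relaxed))
    }
}

fn main() {
    let g = Gauge::new();
    g.try_set(87.3).unwrap();
    g.add(1.5);
    println!("gauge = {}", g.get());
    println!("NaN accepted? {}", g.try_set(f64::NAN).is_ok());
}
```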

Telemetry & Exporters (v0.9.3+)

Five built-in exporters render the registry into the format your backend speaks:

Backend | Module | Feature flag | Output
Prometheus text | metrics_lib::exporters::prometheus | (always on) | String (text/plain; version=0.0.4)
OpenMetrics text | metrics_lib::exporters::openmetrics | (always on) | String (application/openmetrics-text)
JSON snapshot | metrics_lib::exporters::json | serde | RegistrySnapshot / String
StatsD UDP push | metrics_lib::exporters::statsd | statsd | UDP datagrams via StatsdSink (DogStatsD tags)
OTLP/HTTP+JSON | metrics_lib::exporters::otlp | otlp (→ serde) | String (POST to /v1/metrics)

All exporters honour LabelSet and MetricMetadata (help text + unit + kind) — # HELP / # TYPE / # UNIT lines, OTLP description / unit, StatsD tags.
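
To make the `# HELP` / `# TYPE` lines concrete, here is a small standard-library sketch that renders one labeled counter family in the Prometheus text exposition style. The `Sample` struct and `render_counter` function are hypothetical names for this illustration and do not reflect the exporter's real API; label-value escaping and `# UNIT` lines are left out for brevity.

```rust
/// Hypothetical sample: one labeled counter value.
pub struct Sample<'a> {
    pub labels: &'a [(&'a str, &'a str)],
    pub value: u64,
}

/// Render one counter family: a HELP line, a TYPE line, then one line
/// per sample with its labels in `{key="value",...}` form.
pub fn render_counter(name: &str, help: &str, samples: &[Sample]) -> String {
    let mut out = String::new();
    out.push_str(&format!("# HELP {name} {help}\n"));
    out.push_str(&format!("# TYPE {name} counter\n"));
    for s in samples {
        let labels: Vec<String> = s
            .labels
            .iter()
            .map(|(k, v)| format!("{k}=\"{v}\""))
            .collect();
        if labels.is_empty() {
            out.push_str(&format!("{name} {}\n", s.value));
        } else {
            out.push_str(&format!("{name}{{{}}} {}\n", labels.join(","), s.value));
        }
    }
    out
}

fn main() {
    let samples = [Sample {
        labels: &[("method", "GET"), ("status", "200")],
        value: 3,
    }];
    print!(
        "{}",
        render_counter("http_requests", "Total HTTP requests", &samples)
    );
}
```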

End-to-end runnable demos: labels_demo, histogram_latency, prometheus_endpoint, statsd_push, otlp_push, snapshot_serde.

Advanced Features

  • Hot-path lock-free — pure atomic operations on every increment/record/observe
  • Async Native — first-class async/await support with zero-cost abstractions
  • Resilience — circuit breakers, adaptive sampling, backpressure control
  • Cross-Platform — Linux (/proc), macOS, Windows (sysinfo)
  • Cache-Aligned — 64-byte alignment prevents false sharing

API Overview

For a complete reference with examples, see docs/API.md.

  • Counter — ultra-fast atomic counters with batch and conditional ops
  • Gauge — atomic f64 gauges with math ops, EMA, and min/max helpers
  • Timer — nanosecond timers, RAII guards, and closure/async timing
  • RateMeter — tumbling-window rate tracking and bursts
  • Histogram — bucketed observations with sum/count and approximate quantiles (v0.9.3)
  • LabelSet — labeled metric instances with bounded cardinality (v0.9.3)
  • SystemHealth — CPU, memory, load, threads, FDs, health score
  • Exporters — Prometheus, OpenMetrics, JSON snapshot, StatsD UDP, OTLP/HTTP+JSON (v0.9.3)
  • Async support: AsyncTimerExt, AsyncMetricBatch
  • Adaptive controls — sampling, circuit breaker, backpressure
  • Prelude — convenient re-exports
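
The histogram entry above mentions bucketed observations and approximate quantiles. One common way to do that, sketched here with the standard library only, is to keep a count per upper bound and report the bound of the bucket where the cumulative count reaches the requested fraction. The `Histogram` type below is a hypothetical illustration (it also omits the running sum); the library's bucket layout and quantile method may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical fixed-bucket histogram; upper bounds are inclusive.
pub struct Histogram {
    bounds: Vec<f64>,        // sorted upper bounds, e.g. [1.0, 5.0, 10.0]
    buckets: Vec<AtomicU64>, // one per bound, plus a final overflow bucket
    count: AtomicU64,
}

impl Histogram {
    pub fn new(bounds: Vec<f64>) -> Self {
        let buckets = (0..=bounds.len()).map(|_| AtomicU64::new(0)).collect();
        Histogram { bounds, buckets, count: AtomicU64::new(0) }
    }

    pub fn observe(&self, value: f64) {
        // Index of the first bound >= value; bounds.len() is the overflow bucket.
        let idx = self.bounds.partition_point(|b| *b < value);
        self.buckets[idx].fetch_add(1, Ordering::Relaxed);
        self.count.fetch_add(1, Ordering::Relaxed);
    }

    /// Approximate quantile: the upper bound of the bucket in which the
    /// cumulative count first reaches q * total. Returns None when the
    /// histogram is empty or the quantile falls in the overflow bucket.
    pub fn quantile(&self, q: f64) -> Option<f64> {
        let total = self.count.load(Ordering::Relaxed);
        if total == 0 {
            return None;
        }
        let target = (q * total as f64).ceil().max(1.0) as u64;
        let mut cumulative = 0;
        for (i, bucket) in self.buckets.iter().enumerate() {
            cumulative += bucket.load(Ordering::Relaxed);
            if cumulative >= target {
                return self.bounds.get(i).copied();
            }
        }
        None
    }
}

fn main() {
    let h = Histogram::new(vec![1.0, 5.0, 10.0]);
    for v in [0.5, 2.0, 3.0, 7.0] {
        h.observe(v);
    }
    println!("p50 <= {:?}", h.quantile(0.5));
    println!("p100 <= {:?}", h.quantile(1.0));
}
```

The estimate is only as fine as the bucket bounds, which is why observe cost and accuracy both depend on the bucket count.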

Error handling: try_ variants

All core metrics expose non-panicking try_ methods that validate inputs and return Result<_, MetricsError> instead of panicking:

  • Counter: try_inc, try_add, try_set, try_fetch_add, try_inc_and_get
  • Gauge: try_set, try_add, try_sub, try_set_max, try_set_min
  • Timer: try_record_ns, try_record, try_record_batch
  • RateMeter: try_tick, try_tick_n, try_tick_if_under_limit

Error semantics:

  • MetricsError::Overflow — arithmetic would overflow/underflow an internal counter.
  • MetricsError::InvalidValue { reason } — non-finite or otherwise invalid input (e.g., NaN for Gauge).
  • MetricsError::OverLimit — operation would exceed a configured limit (e.g., rate limiting helpers).
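
The three error cases can be illustrated with a standard-library sketch. The `MetricsError` enum below is a local stand-in that mirrors the variants listed above, and `Counter`, `validate_gauge_input`, and `check_limit` are hypothetical helpers; none of this is the library's source.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Local stand-in mirroring the variants described above.
#[derive(Debug, PartialEq)]
pub enum MetricsError {
    Overflow,
    InvalidValue { reason: &'static str },
    OverLimit,
}

/// Hypothetical counter used to contrast plain and checked adds.
pub struct Counter(AtomicU64);

impl Counter {
    pub fn new(start: u64) -> Self {
        Counter(AtomicU64::new(start))
    }

    /// Plain add: in this sketch it wraps on overflow (fetch_add semantics).
    pub fn add(&self, n: u64) {
        self.0.fetch_add(n, Ordering::Relaxed);
    }

    /// Checked add: leaves the value untouched and reports Overflow.
    pub fn try_add(&self, n: u64) -> Result<(), MetricsError> {
        self.0
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |v| v.checked_add(n))
            .map(|_| ())
            .map_err(|_| MetricsError::Overflow)
    }

    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Reject NaN and infinities before they reach a gauge.
pub fn validate_gauge_input(value: f64) -> Result<f64, MetricsError> {
    if value.is_finite() {
        Ok(value)
    } else {
        Err(MetricsError::InvalidValue { reason: "value is not finite" })
    }
}

/// Admit an event only while the observed rate is under the limit.
pub fn check_limit(current_rate: f64, limit: f64) -> Result<(), MetricsError> {
    if current_rate < limit {
        Ok(())
    } else {
        Err(MetricsError::OverLimit)
    }
}

fn main() {
    let c = Counter::new(u64::MAX - 1);
    println!("{:?}", c.try_add(1)); // fits
    println!("{:?}", c.try_add(1)); // would overflow
    println!("{:?}", validate_gauge_input(f64::NAN));
    println!("{:?}", check_limit(1500.0, 1000.0));
}
```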

Example:

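use metrics_lib::{init, metrics, MetricsError};

// `?` needs an enclosing function that returns Result, so the calls live in main.
fn main() -> Result<(), MetricsError> {
    init();
    let c = metrics().counter("jobs");
    c.try_add(10)?;      // Result<(), MetricsError>
    let r = metrics().rate("qps");
    let allowed = r.try_tick_if_under_limit(1000.0)?; // Result<bool, MetricsError>
    println!("allowed: {allowed}");
    Ok(())
}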

Panic guarantees: the plain methods (inc, add, set, tick, etc.) prioritize speed and may saturate or assume valid inputs. Prefer try_ variants when you need explicit error handling.

Installation

Add to your Cargo.toml:

[dependencies]
# Use exactly one of the lines below; TOML does not allow duplicate keys.
metrics-lib = "1.0.0"

# Optional features
# metrics-lib = { version = "1.0.0", features = ["async"] }

# Full feature set (stable + async + serde)
# metrics-lib = { version = "1.0.0", features = ["full"] }

Quick Start

use metrics_lib::{init, metrics};

// Initialize once at startup
init();

// Counters
metrics().counter("requests").inc();
metrics().counter("errors").add(5);

// Gauges
metrics().gauge("cpu_usage").set(87.3);
metrics().gauge("memory_gb").add(1.5);

// Timers - automatic RAII timing
{
    let _timer = metrics().timer("api_call").start();
    // Your code here - automatically timed on drop
}

// Or time a closure
let result = metrics().time("db_query", || {
    // Database operation
    "user_data"
});

// System health monitoring
let cpu = metrics().system().cpu_used();
let memory_gb = metrics().system().mem_used_gb();

// Rate metering
metrics().rate("api_calls").tick();
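
For readers unfamiliar with tumbling windows, the idea behind the rate meter can be sketched as follows: events are counted in fixed, non-overlapping windows, and the reported rate comes from the last completed window. The `RateMeter` type here is a hypothetical, single-threaded illustration that takes timestamps as arguments so it stays deterministic; the library's meter is atomic and reads the clock itself, so its internals will differ.

```rust
/// Hypothetical tumbling-window meter. Time is passed in explicitly;
/// `window_ns` must be greater than zero.
pub struct RateMeter {
    window_ns: u64,
    window_start_ns: u64,
    current: u64,  // events in the window that is still filling
    previous: u64, // events in the last completed window
}

impl RateMeter {
    pub fn new(window_ns: u64) -> Self {
        RateMeter { window_ns, window_start_ns: 0, current: 0, previous: 0 }
    }

    /// Close the current window if `now_ns` has moved past its end.
    fn roll(&mut self, now_ns: u64) {
        let elapsed = now_ns.saturating_sub(self.window_start_ns);
        if elapsed >= self.window_ns {
            // If two or more windows went by, the last completed one was empty.
            self.previous = if elapsed >= 2 * self.window_ns { 0 } else { self.current };
            self.current = 0;
            // Align the new window start to a multiple of the window length.
            self.window_start_ns = now_ns - elapsed % self.window_ns;
        }
    }

    pub fn tick(&mut self, now_ns: u64) {
        self.roll(now_ns);
        self.current += 1;
    }

    /// Events per second over the last completed window.
    pub fn rate(&mut self, now_ns: u64) -> f64 {
        self.roll(now_ns);
        self.previous as f64 * 1e9 / self.window_ns as f64
    }
}

fn main() {
    const SECOND: u64 = 1_000_000_000;
    let mut meter = RateMeter::new(SECOND);
    for i in 0..3 {
        meter.tick(i * 100_000_000); // three events in the first window
    }
    println!("rate after 1.5 s: {}/s", meter.rate(SECOND + SECOND / 2));
    println!("rate after 3.5 s: {}/s", meter.rate(3 * SECOND + SECOND / 2));
}
```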

Telemetry & Exporters (v0.9.3+)

Five built-in exporters render the current registry state to popular backends. All exporters share label and metadata support.

use metrics_lib::{init, metrics, LabelSet, Unit};
use metrics_lib::exporters::{prometheus, openmetrics};

init();

// One-time metric descriptions feed `# HELP` / `# TYPE` / `# UNIT` lines.
metrics().registry().describe_counter(
    "http_requests",
    "Total HTTP requests",
    Unit::Custom("1"),
);

// Labeled metrics — `(name, labels)` is the identity.
let labels = LabelSet::from([("method", "GET"), ("status", "200")]);
metrics().counter_with("http_requests", &labels).inc();

// Render the registry to Prometheus text format.
let body = prometheus::render(metrics().registry());
// Or OpenMetrics:
let body_om = openmetrics::render(metrics().registry());

Exporter | Feature flag | Module | Output
Prometheus text | (always on) | metrics_lib::exporters::prometheus | String
OpenMetrics text | (always on) | metrics_lib::exporters::openmetrics | String
JSON snapshot | serde | metrics_lib::exporters::json | String / RegistrySnapshot
StatsD UDP push | statsd | metrics_lib::exporters::statsd | UDP packets via StatsdSink
OTLP/HTTP+JSON | otlp (→ serde) | metrics_lib::exporters::otlp | String (POST to /v1/metrics)

End-to-end examples live in examples/: labels_demo, histogram_latency, prometheus_endpoint, statsd_push, otlp_push, snapshot_serde.

Observability Quick Start

  • Integration Examples: see docs/API.md#integration-examples
  • Grafana dashboard (ready to import): observability/grafana-dashboard.json
  • Prometheus recording rules: observability/recording-rules.yaml
  • Kubernetes Service: docs/k8s/service.yaml
  • Prometheus Operator ServiceMonitor: docs/k8s/servicemonitor.yaml
  • Secured ServiceMonitor (TLS/Bearer): docs/k8s/servicemonitor-secured.yaml

Commands

# Import Grafana dashboard via API
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <GRAFANA_API_TOKEN>" \
  http://<grafana-host>/api/dashboards/db \
  -d @observability/grafana-dashboard.json

# Validate Prometheus recording rules
promtool check rules observability/recording-rules.yaml

# Apply Kubernetes manifests
kubectl apply -f docs/k8s/service.yaml
kubectl apply -f docs/k8s/servicemonitor.yaml
# For secured endpoints
kubectl apply -f docs/k8s/servicemonitor-secured.yaml

Advanced Usage

Async Support

use std::time::Duration;
use metrics_lib::{metrics, AsyncMetricBatch, AsyncTimerExt};

// Async timing with zero overhead and typed result
let result: &str = metrics()
    .timer("async_work")
    .time_async(|| async {
        tokio::time::sleep(Duration::from_millis(10)).await;
        "completed"
    })
    .await;

// Batched async updates (flush takes &MetricsCore)
let mut batch = AsyncMetricBatch::new();
batch.counter_inc("requests", 1);
batch.gauge_set("cpu", 85.2);
batch.flush(metrics());

Examples

Run these self-contained examples to see the library in action:

  • Quick Start

    • File: examples/quick_start.rs
    • Run:
      cargo run --example quick_start --release
      
  • Streaming Rate Window

    • File: examples/streaming_rate_window.rs
    • Run:
      cargo run --example streaming_rate_window --release
      
  • Axum Registry Integration (minimal web service)

    • File: examples/axum_registry_integration.rs
    • Run:
      cargo run --example axum_registry_integration --release
      
    • Endpoints:
      • GET /health — liveness probe
      • GET /metrics-demo — updates metrics (counter/gauge/timer/rate)
      • GET /export — returns a JSON snapshot of selected metrics
  • Quick Tour

    • File: examples/quick_tour.rs
    • Run:
      cargo run --example quick_tour --release
      
  • Async Batch + Timing

    • File: examples/async_batch_timing.rs
    • Run:
      cargo run --example async_batch_timing --release
      
  • Token Bucket Rate Limiter

    • File: examples/token_bucket_limiter.rs
    • Run:
      cargo run --example token_bucket_limiter --release
      
  • Custom Exporter (OpenMetrics-like)

    • File: examples/custom_exporter_openmetrics.rs
    • Run:
      cargo run --example custom_exporter_openmetrics --release
      
  • Axum Middleware Metrics (minimal)

    • File: examples/axum_middleware_metrics.rs
    • Run:
      cargo run --example axum_middleware_metrics --release
      
  • Contention & Admission Demo

    • File: examples/contention_admission.rs
    • Run:
      cargo run --example contention_admission --release
      
  • CPU Stats Overview

    • File: examples/cpu_stats.rs
    • Run:
      cargo run --example cpu_stats --release
      
  • Memory Stats Overview

    • File: examples/memory_stats.rs
    • Run:
      cargo run --example memory_stats --release
      
  • Health Dashboard

    • File: examples/health_dashboard.rs
    • Run:
      cargo run --example health_dashboard --release
      
  • Cache Hit/Miss

    • File: examples/cache_hit_miss.rs
    • Run:
      cargo run --example cache_hit_miss --release
      
  • Broker Throughput

    • File: examples/broker_throughput.rs
    • Run:
      cargo run --example broker_throughput --release
      

More Real-World Examples (API Reference)

Running many examples quickly

For convenience, a helper script runs a curated set of non-blocking examples sequentially in release mode (it skips server examples such as the Axum middleware):

bash tools/run_examples.sh

You can also pass a custom comma-separated list via EXAMPLES:

EXAMPLES="quick_start,quick_tour,cpu_stats" bash tools/run_examples.sh

Resilience Features

use std::time::Duration;
use metrics_lib::{metrics, AdaptiveSampler, SamplingStrategy, MetricCircuitBreaker};

// Placeholder measurement (assumed to be a std::time::Duration);
// in real code this comes from timing the operation.
let duration = Duration::from_millis(5);

// Adaptive sampling under load
let sampler = AdaptiveSampler::new(SamplingStrategy::Dynamic {
    min_rate: 1,
    max_rate: 100,
    target_throughput: 10000,
});

if sampler.should_sample() {
    metrics().timer("expensive_op").record(duration);
}

// Circuit breaker protection
let breaker = MetricCircuitBreaker::new(Default::default());
if breaker.is_allowed() {
    // Perform operation
    breaker.record_success();
} else {
    // Circuit is open, skip operation
}
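
To show the general shape of these two controls, here is a standard-library sketch of a fixed-rate sampler and a consecutive-failure breaker. `EveryNth` and `Breaker` are hypothetical types written for this illustration. They are much simpler than what `AdaptiveSampler` and `MetricCircuitBreaker` are described as doing: the sampler does not adapt to load, and the breaker has no timeout or half-open recovery.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Hypothetical fixed-rate sampler: admits the 1st, (n+1)th, (2n+1)th... call.
pub struct EveryNth {
    n: u64,
    seen: AtomicU64,
}

impl EveryNth {
    pub fn new(n: u64) -> Self {
        // Guard against n == 0, which would make the modulo below panic.
        EveryNth { n: n.max(1), seen: AtomicU64::new(0) }
    }

    pub fn should_sample(&self) -> bool {
        self.seen.fetch_add(1, Ordering::Relaxed) % self.n == 0
    }
}

/// Hypothetical breaker: opens after `threshold` consecutive failures and
/// closes again on the next recorded success.
pub struct Breaker {
    threshold: u32,
    consecutive_failures: AtomicU32,
}

impl Breaker {
    pub fn new(threshold: u32) -> Self {
        Breaker { threshold, consecutive_failures: AtomicU32::new(0) }
    }

    pub fn is_allowed(&self) -> bool {
        self.consecutive_failures.load(Ordering::Relaxed) < self.threshold
    }

    pub fn record_success(&self) {
        self.consecutive_failures.store(0, Ordering::Relaxed);
    }

    pub fn record_failure(&self) {
        self.consecutive_failures.fetch_add(1, Ordering::Relaxed);
    }
}

fn main() {
    let sampler = EveryNth::new(3);
    let sampled = (0..9).filter(|_| sampler.should_sample()).count();
    println!("sampled {sampled} of 9 calls");

    let breaker = Breaker::new(2);
    breaker.record_failure();
    breaker.record_failure();
    println!("allowed after two failures: {}", breaker.is_allowed());
}
```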

System Monitoring

let health = metrics().system();

println!("CPU: {:.1}%", health.cpu_used());
println!("Memory: {:.1} GB", health.mem_used_gb());
println!("Load: {:.2}", health.load_avg());
println!("Threads: {}", health.thread_count());

Benchmarks

Run the included benchmarks to see performance on your system:

# Basic performance comparison
cargo run --example benchmark_comparison --release

# Comprehensive benchmarks (Criterion)
cargo bench --bench metrics_bench --features meter

# Cross-platform system tests
cargo test --all-features

Interpreting Criterion Results

  • Criterion writes reports to target/criterion/ with per-benchmark statistics and comparisons.
  • Key numbers to watch: time: [low … mean … high] and outlier percentages.
  • Compare runs over time to detect regressions. Store artifacts from CI for historical comparison.
  • Benchmarks are microbenchmarks; validate with end-to-end measurements as needed.

CI Artifacts

  • Pull Requests: CI runs a fast smoke bench and uploads criterion-reports with target/criterion.
  • Nightly: The Benchmarks workflow runs full-duration benches on Linux/macOS/Windows and uploads artifacts as benchmark-results-<os>.
  • You can download these artifacts from the GitHub Actions run page to compare results across commits.

Latest CI Benchmarks

View the latest nightly results and artifacts here:

Latest CI Benchmarks (Benchmarks workflow)

Benchmark history (GitHub Pages):

Benchmark History (gh-pages)

Sample Results (latest local run; Windows x86_64, Rust stable):

Counter Increment: 1.48 ns/op (676.36 M ops/sec)
Gauge Set:         0.40 ns/op (2500.31 M ops/sec)
Timer Record:      3.17 ns/op (314.99 M ops/sec)
Mixed Operations:  151.58 ns/op (6.60 M ops/sec)

Notes: Latest numbers taken from local Criterion means under target/criterion/**/new/estimates.json. Actual throughput varies by CPU and environment; use the GitHub Pages benchmark history for trends.
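
The throughput figures follow from the latency figures: operations per second is the reciprocal of seconds per operation, so M ops/sec is roughly 1000 divided by ns/op. A short sketch of that arithmetic is below (`mops_per_sec` is a hypothetical helper). Results computed from the rounded ns/op values can differ slightly from the listed throughput, which would be expected if the listed figures were derived from unrounded means.

```rust
/// Convert a mean latency in nanoseconds per operation into millions of
/// operations per second: (1e9 / ns) / 1e6 = 1000 / ns.
pub fn mops_per_sec(ns_per_op: f64) -> f64 {
    1_000.0 / ns_per_op
}

fn main() {
    for (name, ns) in [("counter", 1.48), ("gauge", 0.40), ("timer", 3.17)] {
        println!("{name}: {:.2} M ops/sec", mops_per_sec(ns));
    }
}
```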

Methodology

  • Tooling: Criterion with release builds.
  • Flags for stability on local runs: cargo bench --bench metrics_bench --features meter -- -w 3.0 -m 5.0 -n 100 (increase on dedicated runners).
  • Environment disclosure (example):
    • CPU: Apple M1 Pro (performance cores)
    • Rust: stable toolchain
    • Target: aarch64-apple-darwin
    • Governor: default (for CI use a performance governor where applicable)

See also: docs/zero-overhead-proof.md for assembly inspection and binary size analysis, and docs/performance-tuning.md for environment hardening.

Architecture

Lock-Free Design

  • Atomic Operations: All metrics use Relaxed ordering for maximum performance
  • Cache-Line Alignment: 64-byte alignment eliminates false sharing
  • Compare-and-Swap: Lock-free min/max tracking in timers
  • Thread-Local Storage: Fast random number generation for sampling
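
The compare-and-swap approach to min/max tracking can be sketched with `std` atomics. The `MinMax` type is a hypothetical illustration, not the timer's actual fields: the minimum uses an explicit CAS loop to show the technique, while the maximum uses `fetch_max`, which `std` provides for the same purpose.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical timer statistics; only min/max are shown.
pub struct MinMax {
    min_ns: AtomicU64,
    max_ns: AtomicU64,
}

impl MinMax {
    pub fn new() -> Self {
        MinMax { min_ns: AtomicU64::new(u64::MAX), max_ns: AtomicU64::new(0) }
    }

    pub fn record(&self, ns: u64) {
        // Explicit CAS loop for the minimum: retry until our value is stored
        // or another thread has already published something smaller.
        let mut seen = self.min_ns.load(Ordering::Relaxed);
        while ns < seen {
            match self.min_ns.compare_exchange_weak(
                seen,
                ns,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => break,
                Err(actual) => seen = actual,
            }
        }
        // The same update for the maximum, in a single std call.
        self.max_ns.fetch_max(ns, Ordering::Relaxed);
    }

    /// Smallest recorded value (u64::MAX if nothing was recorded).
    pub fn min(&self) -> u64 {
        self.min_ns.load(Ordering::Relaxed)
    }

    /// Largest recorded value (0 if nothing was recorded).
    pub fn max(&self) -> u64 {
        self.max_ns.load(Ordering::Relaxed)
    }
}

fn main() {
    let stats = MinMax::new();
    // Four threads record disjoint ranges concurrently without any lock.
    std::thread::scope(|s| {
        for t in 0..4u64 {
            let stats = &stats;
            s.spawn(move || {
                for i in 1..=1_000u64 {
                    stats.record(t * 1_000 + i);
                }
            });
        }
    });
    println!("min = {} ns, max = {} ns", stats.min(), stats.max());
}
```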

Memory Layout

#[repr(align(64))]
pub struct Counter {
    value: AtomicU64,           // 8 bytes
    // 56 bytes padding to cache line boundary
}
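
The effect of the alignment attribute can be checked directly. In this sketch, `AlignedCounter` and `PlainCounter` are hypothetical types: `#[repr(align(64))]` raises both the alignment and the size to 64 bytes, so adjacent counters in an array never share a 64-byte line.

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::AtomicU64;

/// Padded to a full 64-byte line by the alignment attribute.
#[repr(align(64))]
pub struct AlignedCounter {
    pub value: AtomicU64,
}

/// Same field without the attribute, for comparison.
pub struct PlainCounter {
    pub value: AtomicU64,
}

fn main() {
    println!(
        "aligned: size = {}, align = {}",
        size_of::<AlignedCounter>(),
        align_of::<AlignedCounter>()
    );
    println!(
        "plain:   size = {}, align = {}",
        size_of::<PlainCounter>(),
        align_of::<PlainCounter>()
    );
    // Four aligned counters occupy four separate 64-byte slots.
    println!("array of 4 aligned: {} bytes", size_of::<[AlignedCounter; 4]>());
}
```

The trade-off is memory: each padded metric occupies a full 64 bytes rather than 8.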

Zero-Cost Abstractions

  • RAII Timers: Compile-time guaranteed cleanup
  • Async Guards: No allocation futures for timing
  • Batch Operations: Vectorized updates for efficiency
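
The RAII timer pattern relies on `Drop`: the guard captures a start instant and records the elapsed time when it leaves scope, including on early returns. The `Timer` and `TimerGuard` types below are a hypothetical standard-library sketch of that pattern, not the library's implementation.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

/// Hypothetical timer: total elapsed nanoseconds plus a call count.
#[derive(Default)]
pub struct Timer {
    total_ns: AtomicU64,
    count: AtomicU64,
}

/// Guard that records the elapsed time when it goes out of scope.
pub struct TimerGuard<'a> {
    timer: &'a Timer,
    started: Instant,
}

impl Timer {
    pub fn start(&self) -> TimerGuard<'_> {
        TimerGuard { timer: self, started: Instant::now() }
    }

    pub fn count(&self) -> u64 {
        self.count.load(Ordering::Relaxed)
    }

    pub fn total_ns(&self) -> u64 {
        self.total_ns.load(Ordering::Relaxed)
    }
}

impl Drop for TimerGuard<'_> {
    fn drop(&mut self) {
        let ns = self.started.elapsed().as_nanos() as u64;
        self.timer.total_ns.fetch_add(ns, Ordering::Relaxed);
        self.timer.count.fetch_add(1, Ordering::Relaxed);
    }
}

fn main() {
    let timer = Timer::default();
    {
        // Bind the guard to a named variable; `let _ = timer.start();`
        // would drop it immediately and measure nothing.
        let _guard = timer.start();
        std::thread::sleep(Duration::from_millis(5));
    } // guard dropped here, measurement recorded

    println!("calls = {}, total = {} ns", timer.count(), timer.total_ns());
}
```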

Testing

Comprehensive automated coverage includes:

  • default features: 63 unit tests + 2 API smoke tests + 14 rustdoc tests
  • all features: 110 unit tests + 3 API smoke tests + 17 rustdoc tests

# Run all tests
cargo test

# Test with all features
cargo test --all-features

# Run only bench-gated tests (feature-flagged and ignored by default)
cargo test --features bench-tests -- --ignored

# Run benchmarks (Criterion)
cargo bench --bench metrics_bench --features meter

# Run the tests on the Linux target (to check for memory leaks, run the resulting test binaries under valgrind)
cargo test --target x86_64-unknown-linux-gnu

Cross-Platform Support

Tier 1 Support:

  • ✅ Linux (x86_64, aarch64)
  • ✅ macOS (x86_64, Apple Silicon)
  • ✅ Windows (x86_64)

System Integration:

  • Linux: /proc filesystem, sysinfo APIs
  • macOS: mach system calls, sysctl APIs
  • Windows: Performance counters, WMI integration

Graceful Fallbacks:

  • Unsupported platforms default to portable implementations
  • Feature detection at runtime
  • No panics on missing system features

Performance Notes

Latest local Criterion means (Windows x86_64, Rust stable, release build, held Arc<Counter> / Arc<Gauge> / Arc<Timer> handle — see Methodology above):

Operation | ns/op | M ops/sec | Memory/metric
Counter increment | 1.48 | 676.36 | 64 B
Gauge set | 0.40 | 2500.31 | 64 B
Timer record | 3.17 | 314.99 | 64 B

Calling metrics().counter("name") on every increment is slower than holding the Arc: the global lookup costs an RwLock read + HashMap hit + Arc::clone(). Cache the handle in hot loops. A side-by-side bench (global_metrics group in cargo bench) shows the realistic global-lookup cost for comparison.

A populated head-to-head comparison against metrics-rs, prometheus, and statsd will ship with the v1.0.0 release once equivalent test fixtures are in place.

Configuration

Feature Flags

Feature | Description
count | Counter metric type
gauge | Gauge metric type
timer | Timer metric type
meter | Rate meter metric type
sample | Statistical sampling
histogram | Histogram support (requires sample)
async | Async/await support (requires Tokio)
serde | Serde serialization support
all | All stable features (excludes async and serde)
full | All features including async and serde
minimal | Smallest useful build (counter only)

# All stable features:
metrics-lib = { version = "1.0.0", features = ["all"] }

# Full build including async and serde:
metrics-lib = { version = "1.0.0", features = ["full"] }

# Minimal build (counter only):
metrics-lib = { version = "1.0.0", features = ["minimal"] }

Runtime Configuration

use metrics_lib::{init_with_config, Config};

let config = Config {
    max_metrics: 10000,
    update_interval_ms: 1000,
    enable_system_metrics: true,
};

init_with_config(config);

Contributing

We welcome contributions! Please see our Contributing Guide.

Development Setup

# Clone repository
git clone https://github.com/jamesgober/metrics-lib.git
cd metrics-lib

# Run tests
cargo test --all-features

# Run benchmarks
cargo bench --bench metrics_bench --features meter

# Check formatting and lints
cargo fmt --all -- --check
cargo clippy --all-features -- -D warnings
