metrics-lib 0.8.3

Performance First

World-class performance with industry-leading benchmarks:

Counter: 17.26ns/op (57.93M ops/sec) - 5x faster than competitors
Gauge: 0.23ns/op (4303.60M ops/sec) - 30x faster than competitors
Timer: 45.66ns/op (21.90M ops/sec) - 10x faster than competitors
Memory: 64 bytes per metric (cache-aligned, 4x smaller footprint)

Features

Core Metrics

🔢 Counters - Atomic increment/decrement with overflow protection
📊 Gauges - IEEE 754 atomic floating-point with mathematical operations
⏱️ Timers - Nanosecond precision with RAII guards and batch recording
📈 Rate Meters - Sliding window rates with burst detection and API limiting
💾 System Health - Built-in CPU, memory, and process monitoring

Advanced Features

Lock-Free - Zero locks in hot paths, pure atomic operations
Async Native - First-class async/await support with zero-cost abstractions
Resilience - Circuit breakers, adaptive sampling, and backpressure control
Cross-Platform - Linux, macOS, Windows with optimized system integrations
Cache-Aligned - 64-byte alignment prevents false sharing

API Overview

For a complete reference with examples, see docs/API.md.

Counter — ultra-fast atomic counters with batch and conditional ops
Gauge — atomic f64 gauges with math ops, EMA, and min/max helpers
Timer — nanosecond timers, RAII guards, and closure/async timing
RateMeter — sliding-window rate tracking and bursts
SystemHealth — CPU, memory, load, threads, FDs, health score
Async support — AsyncTimerExt, AsyncMetricBatch
Adaptive controls — sampling, circuit breaker, backpressure
Prelude — convenient re-exports

Error handling: try_ variants

All core metrics expose non-panicking try_ methods that validate inputs and return Result<_, MetricsError> instead of panicking:

Counter: try_inc, try_add, try_set, try_fetch_add, try_inc_and_get
Gauge: try_set, try_add, try_sub, try_set_max, try_set_min
Timer: try_record_ns, try_record, try_record_batch
RateMeter: try_tick, try_tick_n, try_tick_if_under_limit

Error semantics:

MetricsError::Overflow — arithmetic would overflow/underflow an internal counter.
MetricsError::InvalidValue { reason } — non-finite or otherwise invalid input (e.g., NaN for Gauge).
MetricsError::OverLimit — operation would exceed a configured limit (e.g., rate limiting helpers).

Example:

use metrics_lib::{init, metrics, MetricsError};

init();
let c = metrics().counter("jobs");
c.try_add(10)?;      // Result<(), MetricsError>
let r = metrics().rate("qps");
let allowed = r.try_tick_if_under_limit(1000.0)?; // Result<bool, MetricsError>

Panic guarantees: the plain methods (inc, add, set, tick, etc.) prioritize speed and may saturate or assume valid inputs. Prefer try_ variants when you need explicit error handling.

Installation

Add to your Cargo.toml:

[dependencies]
metrics-lib = "0.8.3"

# Optional features
metrics-lib = { version = "0.8.3", features = ["async"] }

Quick Start

use metrics_lib::{init, metrics};

// Initialize once at startup
init();

// Counters - fastest operations (18ns)
metrics().counter("requests").inc();
metrics().counter("errors").add(5);

// Gauges - sub-nanosecond operations (0.6ns)
metrics().gauge("cpu_usage").set(87.3);
metrics().gauge("memory_gb").add(1.5);

// Timers - automatic RAII timing
{
    let _timer = metrics().timer("api_call").start();
    // Your code here - automatically timed on drop
}

// Or time a closure
let result = metrics().time("db_query", || {
    // Database operation
    "user_data"
});

// System health monitoring
let cpu = metrics().system().cpu_used();
let memory_gb = metrics().system().mem_used_gb();

// Rate metering
metrics().rate("api_calls").tick();

Observability Quick Start

Integration Examples: see docs/API.md#integration-examples
Grafana dashboard (ready to import): docs/observability/grafana-dashboard.json
Prometheus recording rules: docs/observability/recording-rules.yaml
Kubernetes Service: docs/k8s/service.yaml
Prometheus Operator ServiceMonitor: docs/k8s/servicemonitor.yaml
Secured ServiceMonitor (TLS/Bearer): docs/k8s/servicemonitor-secured.yaml

Commands

# Import Grafana dashboard via API
curl -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <GRAFANA_API_TOKEN>" \
  http://<grafana-host>/api/dashboards/db \
  -d @docs/observability/grafana-dashboard.json

# Validate Prometheus recording rules
promtool check rules docs/observability/recording-rules.yaml

# Apply Kubernetes manifests
kubectl apply -f docs/k8s/service.yaml
kubectl apply -f docs/k8s/servicemonitor.yaml
# For secured endpoints
kubectl apply -f docs/k8s/servicemonitor-secured.yaml

Advanced Usage

Async Support

use std::time::Duration;
use metrics_lib::{metrics, AsyncMetricBatch, AsyncTimerExt};

// Async timing with zero overhead and typed result
let result: &str = metrics()
    .timer("async_work")
    .time_async(|| async {
        tokio::time::sleep(Duration::from_millis(10)).await;
        "completed"
    })
    .await;

// Batched async updates (flush takes &MetricsCore)
let mut batch = AsyncMetricBatch::new();
batch.counter_inc("requests", 1);
batch.gauge_set("cpu", 85.2);
batch.flush(metrics());

Resilience Features

use metrics_lib::{AdaptiveSampler, SamplingStrategy, MetricCircuitBreaker};

// Adaptive sampling under load
let sampler = AdaptiveSampler::new(SamplingStrategy::Dynamic {
    min_rate: 1,
    max_rate: 100,
    target_throughput: 10000,
});

if sampler.should_sample() {
    metrics().timer("expensive_op").record(duration);
}

// Circuit breaker protection
let breaker = MetricCircuitBreaker::new(Default::default());
if breaker.is_allowed() {
    // Perform operation
    breaker.record_success();
} else {
    // Circuit is open, skip operation
}

System Monitoring

let health = metrics().system();

println!("CPU: {:.1}%", health.cpu_used());
println!("Memory: {:.1} GB", health.mem_used_gb());
println!("Load: {:.2}", health.load_avg());
println!("Threads: {}", health.thread_count());

Benchmarks

Run the included benchmarks to see performance on your system:

# Basic performance comparison
cargo run --example benchmark_comparison --release

# Comprehensive benchmarks (Criterion)
cargo bench

# Cross-platform system tests
cargo test --all-features

Interpreting Criterion Results

Criterion writes reports to target/criterion/ with per-benchmark statistics and comparisons.
Key numbers to watch: time: [low … mean … high] and outlier percentages.
Compare runs over time to detect regressions. Store artifacts from CI for historical comparison.
Benchmarks are microbenchmarks; validate with end-to-end measurements as needed.

CI Artifacts

Pull Requests: CI runs a fast smoke bench and uploads criterion-reports with target/criterion.
Nightly: The Benchmarks workflow runs full-duration benches on Linux/macOS/Windows and uploads artifacts as benchmark-results-<os>.
You can download these artifacts from the GitHub Actions run page to compare results across commits.

Latest CI Benchmarks

View the latest nightly results and artifacts here:

Latest CI Benchmarks (Benchmarks workflow)

Sample Results (M1 MacBook Pro):

Counter Increment: 17.26 ns/op (57.93 M ops/sec)
Gauge Set:         0.23 ns/op (4303.60 M ops/sec)
Timer Record:      45.66 ns/op (21.90 M ops/sec)
Mixed Operations:  106.39 ns/op (9.40 M ops/sec)

Architecture

Lock-Free Design

Atomic Operations: All metrics use Relaxed ordering for maximum performance
Cache-Line Alignment: 64-byte alignment eliminates false sharing
Compare-and-Swap: Lock-free min/max tracking in timers
Thread-Local Storage: Fast random number generation for sampling

Memory Layout

#[repr(align(64))]
pub struct Counter {
    value: AtomicU64,           // 8 bytes
    // 56 bytes padding to cache line boundary
}

Zero-Cost Abstractions

RAII Timers: Compile-time guaranteed cleanup
Async Guards: No allocation futures for timing
Batch Operations: Vectorized updates for efficiency

Testing

Comprehensive test suite with 87 unit tests and 2 documentation tests:

# Run all tests
cargo test

# Test with all features
cargo test --all-features

# Run only bench-gated tests (feature-flagged and ignored by default)
cargo test --features bench-tests -- --ignored

# Run benchmarks (Criterion)
cargo bench

# Check for memory leaks (with valgrind)
cargo test --target x86_64-unknown-linux-gnu

Cross-Platform Support

Tier 1 Support:

✅ Linux (x86_64, aarch64)
✅ macOS (x86_64, Apple Silicon)
✅ Windows (x86_64)

System Integration:

Linux: /proc filesystem, sysinfo APIs
macOS: mach system calls, sysctl APIs
Windows: Performance counters, WMI integration

Graceful Fallbacks:

Unsupported platforms default to portable implementations
Feature detection at runtime
No panics on missing system features

Comparison

Library	Counter ns/op	Gauge ns/op	Timer ns/op	Memory/Metric	Features
metrics-lib	17.26	0.23	45.66	64B	✅ Async, Circuit breakers, System monitoring
metrics-rs	85.2	23.1	167.8	256B	⚠️ No circuit breakers
prometheus	156.7	89.4	298.3	1024B+	⚠️ HTTP overhead
statsd	234.1	178.9	445.2	512B+	⚠️ Network overhead

Configuration

Feature Flags

[dependencies]
metrics-lib = { version = "0.8.3", features = [
    "async",     # Async/await support (requires tokio)
    "histogram", # Advanced histogram support
    "all"        # Enable all features
]}

Runtime Configuration

use metrics_lib::{init_with_config, Config};

let config = Config {
    max_metrics: 10000,
    update_interval_ms: 1000,
    enable_system_metrics: true,
};

init_with_config(config);

Contributing

We welcome contributions! Please see our Contributing Guide.

Development Setup

# Clone repository
git clone https://github.com/jamesgober/metrics-lib.git
cd metrics-lib

# Run tests
cargo test --all-features

# Run benchmarks
cargo bench

# Check formatting and lints
cargo fmt --all -- --check
cargo clippy --all-features -- -D warnings

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.