tracing-throttle

High-performance log deduplication and rate limiting for the Rust tracing ecosystem.

Overview

tracing-throttle suppresses repetitive or bursty log events in high-volume systems. It helps you:

  • Reduce I/O bandwidth from repetitive logs
  • Improve log visibility by filtering noise
  • Lower storage costs for log aggregation
  • Prevent log backend overload during traffic spikes

The crate provides a tracing_subscriber::Layer that deduplicates events based on their signature (level, message, and structured fields) and applies configurable rate limiting policies.
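
For example, under the default policy the first two events below share one signature and are throttled together, while the third has a different message and is tracked separately (illustrative sketch):

use tracing::warn;

// Same level, message, and fields -> same signature, throttled as one group.
warn!(endpoint = "/api/users", "upstream timeout");
warn!(endpoint = "/api/users", "upstream timeout");

// Different message -> different signature, tracked independently.
warn!(endpoint = "/api/users", "connection refused");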

Features

  • 🚀 High Performance: Sharded maps and lock-free operations
  • 🎯 Flexible Policies: Count-based, time-window, exponential backoff, and custom policies
  • 📊 Per-signature Throttling: Events with identical signatures are throttled together
  • 💾 Memory Control: Optional LRU eviction to prevent unbounded memory growth
  • 📈 Observability Metrics: Built-in tracking of allowed, suppressed, and evicted events
  • 🛡️ Fail-Safe Circuit Breaker: Fails open to preserve observability during errors
  • ⏱️ Suppression Summaries: Periodic emission of suppression statistics (coming in v0.2)
  • 🔧 Easy Integration: Drop-in tracing_subscriber::Layer compatible with existing subscribers

Installation

Add this to your Cargo.toml:

[dependencies]
tracing-throttle = "0.1.1"
tracing = "0.1.41"
tracing-subscriber = "0.3.20"

Quick Start

use tracing_throttle::TracingRateLimitLayer;
use tracing_subscriber::prelude::*;

// Create a rate limit filter with safe defaults:
// 100 events per signature, 10k max signatures with LRU eviction, and a 30-second summary interval.
let rate_limit = TracingRateLimitLayer::new();

// Add it as a filter to your fmt layer
tracing_subscriber::registry()
    .with(tracing_subscriber::fmt::layer().with_filter(rate_limit))
    .init();

// Now your logs are rate limited!
for i in 0..1000 {
    tracing::info!("Processing item {}", i);
}
// Only the first 100 will be emitted

Observability & Metrics

Monitor rate limiting behavior with built-in metrics:

use tracing_throttle::{TracingRateLimitLayer, Policy};

let rate_limit = TracingRateLimitLayer::builder()
    .with_policy(Policy::count_based(100).expect("valid policy"))
    .build()
    .expect("valid config");

// ... after some log events have been processed ...

// Get current metrics
let metrics = rate_limit.metrics();
println!("Events allowed: {}", metrics.events_allowed());
println!("Events suppressed: {}", metrics.events_suppressed());
println!("Signatures evicted: {}", metrics.signatures_evicted());

// Or get a snapshot for calculations
let snapshot = metrics.snapshot();
println!("Total events: {}", snapshot.total_events());
println!("Suppression rate: {:.2}%", snapshot.suppression_rate() * 100.0);

// Check how many unique signatures are being tracked
println!("Tracked signatures: {}", rate_limit.signature_count());

Available Metrics:

  • events_allowed() - Total events allowed through
  • events_suppressed() - Total events suppressed
  • signatures_evicted() - Signatures removed due to LRU eviction
  • signature_count() - Current number of tracked signatures
  • suppression_rate() - Ratio of suppressed to total events (0.0 - 1.0)

Use Cases:

  • Monitor suppression rates in production dashboards
  • Alert when suppression rate exceeds threshold
  • Track signature cardinality growth
  • Observe LRU eviction frequency
  • Validate rate limiting effectiveness
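
For example, these metrics can drive a simple alerting check. A minimal sketch reusing the rate_limit handle and the accessors shown above (the 50% threshold is arbitrary):

// Periodically inspect a snapshot and warn when suppression climbs too high.
let snapshot = rate_limit.metrics().snapshot();

if snapshot.suppression_rate() > 0.5 {
    tracing::warn!(
        "high log suppression: {:.1}% of events suppressed, {} signatures tracked, {} evicted",
        snapshot.suppression_rate() * 100.0,
        rate_limit.signature_count(),
        rate_limit.metrics().signatures_evicted()
    );
}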

Fail-Safe Operation

tracing-throttle uses a circuit breaker pattern to prevent cascading failures. If rate limiting operations fail (e.g., panics or internal errors), the library fails open to preserve observability:

  • Closed: Normal operation, rate limiting active
  • Open: After threshold failures (default: 5), fails open and allows all events
  • HalfOpen: After the recovery timeout (default: 30s), tests whether the system has recovered
  • Fail-Open Strategy: Preserves observability over strict rate limiting

This ensures your logs remain visible during system instability, preventing silent data loss.
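
The states above follow the standard circuit breaker pattern. A minimal, self-contained sketch of the fail-open decision logic (illustrative only; the crate's internal types and names will differ):

use std::time::{Duration, Instant};

enum BreakerState {
    Closed { failures: u32 },
    Open { since: Instant },
    HalfOpen,
}

struct Breaker {
    state: BreakerState,
    failure_threshold: u32,     // described default: 5
    recovery_timeout: Duration, // described default: 30s
}

impl Breaker {
    // Returns true if rate limiting should run, false to fail open and allow every event.
    fn should_rate_limit(&mut self) -> bool {
        match self.state {
            BreakerState::Closed { .. } | BreakerState::HalfOpen => true,
            BreakerState::Open { since } => {
                if since.elapsed() >= self.recovery_timeout {
                    // Recovery timeout elapsed: probe whether rate limiting works again.
                    self.state = BreakerState::HalfOpen;
                    true
                } else {
                    false // fail open: preserve observability over strict rate limiting
                }
            }
        }
    }

    // A rate limiting failure (panic, internal error) counts toward opening the breaker.
    fn record_failure(&mut self) {
        let failures = match self.state {
            BreakerState::Closed { failures } => failures + 1,
            _ => self.failure_threshold, // a failure while open or probing trips it again
        };
        self.state = if failures >= self.failure_threshold {
            BreakerState::Open { since: Instant::now() }
        } else {
            BreakerState::Closed { failures }
        };
    }

    // A successful operation closes the breaker and resets the failure count.
    fn record_success(&mut self) {
        self.state = BreakerState::Closed { failures: 0 };
    }
}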

Rate Limiting Policies

Count-Based Policy

Allow N events, then suppress all subsequent occurrences:

use tracing_throttle::Policy;

let policy = Policy::count_based(50).expect("max_count must be > 0");
// Allows first 50 events, suppresses the rest

Time-Window Policy

Allow K events within a sliding time window:

use std::time::Duration;
use tracing_throttle::Policy;

let policy = Policy::time_window(10, Duration::from_secs(60))
    .expect("max_events and window must be > 0");
// Allows 10 events per minute

Exponential Backoff Policy

Emit events at exponentially increasing intervals (1st, 2nd, 4th, 8th, 16th, ...):

use tracing_throttle::Policy;

let policy = Policy::exponential_backoff();
// Useful for extremely noisy logs
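
The emission pattern itself is easy to see in isolation: an occurrence is emitted when its 1-based index is a power of two. A standalone illustration (not the crate's internals):

// An occurrence is emitted when its 1-based index is a power of two.
fn is_emitted(occurrence: u64) -> bool {
    occurrence.is_power_of_two()
}

// is_emitted(1), is_emitted(2), is_emitted(4), is_emitted(8), ... => true  (emitted)
// is_emitted(3), is_emitted(5), ..., is_emitted(100)              => false (suppressed)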

Custom Policies

Implement the RateLimitPolicy trait for custom behavior:

use tracing_throttle::{PolicyDecision, RateLimitPolicy};
use std::time::Instant;

struct MyCustomPolicy;

impl RateLimitPolicy for MyCustomPolicy {
    fn register_event(&mut self, timestamp: Instant) -> PolicyDecision {
        // Your custom logic here: decide whether this occurrence is allowed or suppressed
        todo!()
    }

    fn reset(&mut self) {
        // Reset policy state
    }
}

Memory Management

By default, the layer tracks up to 10,000 unique event signatures with LRU eviction. Each signature uses approximately 150-250 bytes.

Typical memory usage:

  • 10,000 signatures (default): ~1.5-2.5 MB
  • 50,000 signatures: ~7.5-12.5 MB
  • 100,000 signatures: ~15-25 MB

// Increase limit for high-cardinality applications
let rate_limit = TracingRateLimitLayer::builder()
    .with_max_signatures(50_000)
    .build()
    .expect("valid config");

// Monitor usage in production
let sig_count = rate_limit.signature_count();
let evictions = rate_limit.metrics().signatures_evicted();

📖 See detailed memory documentation for:

  • Memory breakdown and overhead calculations
  • Signature cardinality analysis and estimation
  • Configuration guidelines for different use cases
  • Production monitoring and profiling techniques

Performance

See BENCHMARKS.md for detailed measurements and methodology.

Run benchmarks yourself:

cargo bench --bench rate_limiting

Examples

Run the included examples:

# Basic count-based rate limiting
cargo run --example basic

# Demonstrate different policies
cargo run --example policies

Roadmap

v0.1.0 (MVP)

Completed:

  • Domain policies (count-based, time-window, exponential backoff)
  • Basic registry and rate limiter
  • tracing::Layer implementation
  • LRU eviction with a configurable maximum signature limit to bound memory
  • Input validation (non-zero limits, durations, reasonable max_events)
  • Observability metrics (events allowed/suppressed, eviction tracking, signature count)
  • Circuit breaker for fail-safe operation
  • Memory usage documentation
  • Comprehensive integration tests
  • Mock infrastructure with test-helpers feature
  • Extensive test coverage (unit, integration, and doc tests)
  • Performance benchmarks (20M ops/sec)
  • Hexagonal architecture (clean ports & adapters)
  • CI/CD workflows (test, lint, publish)

v0.1.1 (Production Hardening)

Completed:

  • Graceful shutdown for async emitter with EmitterHandle
  • Comprehensive shutdown testing (12 dedicated shutdown tests)
  • Explicit shutdown requirement (no Drop implementation to prevent race conditions)
  • Final emission support on shutdown
  • Panic safety for emit functions with proper resource cleanup
  • Structured error handling with ShutdownError enum
  • Shutdown timeout support (default 10s, customizable)
  • Improved error handling and diagnostics
  • Biased shutdown signal prioritization for fast shutdown
  • Cancellation safety documentation

v0.2.0 (Enhanced Observability)

  • Active suppression summary emission
  • Metrics adapters (Prometheus/OTel)
  • Configurable summary formatting
  • Memory usage telemetry

v0.3.0 (Advanced Features)

  • Pluggable storage backends (Redis, etc.)
  • Streaming-friendly summaries
  • Rate limit by span context
  • Advanced eviction policies

v1.0.0 (Stable Release)

  • Stable API guarantee
  • Production-ready documentation
  • Optional WASM support
  • Performance regression testing

Contributing

Contributions are welcome! Please open issues or pull requests on GitHub.

License

Licensed under the MIT License.