tracing-throttle

High-performance log deduplication and rate limiting for the Rust tracing ecosystem.

Overview

tracing-throttle suppresses repetitive or bursty log events in high-volume systems. It helps you:

  • Reduce I/O bandwidth from repetitive logs
  • Improve log visibility by filtering noise
  • Lower storage costs for log aggregation
  • Prevent log backend overload during traffic spikes

The crate provides a tracing_subscriber::Layer that deduplicates events based on their signature (level, message, and structured fields) and applies configurable rate limiting policies.
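
For example, under the default policy the first two events below share one signature and are throttled together, while the third has a different message and is tracked separately (illustrative sketch):

use tracing::warn;

// Same level, message, and fields -> same signature, throttled as one group.
warn!(endpoint = "/api/users", "upstream timeout");
warn!(endpoint = "/api/users", "upstream timeout");

// Different message -> different signature, tracked independently.
warn!(endpoint = "/api/users", "connection refused");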

Features

  • 🚀 High Performance: Sharded maps and lock-free operations
  • 🎯 Flexible Policies: Count-based, time-window, exponential backoff, and custom policies
  • 📊 Per-signature Throttling: Events with identical signatures are throttled together
  • 💾 Memory Control: Optional LRU eviction to prevent unbounded memory growth
  • 📈 Observability Metrics: Built-in tracking of allowed, suppressed, and evicted events
  • 🛡️ Fail-Safe Circuit Breaker: Fails open to preserve observability during errors
  • ⏱️ Suppression Summaries: Periodic emission of suppression statistics (coming in v0.2)
  • 🔧 Easy Integration: Drop-in tracing_subscriber::Layer compatible with existing subscribers

Installation

Add this to your Cargo.toml:

[dependencies]
tracing-throttle = "0.1.1"
tracing = "0.1.41"
tracing-subscriber = "0.3.20"

Quick Start

use tracing_throttle::TracingRateLimitLayer;
use tracing_subscriber::prelude::*;

// Create a rate limit filter with safe defaults:
// 100 events per signature, 10k max signatures with LRU eviction, and a 30-second summary interval.
let rate_limit = TracingRateLimitLayer::new();

// Add it as a filter to your fmt layer
tracing_subscriber::registry()
    .with(tracing_subscriber::fmt::layer().with_filter(rate_limit))
    .init();

// Now your logs are rate limited!
for i in 0..1000 {
    tracing::info!("Processing item {}", i);
}
// Only the first 100 will be emitted

Observability & Metrics

Monitor rate limiting behavior with built-in metrics:

use tracing_throttle::{TracingRateLimitLayer, Policy};

let rate_limit = TracingRateLimitLayer::builder()
    .with_policy(Policy::count_based(100).expect("valid policy"))
    .build()
    .expect("valid config");

// ... after some log events have been processed ...

// Get current metrics
let metrics = rate_limit.metrics();
println!("Events allowed: {}", metrics.events_allowed());
println!("Events suppressed: {}", metrics.events_suppressed());
println!("Signatures evicted: {}", metrics.signatures_evicted());

// Or get a snapshot for calculations
let snapshot = metrics.snapshot();
println!("Total events: {}", snapshot.total_events());
println!("Suppression rate: {:.2}%", snapshot.suppression_rate() * 100.0);

// Check how many unique signatures are being tracked
println!("Tracked signatures: {}", rate_limit.signature_count());

Available Metrics:

  • events_allowed() - Total events allowed through
  • events_suppressed() - Total events suppressed
  • signatures_evicted() - Signatures removed due to LRU eviction
  • signature_count() - Current number of tracked signatures
  • suppression_rate() - Ratio of suppressed to total events (0.0 - 1.0)

Use Cases:

  • Monitor suppression rates in production dashboards
  • Alert when suppression rate exceeds threshold
  • Track signature cardinality growth
  • Observe LRU eviction frequency
  • Validate rate limiting effectiveness
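
For example, these metrics can drive a simple alerting check. A minimal sketch reusing the rate_limit handle and the accessors shown above (the 50% threshold is arbitrary):

// Periodically inspect a snapshot and warn when suppression climbs too high.
let snapshot = rate_limit.metrics().snapshot();

if snapshot.suppression_rate() > 0.5 {
    tracing::warn!(
        "high log suppression: {:.1}% of events suppressed, {} signatures tracked, {} evicted",
        snapshot.suppression_rate() * 100.0,
        rate_limit.signature_count(),
        rate_limit.metrics().signatures_evicted()
    );
}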

Fail-Safe Operation

tracing-throttle uses a circuit breaker pattern to prevent cascading failures. If rate limiting operations fail (e.g., panics or internal errors), the library fails open to preserve observability:

  • Closed: Normal operation, rate limiting active
  • Open: After threshold failures (default: 5), fails open and allows all events
  • HalfOpen: After the recovery timeout (default: 30s), tests whether the system has recovered
  • Fail-Open Strategy: Preserves observability over strict rate limiting

This ensures your logs remain visible during system instability, preventing silent data loss.
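
The states above follow the standard circuit breaker pattern. A minimal, self-contained sketch of the fail-open decision logic (illustrative only; the crate's internal types and names will differ):

use std::time::{Duration, Instant};

enum BreakerState {
    Closed { failures: u32 },
    Open { since: Instant },
    HalfOpen,
}

struct Breaker {
    state: BreakerState,
    failure_threshold: u32,     // described default: 5
    recovery_timeout: Duration, // described default: 30s
}

impl Breaker {
    // Returns true if rate limiting should run, false to fail open and allow every event.
    fn should_rate_limit(&mut self) -> bool {
        match self.state {
            BreakerState::Closed { .. } | BreakerState::HalfOpen => true,
            BreakerState::Open { since } => {
                if since.elapsed() >= self.recovery_timeout {
                    // Recovery timeout elapsed: probe whether rate limiting works again.
                    self.state = BreakerState::HalfOpen;
                    true
                } else {
                    false // fail open: preserve observability over strict rate limiting
                }
            }
        }
    }

    // A rate limiting failure (panic, internal error) counts toward opening the breaker.
    fn record_failure(&mut self) {
        let failures = match self.state {
            BreakerState::Closed { failures } => failures + 1,
            _ => self.failure_threshold, // a failure while open or probing trips it again
        };
        self.state = if failures >= self.failure_threshold {
            BreakerState::Open { since: Instant::now() }
        } else {
            BreakerState::Closed { failures }
        };
    }

    // A successful operation closes the breaker and resets the failure count.
    fn record_success(&mut self) {
        self.state = BreakerState::Closed { failures: 0 };
    }
}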

Rate Limiting Policies

Count-Based Policy

Allow N events, then suppress all subsequent occurrences:

use tracing_throttle::Policy;

let policy = Policy::count_based(50).expect("max_count must be > 0");
// Allows first 50 events, suppresses the rest

Time-Window Policy

Allow K events within a sliding time window:

use std::time::Duration;
use tracing_throttle::Policy;

let policy = Policy::time_window(10, Duration::from_secs(60))
    .expect("max_events and window must be > 0");
// Allows 10 events per minute

Exponential Backoff Policy

Emit events at exponentially increasing intervals (1st, 2nd, 4th, 8th, 16th, ...):

use tracing_throttle::Policy;

let policy = Policy::exponential_backoff();
// Useful for extremely noisy logs
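
The emission pattern itself is easy to see in isolation: an occurrence is emitted when its 1-based index is a power of two. A standalone illustration (not the crate's internals):

// An occurrence is emitted when its 1-based index is a power of two.
fn is_emitted(occurrence: u64) -> bool {
    occurrence.is_power_of_two()
}

// is_emitted(1), is_emitted(2), is_emitted(4), is_emitted(8), ... => true  (emitted)
// is_emitted(3), is_emitted(5), ..., is_emitted(100)              => false (suppressed)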

Custom Policies

Implement the RateLimitPolicy trait for custom behavior:

use tracing_throttle::{PolicyDecision, RateLimitPolicy};
use std::time::Instant;

struct MyCustomPolicy;

impl RateLimitPolicy for MyCustomPolicy {
    fn register_event(&mut self, timestamp: Instant) -> PolicyDecision {
        // Your custom logic here: decide whether this occurrence is allowed or suppressed
        todo!()
    }

    fn reset(&mut self) {
        // Reset policy state
    }
}

Memory Management

By default, the layer tracks up to 10,000 unique event signatures with LRU eviction. Each signature uses approximately 150-250 bytes.

Typical memory usage:

  • 10,000 signatures (default): ~1.5-2.5 MB
  • 50,000 signatures: ~7.5-12.5 MB
  • 100,000 signatures: ~15-25 MB

// Increase limit for high-cardinality applications
let rate_limit = TracingRateLimitLayer::builder()
    .with_max_signatures(50_000)
    .build()
    .expect("valid config");

// Monitor usage in production
let sig_count = rate_limit.signature_count();
let evictions = rate_limit.metrics().signatures_evicted();

📖 See detailed memory documentation for:

  • Memory breakdown and overhead calculations
  • Signature cardinality analysis and estimation
  • Configuration guidelines for different use cases
  • Production monitoring and profiling techniques

Performance

See BENCHMARKS.md for detailed measurements and methodology.

Run benchmarks yourself:

cargo bench --bench rate_limiting

Examples

Run the included examples:

# Basic count-based rate limiting
cargo run --example basic

# Demonstrate different policies
cargo run --example policies

Roadmap

v0.1.0 (MVP)

Completed:

  • Domain policies (count-based, time-window, exponential backoff)
  • Basic registry and rate limiter
  • tracing::Layer implementation
  • LRU eviction with a configurable maximum signature limit to bound memory
  • Input validation (non-zero limits, durations, reasonable max_events)
  • Observability metrics (events allowed/suppressed, eviction tracking, signature count)
  • Circuit breaker for fail-safe operation
  • Memory usage documentation
  • Comprehensive integration tests
  • Mock infrastructure with test-helpers feature
  • Extensive test coverage (unit, integration, and doc tests)
  • Performance benchmarks (20M ops/sec)
  • Hexagonal architecture (clean ports & adapters)
  • CI/CD workflows (test, lint, publish)

v0.1.1 (Production Hardening)

Completed:

  • Graceful shutdown for async emitter with EmitterHandle
  • Comprehensive shutdown testing (12 dedicated shutdown tests)
  • Explicit shutdown requirement (no Drop implementation to prevent race conditions)
  • Final emission support on shutdown
  • Panic safety for emit functions with proper resource cleanup
  • Structured error handling with ShutdownError enum
  • Shutdown timeout support (default 10s, customizable)
  • Improved error handling and diagnostics
  • Biased shutdown signal prioritization for fast shutdown
  • Cancellation safety documentation

v0.2.0 (Enhanced Observability)

  • Active suppression summary emission
  • Metrics adapters (Prometheus/OTel)
  • Configurable summary formatting
  • Memory usage telemetry

v0.3.0 (Advanced Features)

  • Pluggable storage backends (Redis, etc.)
  • Streaming-friendly summaries
  • Rate limit by span context
  • Advanced eviction policies

v1.0.0 (Stable Release)

  • Stable API guarantee
  • Production-ready documentation
  • Optional WASM support
  • Performance regression testing

Contributing

Contributions are welcome! Please open issues or pull requests on GitHub.

License

Licensed under the MIT License.