Crate tracing_throttle


§tracing-throttle

High-performance log deduplication and rate limiting for the tracing ecosystem.

This crate provides a tracing_subscriber::Layer that suppresses repetitive log events based on configurable policies. Events are deduplicated by their signature (level, target, and message). Event field values are NOT included in signatures by default; use .with_event_fields() to include specific fields.

§Quick Start

use tracing_throttle::{TracingRateLimitLayer, Policy};
use tracing_subscriber::prelude::*;
use std::time::Duration;

// Use sensible defaults: 50 burst capacity, 1 token/sec (60/min), 10k signature limit
let rate_limit = TracingRateLimitLayer::new();

// Or customize for high-volume applications:
let rate_limit = TracingRateLimitLayer::builder()
    .with_policy(Policy::token_bucket(100.0, 10.0).unwrap())  // 100 burst, 600/min
    .with_max_signatures(50_000)  // Custom limit
    .with_summary_interval(Duration::from_secs(30))
    .build()
    .unwrap();

// Apply the rate limit as a filter to your fmt layer
tracing_subscriber::registry()
    .with(tracing_subscriber::fmt::layer().with_filter(rate_limit))
    .init();
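
To make the "burst capacity" and "tokens per second" parameters concrete, here is a minimal token bucket sketch using only the standard library. It is an illustration of the general technique, not the crate's internal implementation; the type `SketchTokenBucket` and its methods are hypothetical names, and time is passed in explicitly so the behavior is deterministic.

```rust
/// Illustrative token bucket (hypothetical type, not part of tracing_throttle).
#[derive(Debug)]
pub struct SketchTokenBucket {
    capacity: f64,       // maximum burst size
    refill_per_sec: f64, // tokens credited per second
    tokens: f64,         // tokens currently available
}

impl SketchTokenBucket {
    /// Starts full, so an initial burst of `capacity` events passes.
    pub fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, refill_per_sec, tokens: capacity }
    }

    /// Credits tokens for `elapsed_secs` of elapsed time, capped at capacity.
    pub fn refill(&mut self, elapsed_secs: f64) {
        let credited = self.tokens + elapsed_secs * self.refill_per_sec;
        self.tokens = credited.min(self.capacity);
    }

    /// Spends one token and returns true if the event may be emitted.
    pub fn try_acquire(&mut self) -> bool {
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

With a capacity of 3 and a refill rate of 1 token per second, three events pass immediately, the fourth is suppressed, and two more pass after two seconds of refill.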

§Features

§Rate Limiting Policies

  • Token bucket limiting: Burst tolerance with smooth recovery (recommended default)
  • Time-window limiting: Allow K events per time period with natural reset
  • Count-based limiting: Allow N events, then suppress the rest (no recovery)
  • Exponential backoff: Emit at exponentially increasing intervals (1st, 2nd, 4th, 8th…)
  • Custom policies: Implement your own rate limiting logic
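
The differences between the simpler policies can be sketched in a few lines. The code below is a hedged illustration of the ideas only: the function and type names are hypothetical, occurrences are counted from 1, and the time window shown is a fixed window (the crate may well use a different windowing scheme).

```rust
/// Count-based: allow the first `max` occurrences, suppress the rest.
pub fn count_based_allows(occurrence: u64, max: u64) -> bool {
    occurrence >= 1 && occurrence <= max
}

/// Exponential backoff: allow occurrences 1, 2, 4, 8, ...
pub fn backoff_allows(occurrence: u64) -> bool {
    occurrence.is_power_of_two()
}

/// Fixed time window: allow up to `max_events` per `window_secs`.
pub struct SketchTimeWindow {
    max_events: u32,
    window_secs: u64,
    window_start: u64,
    seen: u32,
}

impl SketchTimeWindow {
    pub fn new(max_events: u32, window_secs: u64) -> Self {
        Self { max_events, window_secs, window_start: 0, seen: 0 }
    }

    /// `now_secs` is supplied by the caller to keep the sketch deterministic.
    pub fn allows(&mut self, now_secs: u64) -> bool {
        // Start a fresh window once the current one has expired.
        if now_secs.saturating_sub(self.window_start) >= self.window_secs {
            self.window_start = now_secs;
            self.seen = 0;
        }
        if self.seen < self.max_events {
            self.seen += 1;
            true
        } else {
            false
        }
    }
}
```

Count-based limiting never recovers, backoff emits ever more rarely, and the time window resets its budget whenever a new window begins.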

§Eviction Strategies

  • LRU eviction: Evict least recently used signatures (default)
  • Priority-based: Custom priority functions to keep important events (ERROR over INFO)
  • Memory-based: Enforce byte limits with automatic memory tracking
  • Combined: Use both priority and memory constraints together

§Other Features

  • Per-signature throttling: Different messages are throttled independently
  • Observability metrics: Built-in tracking of allowed, suppressed, and evicted events
  • Fail-safe circuit breaker: Fails open during errors to preserve observability

§Event Signatures

Events are deduplicated based on their signature. By default, signatures include:

  • Event level (INFO, WARN, ERROR, etc.)
  • Target (module path)
  • Message text

Event field VALUES are NOT included by default. This means:

info!(user_id = 1, "Login");  // Signature: (INFO, target, "Login")
info!(user_id = 2, "Login");  // SAME signature - will be rate limited together!

To rate-limit events per field value, use .with_event_fields():

let layer = TracingRateLimitLayer::builder()
    .with_event_fields(vec!["user_id".to_string()])  // Include user_id in signature
    .build()
    .unwrap();

Now each user_id gets its own rate limit:

info!(user_id = 1, "Login");  // Signature: (INFO, target, "Login", user_id=1)
info!(user_id = 2, "Login");  // Signature: (INFO, target, "Login", user_id=2)

See tests/event_fields.rs for complete examples.
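
As a rough mental model of how a signature could be derived, the sketch below hashes level, target, and message, plus any explicitly selected fields. It is an assumption-laden illustration: the function `sketch_signature` is hypothetical, and the crate's actual hashing scheme and field handling may differ.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical signature function: hashes (level, target, message) and
/// any fields the caller has opted in to. Pass an empty slice to mimic
/// the default behavior where field values are ignored.
pub fn sketch_signature(
    level: &str,
    target: &str,
    message: &str,
    included_fields: &[(&str, &str)],
) -> u64 {
    let mut hasher = DefaultHasher::new();
    level.hash(&mut hasher);
    target.hash(&mut hasher);
    message.hash(&mut hasher);
    // Sort so that field order does not change the signature.
    let mut sorted = included_fields.to_vec();
    sorted.sort();
    sorted.hash(&mut hasher);
    hasher.finish()
}
```

Two "Login" events with no included fields hash to the same value, while including `user_id` separates them per user.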

§Observability

Monitor rate limiting behavior with built-in metrics:

// Get current metrics
let metrics = rate_limit.metrics();
println!("Events allowed: {}", metrics.events_allowed());
println!("Events suppressed: {}", metrics.events_suppressed());
println!("Signatures evicted: {}", metrics.signatures_evicted());

// Get snapshot for calculations
let snapshot = metrics.snapshot();
println!("Suppression rate: {:.2}%", snapshot.suppression_rate() * 100.0);
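
The suppression rate printed above is multiplied by 100, which suggests a fraction between 0 and 1. A minimal sketch of such a calculation follows, assuming the rate is suppressed / (allowed + suppressed); the `SketchCounters` type is hypothetical and only mirrors the idea of atomic counters, not the crate's Metrics type.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters, shown only to illustrate the arithmetic.
#[derive(Default)]
pub struct SketchCounters {
    allowed: AtomicU64,
    suppressed: AtomicU64,
}

impl SketchCounters {
    /// Records one event as either allowed or suppressed.
    pub fn record(&self, was_allowed: bool) {
        let counter = if was_allowed { &self.allowed } else { &self.suppressed };
        counter.fetch_add(1, Ordering::Relaxed);
    }

    /// Fraction of events suppressed, or 0.0 when nothing was recorded.
    pub fn suppression_rate(&self) -> f64 {
        let allowed = self.allowed.load(Ordering::Relaxed);
        let suppressed = self.suppressed.load(Ordering::Relaxed);
        let total = allowed + suppressed;
        if total == 0 {
            0.0
        } else {
            suppressed as f64 / total as f64
        }
    }
}
```

Guarding the zero-event case avoids a NaN result before any event has been seen.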

§Eviction Strategies

Control which event signatures are kept when storage limits are reached:

§LRU (Default)

let layer = TracingRateLimitLayer::builder()
    .with_max_signatures(10_000)  // Uses LRU eviction by default
    .build()
    .unwrap();

§Priority-Based

Keep important events (ERROR) over less important ones (INFO):

use std::sync::Arc;
use tracing_throttle::{EvictionStrategy, TracingRateLimitLayer};

let layer = TracingRateLimitLayer::builder()
    .with_max_signatures(5_000)
    // Higher return values mean the signature is kept longer
    .with_eviction_strategy(EvictionStrategy::Priority(Arc::new(|_sig, state| {
        match state.metadata.as_ref().map(|m| m.level.as_str()) {
            Some("ERROR") => 100,
            Some("WARN") => 50,
            Some("INFO") => 10,
            _ => 5,
        }
    })))
    .build()
    .unwrap();
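
To show what a priority function implies when the store is full, here is a small sketch of victim selection: the entry with the lowest priority is evicted first, with ties broken by least recent use. The types and the tie-breaking rule are assumptions made for illustration and are not taken from the crate.

```rust
use std::collections::HashMap;

/// Hypothetical per-signature record.
pub struct SketchEntry {
    pub level: String,
    pub last_seen_secs: u64,
}

/// Same ranking as the priority closure above.
pub fn priority_for_level(level: &str) -> u32 {
    match level {
        "ERROR" => 100,
        "WARN" => 50,
        "INFO" => 10,
        _ => 5,
    }
}

/// Picks the signature to evict: lowest priority first, oldest on ties.
pub fn pick_victim(entries: &HashMap<u64, SketchEntry>) -> Option<u64> {
    entries
        .iter()
        .min_by_key(|(_, e)| (priority_for_level(&e.level), e.last_seen_secs))
        .map(|(sig, _)| *sig)
}
```

With one ERROR entry and two INFO entries, the older INFO entry is chosen, so ERROR signatures survive pressure from low-priority noise.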

§Memory-Based

Enforce memory limits with automatic tracking:

let layer = TracingRateLimitLayer::builder()
    .with_eviction_strategy(EvictionStrategy::Memory {
        max_bytes: 5 * 1024 * 1024,  // 5MB limit
    })
    .build()
    .unwrap();

§Combined

Use both priority and memory constraints:

let layer = TracingRateLimitLayer::builder()
    .with_eviction_strategy(EvictionStrategy::PriorityWithMemory {
        priority_fn: Arc::new(|_sig, state| {
            match state.metadata.as_ref().map(|m| m.level.as_str()) {
                Some("ERROR") => 100,
                _ => 10,
            }
        }),
        max_bytes: 10 * 1024 * 1024,
    })
    .build()
    .unwrap();

See examples/eviction.rs for complete working examples.

§Fail-Safe Operation

The library uses a circuit breaker to fail open during errors, preserving observability over strict rate limiting:

use tracing_throttle::CircuitState;

// Check circuit breaker state
let cb = rate_limit.circuit_breaker();
match cb.state() {
    CircuitState::Closed => println!("Normal operation"),
    CircuitState::Open => println!("Failing open - allowing all events"),
    CircuitState::HalfOpen => println!("Testing recovery"),
}
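
The three states follow the usual circuit breaker pattern. The sketch below shows one common way the transitions work; the failure threshold, the recovery timeout, and every name in it are assumptions for illustration, since this page does not describe the crate's actual transition rules.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchState {
    Closed,
    Open,
    HalfOpen,
}

/// Hypothetical breaker: opens after `failure_threshold` failures and
/// probes for recovery after `recovery_secs`.
pub struct SketchBreaker {
    state: SketchState,
    failures: u32,
    failure_threshold: u32,
    opened_at_secs: u64,
    recovery_secs: u64,
}

impl SketchBreaker {
    pub fn new(failure_threshold: u32, recovery_secs: u64) -> Self {
        Self {
            state: SketchState::Closed,
            failures: 0,
            failure_threshold,
            opened_at_secs: 0,
            recovery_secs,
        }
    }

    pub fn state(&self) -> SketchState {
        self.state
    }

    pub fn record_failure(&mut self, now_secs: u64) {
        match self.state {
            SketchState::Closed => {
                self.failures += 1;
                if self.failures >= self.failure_threshold {
                    self.trip(now_secs);
                }
            }
            // A failed probe sends the breaker straight back to Open.
            SketchState::HalfOpen => self.trip(now_secs),
            SketchState::Open => {}
        }
    }

    pub fn record_success(&mut self) {
        if self.state != SketchState::Open {
            self.state = SketchState::Closed;
            self.failures = 0;
        }
    }

    /// Moves Open to HalfOpen once the recovery timeout has elapsed.
    pub fn poll(&mut self, now_secs: u64) {
        let waited = now_secs.saturating_sub(self.opened_at_secs);
        if self.state == SketchState::Open && waited >= self.recovery_secs {
            self.state = SketchState::HalfOpen;
        }
    }

    /// Fail open: while Open, every event bypasses the rate limiter.
    pub fn bypass_limiter(&self) -> bool {
        self.state == SketchState::Open
    }

    fn trip(&mut self, now_secs: u64) {
        self.state = SketchState::Open;
        self.opened_at_secs = now_secs;
    }
}
```

The key design point is `bypass_limiter`: when the limiter itself is unhealthy, events are let through rather than dropped.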

§Memory Management

By default, the layer tracks up to 10,000 unique event signatures with LRU eviction. Each signature uses approximately 200-400 bytes, which includes the event metadata kept for summaries.

Typical memory usage:

  • 10,000 signatures (default): ~2-4 MB
  • 50,000 signatures: ~10-20 MB
  • 100,000 signatures: ~20-40 MB

Configuration:

// Increase limit for high-cardinality applications
let rate_limit = TracingRateLimitLayer::builder()
    .with_max_signatures(50_000)
    .build()
    .unwrap();

// Monitor usage
let sig_count = rate_limit.signature_count();
let evictions = rate_limit.metrics().signatures_evicted();

§Memory Usage Breakdown

Each tracked signature consumes memory for:

Per-Signature Memory:
├─ EventSignature (hash key)      ~32 bytes  (u64 hash)
├─ EventState (value)              ~170-370 bytes
│  ├─ Policy state                 ~40-80 bytes (depends on policy type)
│  ├─ SuppressionCounter           ~40 bytes (atomic counters + timestamp)
│  ├─ EventMetadata (Optional)     ~50-200 bytes (level, message, target, fields)
│  │  ├─ Level string              ~8 bytes
│  │  ├─ Message string            ~20-100 bytes (depends on message length)
│  │  ├─ Target string             ~20-50 bytes (module path)
│  │  └─ Fields (BTreeMap)         ~0-50 bytes (depends on field count)
│  └─ Metadata overhead            ~40 bytes (DashMap internals)
└─ Total per signature             ~200-400 bytes (varies with policy & message length)

Estimated memory usage at different signature limits:

| Signatures | Memory (typical) | Memory (worst case) | Use Case |
|---|---|---|---|
| 1,000 | ~200 KB | ~400 KB | Small apps, few event types |
| 10,000 (default) | ~2 MB | ~4 MB | Most applications |
| 50,000 | ~10 MB | ~20 MB | High-cardinality apps |
| 100,000 | ~20 MB | ~40 MB | Very large systems |

Additional overhead:

  • Metrics: ~100 bytes (atomic counters)
  • Circuit breaker: ~200 bytes (state tracking)
  • Layer structure: ~500 bytes
  • Total fixed overhead: ~800 bytes
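
The estimates in this section reduce to simple arithmetic: signatures times 200-400 bytes, plus roughly 800 bytes of fixed overhead. A small helper (a hypothetical function, not part of the crate) makes it easy to plug in your own limit:

```rust
/// Approximate per-signature cost range from the breakdown above.
pub const BYTES_PER_SIGNATURE_TYPICAL: u64 = 200;
pub const BYTES_PER_SIGNATURE_WORST: u64 = 400;
/// Metrics + circuit breaker + layer structure.
pub const FIXED_OVERHEAD_BYTES: u64 = 800;

/// Returns (typical, worst-case) estimated bytes for a signature limit.
pub fn estimate_memory_bytes(max_signatures: u64) -> (u64, u64) {
    (
        max_signatures * BYTES_PER_SIGNATURE_TYPICAL + FIXED_OVERHEAD_BYTES,
        max_signatures * BYTES_PER_SIGNATURE_WORST + FIXED_OVERHEAD_BYTES,
    )
}
```

For the default limit of 10,000 signatures this gives about 2.0 MB typical and 4.0 MB worst case, matching the table; the fixed overhead is negligible at that scale.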

§Signature Cardinality Analysis

What affects signature cardinality?

By default, signatures are computed from (level, target, message) only. Field values are NOT included unless configured with .with_event_fields().

// Low cardinality (good) - same signature for all occurrences
info!("User login successful");  // Always same signature
info!(user_id = 123, "User login");  // SAME signature (user_id not included by default)

// Medium cardinality - if you configure .with_event_fields(vec!["user_id".to_string()])
info!(user_id = %id, "User login");  // One signature per unique user_id

// High cardinality (danger) - if you configure .with_event_fields(vec!["request_id".to_string()])
info!(request_id = %uuid, "Processing");  // New signature every time!

Cardinality examples:

| Pattern | Config | Unique Signatures | Memory Impact |
|---|---|---|---|
| Static messages only | Default | ~10-100 | Minimal (~10 KB) |
| Messages with fields | Default (fields ignored) | ~10-100 | Minimal (~10 KB) |
| Stable IDs | .with_event_fields(["user_id"]) | ~1,000-10,000 | Low (1-2 MB) |
| Session IDs | .with_event_fields(["session_id"]) | ~10,000-100,000 | Medium (10-25 MB) |
| UUIDs | .with_event_fields(["request_id"]) | Unbounded | High risk |

How to estimate your cardinality:

  1. Count unique log templates in your codebase
  2. Multiply by field cardinality (unique values per field)
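  3. Example calculation:
    • 50 unique log messages (level and target are fixed per call site, so they add no multiplier)
    • Average 20 unique user IDs per message, with user_id included via .with_event_fields()
    • Estimated: 50 × 20 = 1,000 signatures (✓ well below the default limit of 10,000)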
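
The estimation steps above can be sketched as a small function. The names are hypothetical; the function simply multiplies the number of log templates by the cardinality of each field included in the signature.

```rust
/// Estimates unique signatures: templates times the number of unique
/// values of each field included via `.with_event_fields()`.
/// Saturates instead of overflowing for very large inputs.
pub fn estimate_signatures(templates: u64, field_cardinalities: &[u64]) -> u64 {
    field_cardinalities
        .iter()
        // A field with no observed values does not reduce the count.
        .fold(templates, |acc, &c| acc.saturating_mul(c.max(1)))
}

/// True when the estimate fits within the configured signature limit.
pub fn fits_limit(estimate: u64, max_signatures: u64) -> bool {
    estimate <= max_signatures
}
```

For example, `estimate_signatures(50, &[20])` yields 1,000, which fits comfortably within the default limit of 10,000, while adding a second field with 1,000 unique values would not.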

§Configuration Guidelines

When to use the default (10k signatures):

  • ✅ Most applications with structured logging
  • ✅ Log messages use stable identifiers (user_id, tenant_id, service_name)
  • ✅ You’re unsure about cardinality
  • ✅ Memory is not severely constrained

When to increase the limit:

let rate_limit = TracingRateLimitLayer::builder()
    .with_max_signatures(50_000)  // ~10-20 MB at 200-400 bytes per signature
    .build()
    .expect("valid config");

  • ✅ High log volume with many unique event types (>10k)
  • ✅ Large distributed system with many services/endpoints
  • ✅ You’ve measured cardinality and need more capacity
  • ✅ Memory is available (10+ MB is acceptable)

When to use unlimited signatures:

let rate_limit = TracingRateLimitLayer::builder()
    .with_unlimited_signatures()  // ⚠️ Unbounded memory growth
    .build()
    .expect("valid config");

  • ⚠️ Use with extreme caution - can cause unbounded memory growth
  • ✅ Controlled environments (short-lived processes, tests)
  • ✅ Known bounded cardinality with monitoring in place
  • ✅ Memory constraints are not a concern
  • ❌ Never use if logging includes UUIDs, timestamps, or other high-cardinality data

§Monitoring Memory Usage

Check signature count in production:

// In a periodic health check or metrics reporter:
let sig_count = rate_limit.signature_count();
let evictions = rate_limit.metrics().signatures_evicted();

if sig_count > 8000 {
    warn!("Approaching signature limit: {}/10000", sig_count);
}

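// signatures_evicted() reads as a running total, so this checks the
// cumulative count rather than a per-minute rate.
if evictions > 1000 {
    warn!("High eviction count: {} signatures evicted so far", evictions);
}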
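
A per-minute eviction rate can be derived by sampling the eviction counter periodically and dividing the difference by the elapsed time. The sketch below assumes the counter is cumulative and monotonically increasing; `SketchEvictionRate` is a hypothetical helper, not part of the crate.

```rust
/// Tracks the previous sample of a cumulative eviction counter.
pub struct SketchEvictionRate {
    last_count: u64,
    last_secs: u64,
}

impl SketchEvictionRate {
    pub fn new(initial_count: u64, now_secs: u64) -> Self {
        Self { last_count: initial_count, last_secs: now_secs }
    }

    /// Returns evictions per minute since the previous sample, or None
    /// if no time has elapsed (which would divide by zero).
    pub fn sample(&mut self, count: u64, now_secs: u64) -> Option<f64> {
        let elapsed = now_secs.checked_sub(self.last_secs).filter(|&d| d > 0)?;
        let delta = count.saturating_sub(self.last_count);
        self.last_count = count;
        self.last_secs = now_secs;
        Some(delta as f64 * 60.0 / elapsed as f64)
    }
}
```

In a health check you would feed it `rate_limit.metrics().signatures_evicted()` and a monotonic timestamp, then compare the result against a threshold such as 1,000 per minute.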

Integrate with memory profilers:

# Use Valgrind Massif for heap profiling
valgrind --tool=massif --massif-out-file=massif.out ./your-app

# Analyze with ms_print
ms_print massif.out

# Look for DashMap and EventState allocations

Signs you need to adjust signature limits:

| Symptom | Likely Cause | Action |
|---|---|---|
| High eviction rate (>1000/min) | Cardinality > limit | Increase max_signatures |
| Memory growth over time | Unbounded cardinality | Fix logging (remove UUIDs), add limit |
| Low signature count (<100) | Over-provisioned | Can reduce limit safely |
| Frequent evictions + suppression | Limit too low | Increase limit or reduce cardinality |

§Re-exports

pub use domain::policy::CountBasedPolicy;
pub use domain::policy::ExponentialBackoffPolicy;
pub use domain::policy::Policy;
pub use domain::policy::PolicyDecision;
pub use domain::policy::PolicyError;
pub use domain::policy::RateLimitPolicy;
pub use domain::policy::TimeWindowPolicy;
pub use domain::policy::TokenBucketPolicy;
pub use domain::signature::EventSignature;
pub use domain::summary::SuppressionCounter;
pub use domain::summary::SuppressionSummary;
pub use application::circuit_breaker::CircuitBreaker;
pub use application::circuit_breaker::CircuitBreakerConfig;
pub use application::circuit_breaker::CircuitState;
pub use application::emitter::EmitterConfigError;
pub use application::limiter::RateLimiter;
pub use application::metrics::Metrics;
pub use application::metrics::MetricsSnapshot;
pub use application::ports::Clock;
pub use application::ports::Storage;
pub use application::registry::SuppressionRegistry;
pub use application::emitter::EmitterHandle;
pub use application::emitter::ShutdownError;
pub use infrastructure::clock::SystemClock;
pub use infrastructure::eviction::EvictionStrategy;
pub use infrastructure::eviction::PriorityFn;
pub use infrastructure::layer::BuildError;
pub use infrastructure::layer::TracingRateLimitLayer;
pub use infrastructure::layer::TracingRateLimitLayerBuilder;
pub use infrastructure::storage::ShardedStorage;
pub use infrastructure::layer::SummaryFormatter;

§Modules

  • application: Application layer - orchestration of domain logic.
  • domain: Domain layer - pure business logic with no external dependencies.
  • infrastructure: Infrastructure layer - external adapters and integrations.