tracing_throttle/
lib.rs

//! # tracing-throttle
//!
//! High-performance log deduplication and rate limiting for the `tracing` ecosystem.
//!
//! This crate provides a `tracing::Layer` that suppresses repetitive log events based on
//! configurable policies. Events are deduplicated by their signature (level, message, and
//! fields), so identical log events are throttled together.
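//!
//! To make the idea of a signature concrete, here is a minimal sketch that hashes the
//! `(level, message, fields)` triple with the standard library. It is not the crate's
//! actual `EventSignature` implementation: `sketch_signature` is a hypothetical helper,
//! and sorting the fields so that their order does not matter is an assumption of the sketch.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//!
//! /// Hypothetical helper: folds level, message, and fields into one `u64`.
//! fn sketch_signature(level: &str, message: &str, fields: &[(&str, &str)]) -> u64 {
//!     // Sort a copy of the fields so the hash ignores field order
//!     // (an assumption of this sketch, not documented crate behavior).
//!     let mut sorted: Vec<(&str, &str)> = fields.to_vec();
//!     sorted.sort();
//!
//!     let mut hasher = DefaultHasher::new();
//!     level.hash(&mut hasher);
//!     message.hash(&mut hasher);
//!     sorted.hash(&mut hasher);
//!     hasher.finish()
//! }
//! ```
//!
//! Under this sketch, two events with the same level, message, and field values share one
//! signature, while changing any field value (for example `user_id`) yields a different one.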
//!
//! ## Quick Start
//!
//! ```rust,no_run
//! use tracing_throttle::{TracingRateLimitLayer, Policy};
//! use tracing_subscriber::prelude::*;
//! use std::time::Duration;
//!
//! // Use sensible defaults: 100 events, 10k signature limit
//! let rate_limit = TracingRateLimitLayer::new();
//!
//! // Or customize:
//! let rate_limit = TracingRateLimitLayer::builder()
//!     .with_policy(Policy::count_based(100).unwrap())
//!     .with_max_signatures(50_000)  // Custom limit
//!     .with_summary_interval(Duration::from_secs(30))
//!     .build()
//!     .unwrap();
//!
//! // Apply the rate limit as a filter to your fmt layer
//! tracing_subscriber::registry()
//!     .with(tracing_subscriber::fmt::layer().with_filter(rate_limit))
//!     .init();
//! ```
//!
//! ## Features
//!
//! - **Count-based limiting**: Allow N events, then suppress the rest
//! - **Time-window limiting**: Allow K events per time period
//! - **Exponential backoff**: Emit at exponentially increasing intervals (1st, 2nd, 4th, 8th...)
//! - **Custom policies**: Implement your own rate limiting logic
//! - **Per-signature throttling**: Different messages are throttled independently
//! - **LRU eviction**: Optional memory limits with automatic eviction of least recently used signatures
//! - **Observability metrics**: Built-in tracking of allowed, suppressed, and evicted events
//! - **Fail-safe circuit breaker**: Fails open during errors to preserve observability
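//!
//! The count-based and exponential-backoff rules listed above can be sketched as two small
//! predicates. This only illustrates the decision logic and is not the crate's `Policy`
//! implementation: `count_based_allows` and `backoff_allows` are hypothetical names, and
//! `occurrence` is assumed to be a 1-based counter kept per signature.
//!
//! ```rust
//! /// Count-based: allow the first `limit` occurrences, suppress the rest.
//! fn count_based_allows(occurrence: u64, limit: u64) -> bool {
//!     occurrence <= limit
//! }
//!
//! /// Exponential backoff: allow the 1st, 2nd, 4th, 8th, ... occurrence,
//! /// i.e. exactly those whose 1-based index is a power of two.
//! fn backoff_allows(occurrence: u64) -> bool {
//!     occurrence.is_power_of_two()
//! }
//! ```
//!
//! With these rules, a signature seen ten times under backoff is emitted on occurrences
//! 1, 2, 4, and 8, and suppressed on the other six.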
//!
//! ## Observability
//!
//! Monitor rate limiting behavior with built-in metrics:
//!
//! ```rust,no_run
//! # use tracing_throttle::{TracingRateLimitLayer, Policy};
//! # let rate_limit = TracingRateLimitLayer::builder()
//! #     .with_policy(Policy::count_based(100).unwrap())
//! #     .build()
//! #     .unwrap();
//! // Get current metrics
//! let metrics = rate_limit.metrics();
//! println!("Events allowed: {}", metrics.events_allowed());
//! println!("Events suppressed: {}", metrics.events_suppressed());
//! println!("Signatures evicted: {}", metrics.signatures_evicted());
//!
//! // Get snapshot for calculations
//! let snapshot = metrics.snapshot();
//! println!("Suppression rate: {:.2}%", snapshot.suppression_rate() * 100.0);
//! ```
//!
//! ## Fail-Safe Operation
//!
//! The library uses a circuit breaker to fail open during errors, preserving
//! observability over strict rate limiting:
//!
//! ```rust,no_run
//! # use tracing_throttle::{TracingRateLimitLayer, CircuitState};
//! # let rate_limit = TracingRateLimitLayer::new();
//! // Check circuit breaker state
//! let cb = rate_limit.circuit_breaker();
//! match cb.state() {
//!     CircuitState::Closed => println!("Normal operation"),
//!     CircuitState::Open => println!("Failing open - allowing all events"),
//!     CircuitState::HalfOpen => println!("Testing recovery"),
//! }
//! ```
//!
//! ## Memory Management
//!
//! By default, the layer tracks up to 10,000 unique event signatures with LRU eviction.
//! Each signature uses approximately 150-250 bytes.
//!
//! **Typical memory usage:**
//! - 10,000 signatures (default): ~1.5-2.5 MB
//! - 50,000 signatures: ~7.5-12.5 MB
//! - 100,000 signatures: ~15-25 MB
//!
//! **Configuration:**
//!
//! ```rust,no_run
//! # use tracing_throttle::TracingRateLimitLayer;
//! // Increase limit for high-cardinality applications
//! let rate_limit = TracingRateLimitLayer::builder()
//!     .with_max_signatures(50_000)
//!     .build()
//!     .unwrap();
//!
//! // Monitor usage
//! let sig_count = rate_limit.signature_count();
//! let evictions = rate_limit.metrics().signatures_evicted();
//! ```
//!
//! ### Memory Usage Breakdown
//!
//! Each tracked signature consumes memory for:
//!
//! ```text
//! Per-Signature Memory:
//! ├─ EventSignature (hash key)      ~32 bytes  (u64 hash)
//! ├─ EventState (value)              ~120-200 bytes
//! │  ├─ Policy state                 ~40-80 bytes (depends on policy type)
//! │  ├─ SuppressionCounter           ~40 bytes (atomic counters + timestamp)
//! │  └─ Metadata overhead            ~40 bytes (DashMap internals)
//! └─ Total per signature             ~150-250 bytes (varies with policy)
//! ```
//!
//! **Estimated memory usage at different signature limits:**
//!
//! | Signatures       | Memory (typical) | Memory (worst case) | Use Case |
//! |------------------|------------------|---------------------|----------|
//! | 1,000            | ~150 KB          | ~250 KB             | Small apps, few event types |
//! | 10,000 (default) | ~1.5 MB          | ~2.5 MB             | Most applications |
//! | 50,000           | ~7.5 MB          | ~12.5 MB            | High-cardinality apps |
//! | 100,000          | ~15 MB           | ~25 MB              | Very large systems |
//!
//! **Additional overhead:**
//! - Metrics: ~100 bytes (atomic counters)
//! - Circuit breaker: ~200 bytes (state tracking)
//! - Layer structure: ~500 bytes
//! - **Total fixed overhead: ~800 bytes**
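//!
//! The figures above reduce to simple arithmetic: fixed overhead plus a per-signature cost.
//! The sketch below reproduces the table rows from the approximate numbers in this section.
//! `estimate_memory_bytes` and the constants are hypothetical names for illustration only,
//! and real usage will vary with the policy type and allocator.
//!
//! ```rust
//! // Approximate figures taken from the breakdown above.
//! const FIXED_OVERHEAD_BYTES: u64 = 800;
//! const BYTES_PER_SIGNATURE_TYPICAL: u64 = 150;
//! const BYTES_PER_SIGNATURE_WORST: u64 = 250;
//!
//! /// Returns `(typical, worst_case)` estimated bytes for `signatures` tracked entries.
//! fn estimate_memory_bytes(signatures: u64) -> (u64, u64) {
//!     (
//!         FIXED_OVERHEAD_BYTES + signatures * BYTES_PER_SIGNATURE_TYPICAL,
//!         FIXED_OVERHEAD_BYTES + signatures * BYTES_PER_SIGNATURE_WORST,
//!     )
//! }
//! ```
//!
//! For the default limit, `estimate_memory_bytes(10_000)` gives roughly 1.5 MB typical and
//! 2.5 MB worst case. The fixed overhead is negligible at that scale.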
//!
//! ### Signature Cardinality Analysis
//!
//! **What affects signature cardinality?**
//!
//! Event signatures are computed from `(level, message, fields)`. Your cardinality
//! depends on how many unique combinations you emit:
//!
//! ```rust,no_run
//! # use tracing::info;
//! // Low cardinality (good) - same signature for all occurrences
//! info!("User login successful");  // Always same signature
//!
//! // Medium cardinality - signatures vary by field values
//! # let id = 123;
//! info!(user_id = %id, "User login");  // One signature per unique user_id
//!
//! // High cardinality (danger) - unique signature per event
//! # let uuid = "abc";
//! info!(request_id = %uuid, "Processing");  // New signature every time!
//! ```
//!
//! **Cardinality examples:**
//!
//! | Pattern | Unique Signatures | Memory Impact |
//! |---------|-------------------|---------------|
//! | Static messages only | ~10-100 | Minimal (<25 KB) |
//! | Messages + stable IDs (user, tenant) | ~1,000-10,000 | Low (~0.15-2.5 MB) |
//! | Messages + session IDs | ~10,000-100,000 | Medium (~1.5-25 MB) |
//! | Messages + request UUIDs | Unbounded | **High risk** |
//!
//! **How to estimate your cardinality:**
//!
//! 1. **Count unique log templates** in your codebase
//! 2. **Multiply by field cardinality** (unique values per field)
//! 3. **Example calculation:**
//!    - 50 unique log messages
//!    - Each message logged at a single level (`tracing` defines five), so level adds no multiplier
//!    - Average 20 unique user IDs per message
//!    - **Estimated: 50 × 20 = 1,000 signatures** (✓ well below default)
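//!
//! The estimate above can be written as a short calculation. This is a back-of-the-envelope
//! sketch with hypothetical helper names (`estimated_signatures`, `fits_within`). It assumes
//! every template carries a single varying field with roughly the same number of distinct values.
//!
//! ```rust
//! /// Estimated signatures = log templates × average distinct field values per template.
//! fn estimated_signatures(templates: u64, avg_values_per_template: u64) -> u64 {
//!     templates.saturating_mul(avg_values_per_template)
//! }
//!
//! /// True when the estimate fits under the configured signature limit.
//! fn fits_within(estimate: u64, max_signatures: u64) -> bool {
//!     estimate <= max_signatures
//! }
//! ```
//!
//! For the example figures, `estimated_signatures(50, 20)` yields 1,000, which fits within
//! the default limit of 10,000.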
//!
//! ### Configuration Guidelines
//!
//! **When to use the default (10k signatures):**
//! - ✅ Most applications with structured logging
//! - ✅ Log messages use stable identifiers (user_id, tenant_id, service_name)
//! - ✅ You're unsure about cardinality
//! - ✅ Memory is not severely constrained
//!
//! **When to increase the limit:**
//!
//! ```rust,no_run
//! # use tracing_throttle::TracingRateLimitLayer;
//! let rate_limit = TracingRateLimitLayer::builder()
//!     .with_max_signatures(50_000)  // ~7.5-12.5 MB overhead
//!     .build()
//!     .expect("valid config");
//! ```
//!
//! - ✅ High log volume with many unique event types (>10k)
//! - ✅ Large distributed system with many services/endpoints
//! - ✅ You've measured cardinality and need more capacity
//! - ✅ Memory is available (10+ MB is acceptable)
//!
//! **When to use unlimited signatures:**
//!
//! ```rust,no_run
//! # use tracing_throttle::TracingRateLimitLayer;
//! let rate_limit = TracingRateLimitLayer::builder()
//!     .with_unlimited_signatures()  // ⚠️ Unbounded memory growth
//!     .build()
//!     .expect("valid config");
//! ```
//!
//! - ⚠️ **Use with extreme caution** - can cause unbounded memory growth
//! - ✅ Controlled environments (short-lived processes, tests)
//! - ✅ Known bounded cardinality with monitoring in place
//! - ✅ Memory constraints are not a concern
//! - ❌ **Never use** if logging includes UUIDs, timestamps, or other high-cardinality data
//!
//! ### Monitoring Memory Usage
//!
//! **Check signature count in production:**
//!
//! ```rust,no_run
//! # use tracing_throttle::TracingRateLimitLayer;
//! # use tracing::warn;
//! # let rate_limit = TracingRateLimitLayer::new();
//! // In a periodic health check or metrics reporter:
//! let sig_count = rate_limit.signature_count();
//! let evictions = rate_limit.metrics().signatures_evicted();
//!
//! // Thresholds below assume the default limit of 10,000 signatures.
//! if sig_count > 8000 {
//!     warn!("Approaching signature limit: {}/10000", sig_count);
//! }
//!
//! if evictions > 1000 {
//!     warn!("High eviction count: {} signatures evicted", evictions);
//! }
//! ```
//!
//! **Integrate with memory profilers:**
//!
//! ```bash
//! # Use Valgrind Massif for heap profiling
//! valgrind --tool=massif --massif-out-file=massif.out ./your-app
//!
//! # Analyze with ms_print
//! ms_print massif.out
//!
//! # Look for DashMap and EventState allocations
//! ```
//!
//! **Signs you need to adjust signature limits:**
//!
//! | Symptom | Likely Cause | Action |
//! |---------|--------------|--------|
//! | High eviction rate (>1000/min) | Cardinality > limit | Increase `max_signatures` |
//! | Memory growth over time | Unbounded cardinality | Fix logging (remove UUIDs), add limit |
//! | Low signature count (<100) | Over-provisioned | Can reduce limit safely |
//! | Frequent evictions + suppression | Limit too low | Increase limit or reduce cardinality |
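//!
//! The table above speaks of an eviction rate per minute, whereas `signatures_evicted()`
//! returns a count. Assuming that count is cumulative since startup (this section does not
//! state it explicitly), a rate can be derived from two samples. `evictions_per_minute`
//! is a hypothetical helper, not part of this crate.
//!
//! ```rust
//! use std::time::Duration;
//!
//! /// Converts two samples of a cumulative eviction counter into evictions per minute.
//! fn evictions_per_minute(previous: u64, current: u64, elapsed: Duration) -> f64 {
//!     let secs = elapsed.as_secs_f64();
//!     if secs <= 0.0 {
//!         // No time has passed between samples, so no rate can be computed.
//!         return 0.0;
//!     }
//!     // saturating_sub guards against a counter that was reset between samples.
//!     let delta = current.saturating_sub(previous);
//!     delta as f64 * 60.0 / secs
//! }
//! ```
//!
//! A periodic reporter could store the previous sample and compare the resulting rate with
//! the 1000/min threshold from the table.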

// Domain layer - pure business logic
pub mod domain;

// Application layer - orchestration
pub mod application;

// Infrastructure layer - external adapters
pub mod infrastructure;

// Re-export commonly used types for convenience
pub use domain::{
    policy::{
        CountBasedPolicy, ExponentialBackoffPolicy, Policy, PolicyDecision, PolicyError,
        RateLimitPolicy, TimeWindowPolicy,
    },
    signature::EventSignature,
    summary::{SuppressionCounter, SuppressionSummary},
};

pub use application::{
    circuit_breaker::{CircuitBreaker, CircuitBreakerConfig, CircuitState},
    emitter::EmitterConfigError,
    limiter::RateLimiter,
    metrics::{Metrics, MetricsSnapshot},
    ports::{Clock, Storage},
    registry::SuppressionRegistry,
};

#[cfg(feature = "async")]
pub use application::emitter::{EmitterHandle, ShutdownError};

pub use infrastructure::{
    clock::SystemClock,
    layer::{BuildError, TracingRateLimitLayer, TracingRateLimitLayerBuilder},
    storage::ShardedStorage,
};
290    layer::{BuildError, TracingRateLimitLayer, TracingRateLimitLayerBuilder},
291    storage::ShardedStorage,
292};