When emitting high-frequency metrics, you often want to aggregate multiple observations into a single metric entry rather than emitting each one individually. This crate provides an aggregation system for metrique that collects observations and emits them as distributions, sums, or other aggregate forms.
## When to Use Aggregation
Consider aggregation when:
- High-frequency, low-level events: TLS handshakes, storage operations, or other infrastructure-level metrics
- Fan-out operations: A single unit of work spawns multiple sub-operations you want to aggregate
- Background processing: Queue workers that generate one metric per processed item at an extremely high rate
Sampling raw records is often a better approach than aggregation (and the two are often best combined!). Preserving raw records can make it much easier to debug issues.
Together, metrique and metrique aggregation let you do both:
- Emit a sampled set of raw events (e.g. with congressional sampling to ensure all errors are preserved) to one sink
- Emit aggregated events to a second sink
This can let you get the best of both worlds. For more info on this pattern, see the split example.
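As a std-only illustration of this split (all names here — `Record`, `SplitSink`, `offer` — are made up for this sketch and are not metrique's API), each record is offered by reference to two consumers: a "raw" consumer that samples events (keeping every error, in the spirit of congressional sampling), and an aggregating consumer that folds every record into totals:

```rust
// Std-only sketch of the "best of both worlds" split. Not metrique's API.
#[derive(Debug)]
pub struct Record {
    pub error: bool,
    pub latency_ms: u64,
}

#[derive(Default)]
pub struct SplitSink {
    pub sampled: Vec<u64>, // raw events kept for debugging (all errors here)
    pub total_ms: u64,     // aggregate view: sum over *every* record
    pub count: u64,
}

impl SplitSink {
    // Takes the record by reference so both "sinks" can observe the same data.
    pub fn offer(&mut self, r: &Record) {
        if r.error {
            self.sampled.push(r.latency_ms); // raw sink: keep sampled events
        }
        self.total_ms += r.latency_ms; // aggregate sink: merges everything
        self.count += 1;
    }
}

fn main() {
    let mut sink = SplitSink::default();
    for r in [
        Record { error: false, latency_ms: 10 },
        Record { error: true, latency_ms: 250 },
        Record { error: false, latency_ms: 20 },
    ] {
        sink.offer(&r);
    }
    assert_eq!(sink.count, 3);
    assert_eq!(sink.total_ms, 280);
    assert_eq!(sink.sampled, vec![250]); // the error event was preserved
}
```

Because `offer` borrows the record, the same data can reach any number of destinations — this is the same reason the real split pattern requires merge-by-reference.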
## Examples
This crate includes several complete examples:
- `embedded` - Distributed query with `Aggregate<T>`
- `sink_level` - Queue processor with `WorkerSink` and `KeyedAggregator`
- `split` - `TeeSink` pattern showing aggregation + raw events
- `histogram` - Histogram usage patterns and strategies
Run examples with `cargo run --example <name>`.
## Quick Start

### Embedded Aggregation
Use the `aggregate` macro to mark a `#[metrics]` struct as aggregatable. You will need to define strategies for each field to describe how multiple items will be merged.
```rust
// NOTE: the import paths and rustdoc-hidden lines were lost from this page;
// see the embedded example for the complete code. The field shape below is
// inferred from the output shown underneath.
// use ...::{metrics, GlobalEntrySink, Aggregate}; // paths elided
use std::time::Duration;

#[metrics]
#[aggregate]
struct QueryMetrics {
    request_id: String,   // same across merged entries
    latency: Duration,    // aggregated as a Histogram
    response_size: usize, // aggregated with Sum
}
```
Output: a single metric entry with `RequestId: "query-123"`, `Latency: [45ms, 67ms]`, `ResponseSize: 3072`.
### Sink-Level Aggregation
Use `WorkerSink` or `MutexSink` when you want to produce aggregated metric entries where the entire entry is aggregated. Both can be combined with `KeyedAggregator` to aggregate against a set of keys, or with `Aggregate` when there are no keys. These sinks are backed by a traditional sink that emits to EMF or another destination.
`WorkerSink` performs aggregation in a background thread that periodically flushes aggregated data to a backing sink. `MutexSink` is an alternative that manages concurrency with a mutex instead of a channel.
```rust
// NOTE: import paths and most of the body were lost from this page; see the
// sink_level example for the complete code. The shape below is inferred from
// the surrounding text and the output shown underneath.
// use ...::{metrics, Timer, DropGuard, KeyedAggregator, WorkerSink}; // paths elided
use std::time::Duration;

#[metrics]
#[aggregate]
struct WorkItemMetrics {
    // #[aggregate(key)] fields: item_type, priority
    // aggregated fields: items_processed (Sum), processing_time (Histogram)
}

async fn process_item(/* worker sink handle elided */) {
    // each processed item records one WorkItemMetrics entry; the WorkerSink's
    // background thread merges entries by key and periodically flushes them
    // to the backing sink
}
```
Output: multiple aggregated entries like `ItemType: "email"`, `Priority: 1`, `ItemsProcessed: 1247`, `ProcessingTime: [histogram]`.
Choosing between `WorkerSink` and `MutexSink`:

- `MutexSink` - Use when you have inputs from a smaller number of threads. Great for supporting `close_and_merge` with embedded metrics. Currently does not support automatic flushing.
- `WorkerSink` - Use for sink-level aggregation from many producers across many threads. The channel-based design reduces contention and provides configurable flush timing.
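To make the channel-based design concrete, here is a std-only sketch of the same idea: producers send entries over a channel, and a background thread merges them by key, "flushing" when the channel closes. All names (`Entry`, `spawn_worker`) are illustrative, not metrique's API, and a real `WorkerSink` also flushes on a timer rather than only at shutdown:

```rust
// Channel-based aggregation sketch in std only. Not metrique's API.
use std::collections::HashMap;
use std::sync::mpsc;
use std::thread;

type Entry = (String, u64); // (grouping key, count to sum)

fn spawn_worker(rx: mpsc::Receiver<Entry>) -> thread::JoinHandle<HashMap<String, u64>> {
    thread::spawn(move || {
        let mut aggregated = HashMap::new();
        for (key, count) in rx {
            // merge: entries with the same key are summed together
            *aggregated.entry(key).or_insert(0) += count;
        }
        aggregated // final flush once all senders are dropped
    })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = spawn_worker(rx);
    for _ in 0..3 {
        tx.send(("email".to_string(), 1)).unwrap();
    }
    tx.send(("sms".to_string(), 5)).unwrap();
    drop(tx); // closing the channel lets the worker flush and exit
    let totals = worker.join().unwrap();
    assert_eq!(totals["email"], 3);
    assert_eq!(totals["sms"], 5);
}
```

The channel is what keeps producers contention-free: each sender only pays for a send, never for the merge itself.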
See the `sink_level` example for a complete working implementation.
## Core Concepts

### Field-Level Strategies
Individual fields use aggregation strategies that implement `AggregateValue<T>`:

- `Sum` - Sums values together (use for counts, totals)
- `Histogram<T>` - Collects values into a distribution (use for latency, sizes)
- `KeepLast` - Keeps the most recent value (use for gauges, current state)
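The behavior of the three strategies can be sketched with a hand-rolled trait (names mirror this doc, but the real `AggregateValue` trait in metrique has a different signature; `Distribution` here is a stand-in for `Histogram`):

```rust
// Hand-rolled sketch of the three field strategies. Not metrique's API.
trait Strategy {
    fn observe(&mut self, value: u64);
}

#[derive(Default)]
struct Sum(u64); // counts, totals
impl Strategy for Sum {
    fn observe(&mut self, v: u64) { self.0 += v; }
}

#[derive(Default)]
struct KeepLast(Option<u64>); // gauges, current state
impl Strategy for KeepLast {
    fn observe(&mut self, v: u64) { self.0 = Some(v); }
}

#[derive(Default)]
struct Distribution(Vec<u64>); // stand-in for Histogram: keeps the spread
impl Strategy for Distribution {
    fn observe(&mut self, v: u64) { self.0.push(v); }
}

fn main() {
    let (mut sum, mut last, mut dist) =
        (Sum::default(), KeepLast::default(), Distribution::default());
    for v in [10, 20, 30] {
        sum.observe(v);
        last.observe(v);
        dist.observe(v);
    }
    assert_eq!(sum.0, 60);                // Sum folded everything together
    assert_eq!(last.0, Some(30));         // KeepLast kept only the newest value
    assert_eq!(dist.0, vec![10, 20, 30]); // the distribution kept the shape
}
```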
### Entry-Level Aggregation

The `aggregate` macro generates implementations that define how complete entries are combined. It creates the merge logic, key extraction, and aggregation strategy for your type.
### Keys

Fields marked with `#[aggregate(key)]` become grouping keys. Entries with the same key are merged together when using a `KeyedAggregator`.
```rust
// NOTE: import paths and the struct body were lost from this page; the shape
// below is inferred from the surrounding text.
// use ...::metrics; // path elided
use std::time::Duration;

#[metrics]
#[aggregate]
struct EndpointMetrics {
    #[aggregate(key)]
    endpoint: String,  // entries with equal endpoints merge together
    latency: Duration,
}
```
Calls to the same endpoint will be aggregated together, while different endpoints remain separate.
## Aggregation Traits and How They Work Together
The aggregation system is built on several traits that work together:
- `AggregateValue<T>` - Defines how individual field values are merged (`Sum`, `Histogram`, `KeepLast`)
- `Merge` - Defines how complete entries are merged together by consuming the source
- `MergeRef` - Like `Merge`, but merges by reference (enables `TeeSink` to send to multiple destinations)
- `Key` - Extracts grouping keys from entries to determine which entries should be merged
- `AggregateStrategy` - Ties together the source type, merge behavior, and key extraction
- `AggregateSink<T>` - A destination that accepts and aggregates entries
The `aggregate` macro generates implementations of these traits for your type. For most use cases, you don't need to implement them manually; the macro handles it.
For more detail, see the traits module.
## Advanced Usage Patterns

### Split Aggregation
Using `#[aggregate(ref)]` makes it possible to send the same record to multiple sinks. This allows aggregation by different sets of keys, as well as sending the individual, unaggregated record directly to a sink.

You can use `TeeSink` to send the same data to multiple destinations, which is useful for combining precise aggregated metrics with sampled individual events. Split aggregation also allows aggregating the same metric by multiple different sets of dimensions (see the split example).
```rust
// NOTE: import paths and the struct body were lost from this page; see the
// split example for the complete code.
// use ...::{KeyedAggregator, TeeSink, metrics}; // paths elided
use std::time::Duration;

// to use multi-sink aggregation, it must be possible to aggregate by reference:
#[metrics]
#[aggregate(ref)]
struct SplitMetrics { /* keys + aggregated fields elided */ }
```
This gives you:
- Precise aggregated metrics: Exact counts and distributions
- Raw event samples: Individual events for tracing and debugging
See the `split` example for a complete working implementation.
## Histograms
When aggregating data, a `Histogram` is often the best tool. Flattening state down to a "gauge" field, such as with `KeepLast`, often loses critical information, while a histogram captures a much richer picture. Histograms collect observations into distributions, letting you track percentiles, min, max, and other statistical properties. They can be used with `#[aggregate]` or embedded directly in your metrics.
### Basic Usage
```rust
// NOTE: import paths and the struct body were lost from this page; see the
// histogram example for the complete code.
// use ...::{metrics, Histogram, Millisecond}; // paths elided
use std::time::Duration;

#[metrics]
struct RequestMetrics {
    latency: Histogram<Duration>, // emitted as a Millisecond distribution
}
```
### Aggregation Strategies
Histograms support different bucketing strategies:
- `ExponentialAggregationStrategy` (default) - Exponential bucketing with ~6.25% error, memory efficient
- `SortAndMerge` - Stores all observations exactly for perfect precision
- `AtomicExponentialAggregationStrategy` - Thread-safe exponential bucketing for `SharedHistogram`, and its default strategy
```rust
// NOTE: paths and bodies were elided on this page. A strategy is selected
// where the histogram field is declared; see the histogram example.
// use ...::{Histogram, SortAndMerge}; // paths elided
use std::time::Duration;
```
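For intuition about the exponential strategy's error bound, here is a std-only sketch: buckets grow by a fixed factor and every value reports its bucket's midpoint. The growth factor 1.125 is an assumption for illustration (metrique's exact constants may differ), but with it the worst-case relative error works out to exactly 6.25%:

```rust
// Exponential bucketing sketch: bucket k covers [GROWTH^k, GROWTH^(k+1)),
// and every value in a bucket is reported as the bucket midpoint.
// GROWTH = 1.125 is illustrative, not metrique's actual constant.
const GROWTH: f64 = 1.125;

fn bucket_index(value: f64) -> i32 {
    (value.ln() / GROWTH.ln()).floor() as i32
}

fn bucket_midpoint(index: i32) -> f64 {
    let low = GROWTH.powi(index);
    (low + low * GROWTH) / 2.0
}

fn main() {
    // About 120 buckets span a millionfold range, yet error stays within 6.25%.
    for v in [1.0, 42.0, 1234.5, 1_000_000.0] {
        let approx = bucket_midpoint(bucket_index(v));
        let rel_err = (approx - v).abs() / v;
        assert!(rel_err <= 0.0625 + 1e-9, "value {v}: error {rel_err}");
    }
}
```

This is the trade the list above describes: a few hundred counters instead of storing every observation (`SortAndMerge`), at the cost of a small bounded relative error.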
### Thread-Safe Histograms

For concurrent access, use `SharedHistogram`:
```rust
// NOTE: paths and body elided on this page; see the histogram example.
// use ...::SharedHistogram; // path elided
use std::sync::Arc;
// share one SharedHistogram (e.g. behind an Arc) and record from many threads
```
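The idea behind an atomic strategy can be sketched with std atomics: a fixed array of `AtomicU64` bucket counters shared via `Arc`, so many threads record without taking a lock. The power-of-two bucketing and all names here are purely illustrative, not metrique's API:

```rust
// Thread-safe histogram sketch: lock-free recording into atomic buckets.
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

const BUCKETS: usize = 64;

// Bucket by position of the highest set bit (coarse, for illustration only).
fn bucket_for(value_ms: u64) -> usize {
    (63 - value_ms.max(1).leading_zeros() as usize).min(BUCKETS - 1)
}

fn main() {
    let hist: Arc<[AtomicU64; BUCKETS]> =
        Arc::new(std::array::from_fn(|_| AtomicU64::new(0)));

    let handles: Vec<_> = (0..4)
        .map(|_| {
            let hist = Arc::clone(&hist);
            thread::spawn(move || {
                for v in [3, 9, 150] {
                    // each thread records concurrently, no mutex involved
                    hist[bucket_for(v)].fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }

    let total: u64 = hist.iter().map(|b| b.load(Ordering::Relaxed)).sum();
    assert_eq!(total, 12); // 4 threads x 3 observations each
}
```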
See the `histogram` example for more usage patterns.