Crate tracing_timing
source · [−]Expand description
Inter-event timing metrics on top of tracing
.
This crate provides a tracing::Subscriber
that keeps statistics on inter-event timing
information. More concretely, given code like this:
use tracing::*;
use tracing_timing::{Builder, Histogram};
let subscriber = Builder::default().build(|| Histogram::new_with_max(1_000_000, 2).unwrap());
let dispatcher = Dispatch::new(subscriber);
dispatcher::with_default(&dispatcher, || {
trace_span!("request").in_scope(|| {
// do a little bit of work
trace!("fast");
// do a lot of work
trace!("slow");
})
});
You can produce something like this (see examples/pretty.rs
):
fast:
mean: 173.2µs, p50: 172µs, p90: 262µs, p99: 327µs, p999: 450µs, max: 778µs
25µs | * | 2.2th %-ile
50µs | * | 2.2th %-ile
75µs | * | 4.7th %-ile
100µs | *** | 11.5th %-ile
125µs | ***** | 24.0th %-ile
150µs | ******* | 41.1th %-ile
175µs | ******** | 59.2th %-ile
200µs | ******* | 75.4th %-ile
225µs | ** | 80.1th %-ile
250µs | *** | 87.3th %-ile
275µs | *** | 94.4th %-ile
300µs | ** | 97.8th %-ile
slow:
mean: 623.3µs, p50: 630µs, p90: 696µs, p99: 770µs, p999: 851µs, max: 950µs
500µs | * | 1.6th %-ile
525µs | ** | 4.8th %-ile
550µs | *** | 10.9th %-ile
575µs | ***** | 22.2th %-ile
600µs | ******* | 37.9th %-ile
625µs | ******** | 55.9th %-ile
650µs | ******* | 72.9th %-ile
675µs | ****** | 85.6th %-ile
700µs | **** | 93.5th %-ile
725µs | ** | 97.1th %-ile
When TimingSubscriber
is used as the tracing::Dispatch
, the time between each event in a
span is measured using quanta
, and is recorded in “high dynamic range histograms” using
hdrhistogram
’s multi-threaded recording facilities. The recorded timing information is
grouped using the SpanGroup
and EventGroup
traits, allowing you to combine recorded
statistics across spans and events.
Extracting timing histograms
The crate does not implement a mechanism for recording the resulting histograms. Instead, you
can implement this as you see fit using TimingSubscriber::with_histograms
. It gives you
access to the histograms for all groups. Note that you must call refresh()
on each histogram
to see its latest values (see hdrhistogram::SyncHistogram
). Note that calling refresh()
will block until the next event is posted, so you may want TimingSubscriber::force_synchronize
instead.
To access the histograms later, use tracing::Dispatch::downcast_ref
. If your type is hard to
name, you can use a TimingSubscriber::downcaster
instead.
use tracing::*;
use tracing_timing::{Builder, Histogram, TimingSubscriber};
let subscriber = Builder::default().build(|| Histogram::new_with_max(1_000_000, 2).unwrap());
let dispatch = Dispatch::new(subscriber);
// ...
// code that hands off clones of the dispatch, maybe to other threads.
// for example:
tracing::dispatcher::set_global_default(dispatch.clone())
.expect("setting tracing default failed");
// (note that Dispatch implements Clone, so you can keep a handle to it!)
//
// then, at some later time, in some other place, you can call:
// ...
dispatch.downcast_ref::<TimingSubscriber>().unwrap().with_histograms(|hs| {
for (span_group, hs) in hs {
for (event_group, h) in hs {
// make sure we see the latest samples:
h.refresh();
// print the median:
println!("{} -> {}: {}ns", span_group, event_group, h.value_at_quantile(0.5))
}
}
});
See the documentation for hdrhistogram
for more on what you can do once you have the
histograms.
Using the Layer
API
To use the Layer
API from tracing-subscriber
, you first need to enable the layer
feature.
Then, use Builder::layer
to construct a layer that you can mix in with other layers:
use tracing_subscriber::{registry::Registry, Layer};
let timing_layer = Builder::default()
.layer(|| Histogram::new_with_max(1_000_000, 2).unwrap());
let dispatch = Dispatch::new(timing_layer.with_subscriber(Registry::default()));
Grouping samples
By default, TimingSubscriber
groups samples by the “name” of the containing span and the
“message” of the relevant event. These are the first string parameter you pass to each of the
relevant tracing macros. You can override this behavior either by providing your own
implementation of SpanGroup
and EventGroup
to Builder::spans
and
Builder::events
respectively. There are also a number of pre-defined “groupers” in the
group
module that cover the most common cases.
Interpreting the output
To understand the output you get from tracing-timing
, a more complicated example might help.
Consider the following tracing code:
loop {
trace_span!("foo").in_scope(|| {
trace!("foo_start");
std::thread::sleep(std::time::Duration::from_millis(1));
trace_span!("bar").in_scope(|| {
trace!("bar_start");
std::thread::sleep(std::time::Duration::from_millis(1));
if rand() > 0.5 {
trace!("bar_mid1");
std::thread::sleep(std::time::Duration::from_millis(1));
} else {
trace!("bar_mid2");
std::thread::sleep(std::time::Duration::from_millis(2));
}
trace!("bar_end");
});
trace!("foo_end");
})
}
What histogram data would you expect to see for each event?
Well, let’s walk through it. There are two span groups: “foo” and “bar”. “bar” will contain
entries for the events “bar_start”, the “bar_mid“s, and “bar_end”. “foo” will contain entries
for the events “foo_start”, “foo_end”, and all the “bar” events by default (see
Builder::no_span_recursion
). Let’s look at each of those in turn:
- “foo_start” is easy: it tracks the time since “foo” was created.
- “foo_end” is mostly easy: it tracks the time since “bar_end”. If span recursion is disabled, it instead tracks the time since “foo_start”.
- “bar_start” is trickier: in “bar”, it tracks the time since “bar” was entered. in “foo”, it contains the time since “foo_start”.
- the “bar_mid“s are easy: they both track the time since “bar_start”.
- “bar_end” is tricky: it tracks the time since whichever event of “bar_mid1” and “bar_mid2” happened! The histogram will show a bi-modal distribution with one peak at 1ms, and one peak at 2ms. If you want to be able to distinguish these, you will have to insert additional tracing events inside the branches.
Timing information over time
Every time you refresh a histogram, it incorporates any new timing samples recorded since the
last call to refresh
, allowing you to see timing metrics across all time. If you are
monitoring the health of a continuously running system, you may instead wish to only see
metrics across a limited window of time. You can do this by clearing the histogram in
TimingSubscriber::with_histograms
before refreshing them, or periodically as you see fit.
Usage notes:
Event timing is per span, not per span per thread. This means that if you emit events for the same span concurrently from multiple threads, you may see weird timing information.
Span creation takes a lock. This means that you will generally want to avoid creating extraneous spans. One technique that works well here is subsampling your application, for example by only creating tracking spans for some of your requests.
Modules
Commonly used span and event grouping mechanisms.
Structs
Builder for TimingSubscriber
instances.
Histogram
is the core data structure in HdrSample. It records values, and performs analytics.
A convenience type for getting access to TimingLayer
through a Dispatch
.
A convenience type for getting access to TimingSubscriber
through a Dispatch
.
Timing-gathering tracing layer.
Timing-gathering tracing subscriber.
Traits
Translate attributes from a tracing event into a timing event group.
Translate attributes from a tracing span into a timing span group.