[][src]Crate tracing_timing

Inter-event timing metrics on top of tracing.

This crate provides a tracing::Subscriber that keeps statistics on inter-event timing information. More concretely, given code like this:

use tracing::*;
use tracing_timing::{Builder, Histogram};
let subscriber = Builder::default().build(|| Histogram::new_with_max(1_000_000, 2).unwrap());
let dispatcher = Dispatch::new(subscriber);
dispatcher::with_default(&dispatcher, || {
    trace_span!("request").in_scope(|| {
        // do a little bit of work
        trace!("fast");
        // do a lot of work
        trace!("slow");
    })
});

You can produce something like this (see examples/pretty.rs):

fast:
mean: 173.2µs, p50: 172µs, p90: 262µs, p99: 327µs, p999: 450µs, max: 778µs
  25µs | *                                        |  2.2th %-ile
  50µs | *                                        |  2.2th %-ile
  75µs | *                                        |  4.7th %-ile
 100µs | ***                                      | 11.5th %-ile
 125µs | *****                                    | 24.0th %-ile
 150µs | *******                                  | 41.1th %-ile
 175µs | ********                                 | 59.2th %-ile
 200µs | *******                                  | 75.4th %-ile
 225µs | **                                       | 80.1th %-ile
 250µs | ***                                      | 87.3th %-ile
 275µs | ***                                      | 94.4th %-ile
 300µs | **                                       | 97.8th %-ile

slow:
mean: 623.3µs, p50: 630µs, p90: 696µs, p99: 770µs, p999: 851µs, max: 950µs
 500µs | *                                        |  1.6th %-ile
 525µs | **                                       |  4.8th %-ile
 550µs | ***                                      | 10.9th %-ile
 575µs | *****                                    | 22.2th %-ile
 600µs | *******                                  | 37.9th %-ile
 625µs | ********                                 | 55.9th %-ile
 650µs | *******                                  | 72.9th %-ile
 675µs | ******                                   | 85.6th %-ile
 700µs | ****                                     | 93.5th %-ile
 725µs | **                                       | 97.1th %-ile

When TimingSubscriber is used as the tracing::Dispatch, the time between each event in a span is measured using quanta, and is recorded in "high dynamic range histograms" using hdrhistogram's multi-threaded recording facilities. The recorded timing information is grouped using the SpanGroup and EventGroup traits, allowing you to combine recorded statistics across spans and events.

Extracting timing histograms

The crate does not implement a mechanism for recording the resulting histograms. Instead, you can implement this as you see fit using TimingSubscriber::with_histograms. It gives you access to the histograms for all groups. Note that you must call refresh() on each histogram to see its latest values (see hdrhistogram::SyncHistogram). Note that calling refresh() will block until the next event is posted, so you may want TimingSubscriber::force_synchronize instead.

To access the histograms later, use tracing::Dispatch::downcast_ref. If your type is hard to name, you can use a TimingSubscriber::downcaster instead.

use tracing::*;
use tracing_timing::{Builder, Histogram, TimingSubscriber};
let subscriber = Builder::default().build(|| Histogram::new_with_max(1_000_000, 2).unwrap());
let dispatch = Dispatch::new(subscriber);
// ...
// code that hands off clones of the dispatch, maybe to other threads.
// for example:
tracing::dispatcher::set_global_default(dispatch.clone())
  .expect("setting tracing default failed");
// (note that Dispatch implements Clone,  so you can keep a handle to it!)
//
// then, at some later time, in some other place, you can call:
// ...
dispatch.downcast_ref::<TimingSubscriber>().unwrap().with_histograms(|hs| {
    for (span_group, hs) in hs {
        for (event_group, h) in hs {
            // make sure we see the latest samples:
            h.refresh();
            // print the median:
            println!("{} -> {}: {}ns", span_group, event_group, h.value_at_quantile(0.5))
        }
    }
});

See the documentation for hdrhistogram for more on what you can do once you have the histograms.

Using the Layer API

To use the Layer API from tracing-subscriber, you first need to enable the layer feature. Then, use [Builder::layer] to construct a layer that you can mix in with other layers:

use tracing_subscriber::{registry::Registry, Layer};
let timing_layer = Builder::default()
    .layer(|| Histogram::new_with_max(1_000_000, 2).unwrap());
let dispatch = Dispatch::new(timing_layer.with_subscriber(Registry::default()));

Grouping samples

By default, TimingSubscriber groups samples by the "name" of the containing span and the "message" of the relevant event. These are the first string parameter you pass to each of the relevant tracing macros. You can override this behavior either by providing your own implementation of SpanGroup and EventGroup to Builder::spans and Builder::events respectively. There are also a number of pre-defined "groupers" in the group module that cover the most common cases.

Interpreting the output

To understand the output you get from tracing-timing, a more complicated example might help. Consider the following tracing code:

loop {
    trace_span!("foo").in_scope(|| {
        trace!("foo_start");
        std::thread::sleep(std::time::Duration::from_millis(1));
        trace_span!("bar").in_scope(|| {
            trace!("bar_start");
            std::thread::sleep(std::time::Duration::from_millis(1));
            if rand() > 0.5 {
                trace!("bar_mid1");
                std::thread::sleep(std::time::Duration::from_millis(1));
            } else {
                trace!("bar_mid2");
                std::thread::sleep(std::time::Duration::from_millis(2));
            }
            trace!("bar_end");
        });
        trace!("foo_end");
    })
}

What histogram data would you expect to see for each event?

Well, let's walk through it. There are two span groups: "foo" and "bar". "bar" will contain entries for the events "bar_start", the "bar_mid"s, and "bar_end". "foo" will contain entries for the events "foo_start", "foo_end", and all the "bar" events by default (see Builder::no_span_recursion). Let's look at each of those in turn:

  • "foo_start" is easy: it tracks the time since "foo" was created.
  • "foo_end" is mostly easy: it tracks the time since "bar_end". If span recursion is disabled, it instead tracks the time since "foo_start".
  • "bar_start" is trickier: in "bar", it tracks the time since "bar" was entered. in "foo", it contains the time since "foo_start".
  • the "bar_mid"s are easy: they both track the time since "bar_start".
  • "bar_end" is tricky: it tracks the time since whichever event of "bar_mid1" and "bar_mid2" happened! The histogram will show a bi-modal distribution with one peak at 1ms, and one peak at 2ms. If you want to be able to distinguish these, you will have to insert additional tracing events inside the branches.

Timing information over time

Every time you refresh a histogram, it incorporates any new timing samples recorded since the last call to refresh, allowing you to see timing metrics across all time. If you are monitoring the health of a continuously running system, you may instead wish to only see metrics across a limited window of time. You can do this by clearing the histogram in TimingSubscriber::with_histograms before refreshing them, or periodically as you see fit.

Usage notes:

Event timing is per span, not per span per thread. This means that if you emit events for the same span concurrently from multiple threads, you may see weird timing information.

Span creation takes a lock. This means that you will generally want to avoid creating extraneous spans. One technique that works well here is subsampling your application, for example by only creating tracking spans for some of your requests.

Modules

group

Commonly used span and event grouping mechanisms.

Structs

Builder

Builder for TimingSubscriber instances.

Histogram

Histogram is the core data structure in HdrSample. It records values, and performs analytics.

SubscriberDowncaster

A convenience type for getting access to TimingSubscriber through a Dispatch.

TimingSubscriber

Timing-gathering tracing subscriber.

Traits

EventGroup

Translate attributes from a tracing event into a timing event group.

SpanGroup

Translate attributes from a tracing span into a timing span group.

Type Definitions

HashMap

A standard library HashMap with a faster hasher.

Hasher

A faster hasher for tracing-timing maps.