Module tower::load

Available on crate feature load only.

Service load measurement

This module provides the Load trait, which allows measuring how loaded a service is. It also provides several wrapper types that measure load in different ways:

  • Constant — Always returns the same constant load value for a service.
  • PendingRequests — Measures load by tracking the number of in-flight requests.
  • PeakEwma — Measures load using a moving average of the peak latency for the service.

In general, you will want to use one of these when using the types in tower::balance, which balance services depending on their load. Which load metric to use depends on your exact use case, but the ones above should cover many common cases.
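To make the idea of a comparable load metric more concrete, here is a minimal, self-contained sketch. It deliberately does not depend on the tower crate: the `Load` trait below is a simplified stand-in that only mirrors the general shape of tower's trait (an associated `Metric` that can be ordered), and `ConstantLoad`, `InFlight`, and `least_loaded` are hypothetical names invented for this illustration.

```rust
use std::cell::Cell;

/// Simplified stand-in for `tower::load::Load` (an assumption for this sketch).
trait Load {
    type Metric: PartialOrd;
    fn load(&self) -> Self::Metric;
}

/// Hypothetical estimator that always reports the same load.
struct ConstantLoad(u32);

impl Load for ConstantLoad {
    type Metric = u32;
    fn load(&self) -> u32 {
        self.0
    }
}

/// Hypothetical estimator that reports the number of in-flight requests.
struct InFlight {
    pending: Cell<u32>,
}

impl InFlight {
    fn new() -> Self {
        InFlight { pending: Cell::new(0) }
    }

    /// Call when a request is dispatched.
    fn start(&self) {
        self.pending.set(self.pending.get() + 1);
    }

    /// Call when a request is considered complete.
    fn finish(&self) {
        self.pending.set(self.pending.get().saturating_sub(1));
    }
}

impl Load for InFlight {
    type Metric = u32;
    fn load(&self) -> u32 {
        self.pending.get()
    }
}

/// Returns the index of the least-loaded entry, or `None` for an empty slice.
/// Ties are resolved in favor of the earliest entry.
fn least_loaded<L: Load>(services: &[L]) -> Option<usize> {
    let mut best: Option<(usize, L::Metric)> = None;
    for (i, svc) in services.iter().enumerate() {
        let load = svc.load();
        let is_better = match &best {
            Some((_, best_load)) => load < *best_load,
            None => true,
        };
        if is_better {
            best = Some((i, load));
        }
    }
    best.map(|(i, _)| i)
}
```

Because the only requirement on `Metric` is that it can be compared, a balancer written against such a trait does not need to know whether the number behind it is a constant, a request count, or a latency estimate.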

When the discover feature is enabled, wrapper types for Discover that wrap the discovered services with the given load estimator are also provided.

When does a request complete?

For many applications, the request life-cycle is relatively simple: when a service responds to a request, that request is done, and the system can forget about it. However, for some applications, the service may respond to the initial request while other parts of the system are still acting on that request. In such an application, the system load must take these requests into account as well, or risk the system underestimating its own load.

To support these use-cases, the load estimators in this module are parameterized by the TrackCompletion trait, with CompleteOnResponse as the default type. The behavior of CompleteOnResponse is what you would normally expect for a request-response cycle: when the response is produced, the request is considered “finished”, and load goes down. This can be overridden by your own user-defined type to track more complex request completion semantics. See the documentation for completion for more details.
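The difference between completing on response and a custom completion policy can be sketched without the tower crate. In the sketch below, the `TrackCompletion` trait is a simplified stand-in modeled on the description above, and `Handle`, `InFlightCounter`, `OnResponse`, `Tracked`, and `UntilResponseDropped` are hypothetical names; the assumption is that a request counts as in flight for as long as its handle is alive.

```rust
use std::sync::Arc;

/// Simplified stand-in for a completion-tracking trait (an assumption for this sketch).
trait TrackCompletion<H, V> {
    type Output;
    fn track_completion(&self, handle: H, value: V) -> Self::Output;
}

/// Hypothetical handle: the request is in flight while this value is alive.
struct Handle(Arc<()>);

/// Hypothetical counter that derives load from the number of live handles.
struct InFlightCounter(Arc<()>);

impl InFlightCounter {
    fn new() -> Self {
        InFlightCounter(Arc::new(()))
    }

    /// Hands out a handle for a newly dispatched request.
    fn start_request(&self) -> Handle {
        Handle(Arc::clone(&self.0))
    }

    /// Live handles, not counting the counter's own reference.
    fn load(&self) -> usize {
        Arc::strong_count(&self.0) - 1
    }
}

/// Policy 1: the request is finished as soon as the response exists.
struct OnResponse;

impl<H, V> TrackCompletion<H, V> for OnResponse {
    type Output = V;
    fn track_completion(&self, handle: H, value: V) -> V {
        drop(handle); // load goes down immediately
        value
    }
}

/// A response that keeps its request's handle alive.
struct Tracked<V> {
    value: V,
    _handle: Handle,
}

/// Policy 2: the request is finished only when the response itself is dropped,
/// for example after a streaming body has been fully consumed.
struct UntilResponseDropped;

impl<V> TrackCompletion<Handle, V> for UntilResponseDropped {
    type Output = Tracked<V>;
    fn track_completion(&self, handle: Handle, value: V) -> Tracked<V> {
        Tracked { value, _handle: handle }
    }
}
```

With the second policy, the reported load stays elevated until the caller lets go of the `Tracked` value, which is the kind of behavior an application needs when work continues after the initial response.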


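Examples

use tower::util::ServiceExt;
use tower::{load::Load, Service};

// Sends the request to whichever of the two services reports the lower load.
async fn simple_balance<S1, S2, R>(
    svc1: &mut S1,
    svc2: &mut S2,
    request: R
) -> Result<S1::Response, S1::Error>
where
    S1: Load + Service<R>,
    S2: Load<Metric = S1::Metric> + Service<R, Response = S1::Response, Error = S1::Error>
{
    if svc1.load() < svc2.load() {
        // Wait until the chosen service is ready before calling it.
        svc1.ready().await?.call(request).await
    } else {
        svc2.ready().await?.call(request).await
    }
}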


Re-exports

pub use self::completion::CompleteOnResponse;
pub use self::completion::TrackCompletion;
pub use self::peak_ewma::PeakEwma;
pub use self::pending_requests::PendingRequests;
pub use self::peak_ewma::PeakEwmaDiscover;
pub use self::pending_requests::PendingRequestsDiscover;


Modules

completion: Application-specific request completion semantics.
peak_ewma: A Load implementation that measures load using the PeakEWMA response latency.
pending_requests: A Load implementation that measures load using the number of in-flight requests.

Structs

Constant: Wraps a type so that it implements Load and returns a constant load metric.

Traits

Load: Types that implement this trait can give an estimate of how loaded they are.
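As a rough illustration of what a moving average of the peak latency can look like, the sketch below shows one common way to compute a peak-sensitive exponentially weighted moving average. It is not taken from the tower source: `PeakEwmaEstimate`, its fields, and the use of seconds as the unit are all assumptions made for this example.

```rust
/// Hypothetical peak-EWMA latency estimator (not tower's implementation).
struct PeakEwmaEstimate {
    /// Current latency estimate, in seconds.
    rtt: f64,
    /// Time constant of the exponential decay, in seconds.
    decay: f64,
}

impl PeakEwmaEstimate {
    /// Starts from a default latency guess, used until real samples arrive.
    fn new(default_rtt: f64, decay: f64) -> Self {
        PeakEwmaEstimate { rtt: default_rtt, decay }
    }

    /// Folds in a latency sample observed `elapsed` seconds after the
    /// previous update.
    fn observe(&mut self, sample: f64, elapsed: f64) {
        if sample > self.rtt {
            // Peak: react to a latency spike immediately.
            self.rtt = sample;
        } else {
            // Otherwise move toward the sample; the longer it has been
            // since the last update, the less the old estimate counts.
            let weight = (-elapsed / self.decay).exp();
            self.rtt = self.rtt * weight + sample * (1.0 - weight);
        }
    }
}
```

The asymmetry is the point of the design: a slow response raises the estimate at once, while recovery is gradual, so a service that has recently been slow is not trusted again too quickly.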