# tower-acc

Adaptive concurrency control for Tower services.
tower-acc dynamically adjusts the number of in-flight requests a service is
allowed to handle, based on observed latency. Instead of picking a fixed
concurrency limit and hoping it's right, it continuously measures round-trip
times and converges on the optimal limit automatically — increasing it when
latency is low and decreasing it when queuing is detected.
## Why not a static limit?

Tower ships with `ConcurrencyLimit`, which caps concurrency at a
value you choose at startup. That works when the capacity of the downstream
service is known and stable, but in practice:
- Backends scale up and down.
- Dependency latency varies with load.
- The "right" limit depends on conditions you can't predict at deploy time.
Setting the limit too low wastes capacity; setting it too high causes queuing,
tail-latency spikes, and cascading failures under load. tower-acc removes the
guesswork by adapting the limit at runtime.
## Algorithms

Three built-in algorithms are provided. All are configurable through builder
APIs and implement the `Algorithm` trait.
### AIMD

A loss-based algorithm (like TCP Reno). Increases the limit by 1 on each
successful response and multiplies by a backoff ratio on errors or timeouts.
Simple and predictable, but only reacts to failures, not to latency changes.
```rust
// Type and module names are illustrative; see the crate docs for the exact API.
use tower_acc::AimdLayer;

let layer = AimdLayer::new();
```
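For intuition, the AIMD update rule itself can be sketched in a few lines. This is a standalone illustration with made-up field names (`backoff_ratio`, `max_limit`), not the crate's actual types:

```rust
/// Standalone sketch of an AIMD update rule; illustrative types,
/// not tower-acc's actual API.
struct Aimd {
    limit: f64,
    max_limit: f64,
    backoff_ratio: f64, // e.g. 0.9: keep 90% of the limit on failure
}

impl Aimd {
    /// Additive increase: grow the limit by one per successful response.
    fn on_success(&mut self) {
        self.limit = (self.limit + 1.0).min(self.max_limit);
    }

    /// Multiplicative decrease: cut the limit on errors or timeouts.
    fn on_failure(&mut self) {
        self.limit = (self.limit * self.backoff_ratio).max(1.0);
    }
}
```

Because the decrease is multiplicative while the increase is additive, the limit backs off quickly under failure and recovers gradually, the same stability property TCP Reno relies on.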
### Gradient2
Gradient-based algorithm inspired by Netflix's concurrency-limits library. Compares long-term (exponentially smoothed) RTT against short-term RTT to detect queueing. A configurable tolerance ratio allows moderate latency increases without reducing the limit, making it more robust to natural variance than Vegas.
```rust
// Type and module names are illustrative; see the crate docs for the exact API.
use tower_acc::Gradient2Layer;

let layer = Gradient2Layer::new();
```
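The long-vs-short RTT comparison can be sketched standalone. The field names (`smoothing`, `tolerance`) and the exact update formula here are assumptions for illustration, loosely following the shape of Netflix's Gradient2, not the crate's real implementation:

```rust
/// Standalone sketch of a Gradient2-style update; illustrative only.
struct Gradient2 {
    limit: f64,
    long_rtt: f64,  // exponentially smoothed long-term RTT (ms)
    smoothing: f64, // e.g. 0.2: weight of each new sample in long_rtt
    tolerance: f64, // e.g. 1.5: allow short RTT up to 1.5x the long RTT
}

impl Gradient2 {
    fn on_sample(&mut self, short_rtt: f64, queue_size: f64) {
        // Update the long-term (smoothed) RTT estimate.
        self.long_rtt = self.long_rtt * (1.0 - self.smoothing) + short_rtt * self.smoothing;
        // Gradient below 1 means the short-term RTT exceeds the tolerated
        // long-term RTT, i.e. queueing is building up.
        let gradient = (self.tolerance * self.long_rtt / short_rtt).clamp(0.5, 1.0);
        // Shrink (or hold) the limit, plus headroom for a small queue.
        self.limit = self.limit * gradient + queue_size;
    }
}
```

The tolerance ratio is what makes this more forgiving than Vegas: short-term RTT can rise moderately above the long-term baseline before the gradient drops below 1 and the limit starts shrinking.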
### Vegas
Inspired by the TCP Vegas congestion control scheme. Tracks the minimum observed RTT (the "no-load" baseline) and estimates queue depth from the ratio of current RTT to baseline:
- Estimate queue depth: `limit × (1 − rtt_noload / rtt)`.
- If the queue is short (below `alpha`), increase the limit.
- If the queue is long (above `beta`), decrease the limit.
- On errors, decrease immediately.
- Periodically probe: reset the baseline to track changing conditions.
```rust
// Type and module names are illustrative; see the crate docs for the exact API.
use tower_acc::VegasLayer;

let layer = VegasLayer::new();
```
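The queue-depth estimate and the alpha/beta decision above can be sketched standalone. The types and thresholds here are illustrative, not the crate's actual API (the periodic baseline probe is omitted for brevity):

```rust
/// Standalone sketch of a Vegas-style decision; illustrative only.
struct Vegas {
    limit: f64,
    rtt_noload: f64, // minimum observed RTT: the "no-load" baseline
    alpha: f64,      // queue depth below which we grow the limit
    beta: f64,       // queue depth above which we shrink the limit
}

impl Vegas {
    fn on_sample(&mut self, rtt: f64) {
        // Track the no-load baseline as the minimum RTT seen so far.
        self.rtt_noload = self.rtt_noload.min(rtt);
        // Estimated number of in-flight requests queued rather than served:
        // when rtt == rtt_noload, the estimate is zero.
        let queue = self.limit * (1.0 - self.rtt_noload / rtt);
        if queue < self.alpha {
            self.limit += 1.0; // little queueing: probe for more capacity
        } else if queue > self.beta {
            self.limit -= 1.0; // queue building up: back off
        }
        // Between alpha and beta the limit is left unchanged.
    }
}
```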
## Usage

### As a Tower layer
```rust
use tower::ServiceBuilder;
// The layer name is illustrative; see the crate docs for the exact API.
use tower_acc::AimdLayer;

let service = ServiceBuilder::new()
    .layer(AimdLayer::new())
    .service(inner);
```
### Wrapping a service directly
```rust
// The wrapper name is illustrative; see the crate docs for the exact API.
use tower_acc::Acc;

let service = Acc::new(inner);
```
## Custom algorithms

Implement the `Algorithm` trait to bring your own strategy:
```rust
use std::time::Duration;
use tower_acc::Algorithm;

// ...implement Algorithm for your type.
```
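As a self-contained illustration, here is what such a trait and a trivial implementation might look like. A stand-in trait is defined locally because the real trait's method names and signatures may differ; check the crate docs for the actual interface:

```rust
use std::time::Duration;

/// Stand-in for an Algorithm-style trait; the method name and
/// signature are illustrative, not tower-acc's actual definition.
trait Algorithm {
    /// Called with each observed round-trip time and outcome;
    /// returns the new concurrency limit.
    fn update(&mut self, rtt: Duration, ok: bool) -> usize;
}

/// A trivial strategy that always returns a fixed limit,
/// ignoring latency and outcome.
struct Fixed(usize);

impl Algorithm for Fixed {
    fn update(&mut self, _rtt: Duration, _ok: bool) -> usize {
        self.0
    }
}
```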
## Simulator

The `tower-acc-sim` crate provides an interactive web-based
simulator for exploring how the algorithms behave under changing server
conditions. See the simulator README for details.
## Inspiration
This crate is a Rust/Tower port of the ideas from Netflix's concurrency-limits library and the accompanying blog post Performance Under Load. The core insight — applying TCP congestion control theory to request-level concurrency — comes directly from that work.
## License
Licensed under the Apache License, Version 2.0. See LICENSE for details.