request-shadow
Async request mirroring with sampling, divergence detection, and structured response diffs. The SRE primitive for migrations: send the same request to the production service AND a candidate, compare the responses, return the production one to the client while you collect divergence telemetry.
Why a small crate
Every service mesh has a knob for traffic shadowing — Linkerd, Istio, AWS App Mesh. They're great when you own the mesh. They're useless when the migration is in-process: binary library swap, codec change, JSON-vs-protobuf swap, ORM cutover.
This crate gives you the same shape as a 30-line Tokio task:
- Backend trait — abstracts the call. Implement once per transport.
- Shadower — fires both legs concurrently, returns the primary record + an optional divergence.
- Divergence — typed diff: status / headers / body each get their own bool + summary.
- Sampling — sticky on the input bytes (SHA-256 mod 100). The same input always gets the same yes/no for a given rate.
- Timeout for the shadow leg only — never blocks the primary call.
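The shape above can be sketched without any dependencies. This is a rough illustration only: it swaps Tokio for a plain thread and a channel so it compiles with std alone, and Record, the synchronous Backend trait, Fixed, and shadow_call are hypothetical stand-ins rather than this crate's API. It assumes the shadow deadline is measured from the moment both legs start.

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for the crate's ResponseRecord.
#[derive(Clone, Debug, PartialEq)]
struct Record {
    status: u16,
    body: Vec<u8>,
}

/// Hypothetical synchronous stand-in for the crate's async Backend trait.
trait Backend: Send + Sync + 'static {
    fn call(&self, input: &[u8]) -> Result<Record, String>;
}

/// Test double: sleeps for `delay`, then returns a fixed response.
struct Fixed {
    status: u16,
    body: &'static [u8],
    delay: Duration,
}

impl Backend for Fixed {
    fn call(&self, _input: &[u8]) -> Result<Record, String> {
        thread::sleep(self.delay);
        Ok(Record { status: self.status, body: self.body.to_vec() })
    }
}

type Leg = Result<Record, String>;

/// Runs both legs concurrently. The primary result is always returned;
/// the shadow result is `None` if it missed its deadline.
fn shadow_call(
    primary: &dyn Backend,
    shadow: Arc<dyn Backend>,
    input: &[u8],
    shadow_timeout: Duration,
) -> (Leg, Option<Leg>) {
    let started = Instant::now();
    let (tx, rx) = mpsc::channel();
    let owned_input = input.to_vec();
    // The shadow leg gets its own thread so it overlaps with the primary.
    thread::spawn(move || {
        // The receiver may already be gone after a timeout; ignore that.
        let _ = tx.send(shadow.call(&owned_input));
    });
    // The primary leg runs with no timeout applied to it.
    let primary_result = primary.call(input);
    // Only the shadow leg is bounded, counted from when both legs started.
    let remaining = shadow_timeout.saturating_sub(started.elapsed());
    let shadow_result = rx.recv_timeout(remaining).ok();
    (primary_result, shadow_result)
}
```

A shadow that misses its deadline costs the caller at most the remaining timeout and never changes the primary result. In an async version the same idea would bound only the shadow future.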
Pieces
| Type | Purpose |
|---|---|
| Backend | async fn call(&self, input: &[u8]) -> Result<ResponseRecord, _>. Implement over reqwest::Client, your gRPC client, or anything else. |
| ResponseRecord | Backend output: ok, status, sorted headers, opaque body. |
| ShadowConfig | Sampling rate, shadow timeout, list of fields to ignore in the diff. |
| Shadower | The composer. Cheap to clone (both backends are Arc). |
| ShadowOutcome | What Shadower::call returns: primary, optional shadow, optional divergence, plus reason flags. |
| Divergence | status: Option<(u16, u16)>, headers: Option<HeaderDiff>, body: Option<BodyDiff>. Each piece is None when that aspect matches or was ignored. |
| DivergenceLog | Bounded ring buffer of recent divergences for operator inspection. |
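To make the last two rows concrete, here is a minimal std-only sketch of a typed diff and a bounded log. Only the three Option fields on Divergence come from the table. The contents of HeaderDiff and BodyDiff, the compare signature, treating the ignore list as header names, and the VecDeque-backed log are all assumptions made for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical, simplified stand-in for the crate's ResponseRecord.
struct Record {
    status: u16,
    headers: Vec<(String, String)>,
    body: Vec<u8>,
}

/// Assumed shape: names of headers that differ or exist on one side only.
#[derive(Debug, PartialEq)]
struct HeaderDiff {
    differing: Vec<String>,
}

/// Assumed shape: both lengths plus the offset of the first differing byte.
#[derive(Debug, PartialEq)]
struct BodyDiff {
    primary_len: usize,
    shadow_len: usize,
    first_mismatch: usize,
}

/// Each field is None when that aspect matches or was ignored.
#[derive(Debug, Default, PartialEq)]
struct Divergence {
    status: Option<(u16, u16)>,
    headers: Option<HeaderDiff>,
    body: Option<BodyDiff>,
}

/// Lowercases header names and drops the ignored ones.
fn header_map(r: &Record, ignore: &[&str]) -> BTreeMap<String, String> {
    r.headers
        .iter()
        .map(|(k, v)| (k.to_ascii_lowercase(), v.clone()))
        .filter(|(k, _)| !ignore.iter().any(|i| i.eq_ignore_ascii_case(k)))
        .collect()
}

/// Returns None when nothing diverges after applying the ignore list.
fn compare(p: &Record, s: &Record, ignore_headers: &[&str]) -> Option<Divergence> {
    let (pm, sm) = (header_map(p, ignore_headers), header_map(s, ignore_headers));
    let names: BTreeSet<&String> = pm.keys().chain(sm.keys()).collect();
    let differing: Vec<String> = names
        .into_iter()
        .filter(|k| pm.get(*k) != sm.get(*k))
        .cloned()
        .collect();
    // If one body is a prefix of the other, the mismatch is at the shorter length.
    let first_mismatch = p.body.iter().zip(&s.body).position(|(a, b)| a != b)
        .unwrap_or(p.body.len().min(s.body.len()));
    let d = Divergence {
        status: (p.status != s.status).then_some((p.status, s.status)),
        headers: (!differing.is_empty()).then_some(HeaderDiff { differing }),
        body: (p.body != s.body).then_some(BodyDiff {
            primary_len: p.body.len(),
            shadow_len: s.body.len(),
            first_mismatch,
        }),
    };
    (d != Divergence::default()).then_some(d)
}

/// Bounded log: once full, the oldest entry is dropped to admit the newest.
struct DivergenceLog {
    cap: usize,
    entries: VecDeque<Divergence>,
}

impl DivergenceLog {
    fn new(cap: usize) -> Self {
        Self { cap, entries: VecDeque::with_capacity(cap) }
    }

    fn push(&mut self, d: Divergence) {
        if self.cap == 0 {
            return;
        }
        if self.entries.len() == self.cap {
            self.entries.pop_front();
        }
        self.entries.push_back(d);
    }
}
```

Keeping each aspect in its own Option lets callers alert on status changes separately from noisy header churn, and the fixed capacity keeps the log's memory use flat however many divergences arrive.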
Sampling
Set sample_rate(N) to mirror N% of requests. Bucketing is sticky over the input bytes:
use ShadowConfig;
let cfg = full_sample.sample_rate;
assert_eq!;
Same key, same answer. Deterministic. No RNG dep.
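For intuition, sticky bucketing can be sketched in a few lines. This version is an assumption-laden stand-in: it uses a hand-rolled FNV-1a hash instead of SHA-256 so it needs no dependencies, and the function names and the "bucket < rate" rule are hypothetical, not taken from this crate.

```rust
/// FNV-1a, 64-bit. A std-only stand-in for the SHA-256 digest named above.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Maps the input bytes to a bucket in 0..100.
fn bucket(input: &[u8]) -> u8 {
    (fnv1a64(input) % 100) as u8
}

/// True when the input falls inside the sampled share of buckets.
/// Rates above 100 are treated as 100.
fn should_sample(input: &[u8], rate_percent: u8) -> bool {
    bucket(input) < rate_percent.min(100)
}
```

Because the bucket depends only on the input bytes, raising the rate only ever adds inputs to the sampled set; an input mirrored at 10% is still mirrored at 25%.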
Composes with
- reliability-toolkit-rs — wrap the shadow Backend in a CircuitBreaker so a flaky candidate never bleeds into the primary path.
- slo-budget-tracker — record every divergence against an SLO so you can answer "is the candidate good enough to promote?"
- feature-flag-rs — flip the sampling rate from a remote config push without redeploying.
Run the example
Builds a primary "v1" backend and a "v2" candidate, fires three requests, prints the primary body + the structured divergence each time.
Bench
The bundled bench times Divergence::compare on a 4KB equal body so you can spot regressions in the diff path.
Tests
CI matrix: stable, beta, 1.86.0 (MSRV). Eleven async tests cover identical responses, body/status/header divergence, ignore-fields, sampling at 0%, sticky sampling, timeout handling, shadow-backend failures, and the divergence log.
License
MIT. See LICENSE.