Expand description
Connection retry, backoff, and timeouts (PRD §5.8.7, M18).
Three pieces:
RetryPolicy— caller-tunable knobs: attempt count, base / factor / cap on exponential backoff, max wall-clock window, per-attempt connect timeout. Mirrors OpenSSH’sConnectionAttempts+ConnectTimeoutsemantics with a Gitway-specific cap on total elapsed time.classify/Disposition— FR-82’s transient-vs-fatal error classifier. Network noise (ECONNREFUSED, ETIMEDOUT, EHOSTUNREACH, DNS NXDOMAIN) isRetry; everything else (auth failure, host-key mismatch, protocol error, signing error) isFatal.run— the loop driver. Calls the supplied async op, sleeps with jittered exponential backoff between attempts, captures aRetryAttempthistory for FR-83’s--test --jsonenvelope, and emits atracing::warn!event atcrate::log::CAT_RETRYper failed attempt.
§Trust model
run is timeout-agnostic — its job is the loop + classifier +
jitter + history. The per-attempt tokio::time::timeout wrap
lives at the call site (currently session.rs::connect) so the
same loop driver can be reused for non-network operations
(agent reconnects, key-load retries) without forcing every
caller to think about timeouts.
§Why russh-handshake failures are NOT retried
Once the TCP socket is up, any failure is either a fatal
user-input error (auth rejected, host-key mismatch) or an
in-flight protocol error mid-handshake. Re-driving an in-flight
handshake is unsafe (the server may have already consumed our
key-exchange contribution) and the failure modes are server-side
— surfacing them clearly is more useful than silently retrying.
classify returns Fatal for every russh::Error variant
for this reason.
Structs§
- Retry
Attempt - One failed attempt’s record, captured during
runfor surfacing viacrate::session::AnvilSession::retry_historyandgitway --test --json’sdata.retry_attemptsenvelope. - Retry
Policy - Caller-tunable retry knobs (PRD §5.8.7 FR-80, FR-81).
Enums§
- Disposition
- What
runshould do with anAnvilErrorfrom a single attempt.
Functions§
- classify
- Classifies an
AnvilErroras transient or fatal per FR-82. - run
- Drives the retry loop for the supplied async operation.