Qubit Retry
Qubit Retry is a type-preserving retry toolkit for Rust synchronous and asynchronous operations.
The core API is Retry<E>. A retry policy is bound only to the operation error type E; each run or run_async call introduces its own success type T.
Overview
Qubit Retry is designed for applications that need explicit, observable retry behavior around fallible work. It supports synchronous operations, Tokio-based async operations, and blocking work isolated on worker threads. Policies are configured through a builder or optional qubit-config integration, while lifecycle hooks expose each attempt, failure, retry decision, terminal error, and successful completion.
Use this crate when you need typed retry errors, bounded elapsed-time budgets, retry-after hints, panic-aware worker execution, or retry callbacks that can be implemented as closures or reusable function objects.
Features
- Synchronous retry works without optional features.
- Tokio-backed async retry supports true per-attempt timeouts.
- Blocking operations can use run_in_worker for thread-isolated execution, panic capture, timeout waiting, and cooperative cancellation.
- Optional qubit-config integration reads retry settings from configuration.
- Retry callbacks are stored as rs-function functors, so both closures and custom function objects are supported.
- AttemptFailure<E> represents one failed attempt: Error(E), Timeout, Panic(AttemptPanic), or Executor(AttemptExecutorError).
- RetryError<E> represents the terminal retry-flow error and carries reason, last_failure, and RetryContext.
- Separate elapsed budgets distinguish user operation time from total retry-flow time.
- Lifecycle hooks are explicit: before_attempt, on_success, on_failure, on_retry, and on_error.
Installation
[dependencies]
qubit-retry = "0.9.0"
Enable optional integrations as needed:
[dependencies]
qubit-retry = { version = "0.9.0", features = ["tokio", "config"] }
Optional features:
- tokio: enables Retry::run_async and per-attempt async timeouts through tokio::time::timeout.
- config: enables RetryOptions::from_config and RetryConfigValues for reading settings from qubit-config.
The default feature set is empty, so synchronous retry does not pull in tokio or qubit-config.
Basic Sync Retry
use qubit_retry::Retry;
use std::time::Duration;
Failure Decisions
By default, operation errors are retried until the configured attempt or elapsed-time limits stop the flow. Use retry_if_error for simple error predicates:
use qubit_retry::{Retry, /* ... */};
use std::time::Duration;
let retry = Retry::<E>::builder()
    .max_attempts(/* ... */)
    .exponential_backoff(/* ... */)
    .retry_if_error(/* ... */)
    .build()?;
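To make the role of an error predicate concrete, here is a minimal standalone sketch of a retry loop that consults a predicate before each retry. It uses only the standard library and is not the crate's implementation; `retry_while` and its parameters are hypothetical names chosen for illustration.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical helper, not part of qubit-retry: runs `op` up to
/// `max_attempts` times and retries only while `should_retry` accepts the error.
fn retry_while<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
    should_retry: impl Fn(&E) -> bool,
) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Stop on a non-retryable error or when attempts are exhausted.
            Err(error) if attempt >= max_attempts || !should_retry(&error) => {
                return Err(error);
            }
            Err(_) => {
                // Fixed delay between attempts, for simplicity.
                thread::sleep(delay);
                attempt += 1;
            }
        }
    }
}
```

The predicate only decides whether an error is worth retrying; the attempt limit still ends the loop even when every error is retryable.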
Use on_failure when a decision needs the failure kind, attempt timeout, retry-after hint, or any other RetryContext value:
use qubit_retry::{Retry, /* ... */};
use std::time::Duration;
let retry = Retry::<E>::builder()
    .max_attempts(/* ... */)
    .fixed_delay(/* ... */)
    .on_failure(/* ... */)
    .build()?;
AttemptFailureDecision::UseDefault hands control back to the retry policy, which then applies the configured limits, delay strategy, jitter, and optional retry-after hint.
Async Retry and Timeout
Async execution requires the tokio feature. Per-attempt timeouts are stored in RetryOptions through the builder. When an attempt times out, the executor reports AttemptFailure::Timeout, and listeners can inspect the configured timeout through RetryContext::attempt_timeout(). Operation panics still unwind through the current async task; run_async() does not convert them to AttemptFailure::Panic.
use qubit_retry::Retry;
use std::time::Duration;
async
async
Plain run() keeps normal same-thread synchronous execution. It is the lowest-overhead path and works well for short, high-frequency operations such as CAS loops. run() does not support configured per-attempt timeouts: it returns RetryErrorReason::UnsupportedOperation when attempt_timeout is set. Use run_async() for cancellable async futures, or run_in_worker() when blocking work must run on a worker thread.
Elapsed Budgets
Retry elapsed budgets are measured with monotonic Instant time, not wall-clock time:
- max_operation_elapsed: cumulative time spent executing user operation attempts. Retry sleeps, retry-after sleeps, and listener time are excluded.
- max_total_elapsed: total retry-flow time. Operation attempts, retry sleeps, retry-after sleeps, retry hint extraction, on_before_attempt, on_failure, and on_retry time are included.
Terminal listeners keep notification semantics. on_success and on_error can add caller-visible latency, but they do not turn an already successful operation into a retry failure.
Async and worker-thread attempts use the shortest of configured attempt_timeout, remaining max_operation_elapsed, and remaining max_total_elapsed as the effective attempt timeout. If the selected retry or retry-after delay would consume the remaining max_total_elapsed budget, the flow fails with RetryErrorReason::MaxTotalElapsedExceeded before sleeping. Retry sleeps are not truncated.
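The two rules above can be sketched as pure functions. This is a standalone illustration using only the standard library; `effective_attempt_timeout` and `delay_fits_total_budget` are hypothetical names, and treating a delay exactly equal to the remaining budget as "consuming" it is an assumption.

```rust
use std::time::Duration;

/// Hypothetical helper: the shortest of the limits that are actually set,
/// or `None` when no limit applies.
fn effective_attempt_timeout(
    attempt_timeout: Option<Duration>,
    remaining_operation: Option<Duration>,
    remaining_total: Option<Duration>,
) -> Option<Duration> {
    [attempt_timeout, remaining_operation, remaining_total]
        .iter()
        .flatten()
        .copied()
        .min()
}

/// Hypothetical helper: a delay is accepted only if it leaves budget to spare.
/// A delay that would use up the remaining budget is rejected up front
/// rather than shortened.
fn delay_fits_total_budget(delay: Duration, remaining_total: Option<Duration>) -> bool {
    match remaining_total {
        Some(remaining) => delay < remaining,
        None => true,
    }
}
```

Rejecting the delay instead of truncating it keeps the configured backoff shape intact: the flow either sleeps for the full selected delay or stops.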
Worker-Thread Retry
run_in_worker() runs every attempt on a worker thread. Without an attempt timeout, the caller waits for the worker result and worker panics are captured as AttemptFailure::Panic. Worker-spawn failures are reported as AttemptFailure::Executor. With an attempt timeout, the retry executor stops waiting when the timeout expires, marks the attempt token as cancelled, and waits up to worker_cancel_grace (default 100ms) for the worker to exit before applying the configured AttemptTimeoutPolicy.
Rust cannot safely kill a running thread, so a timed-out worker may keep running unless the operation checks the token and returns. If the worker is still running after the cancellation grace period, the retry flow stops with RetryErrorReason::WorkerStillRunning instead of starting another worker; RetryContext::unreaped_worker_count() reports the unreaped worker count. Use this path for blocking IO, third-party calls, code that may panic, or work that needs per-attempt timeout isolation. Prefer plain run() for low-latency in-memory work.
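use qubit_retry::{Retry, /* ... */};
use std::time::Duration;
let retry = Retry::<E>::builder()
    .max_attempts(/* ... */)
    .fixed_delay(/* ... */)
    .attempt_timeout(/* ... */)
    .worker_cancel_grace(/* ... */)
    .abort_on_timeout()
    .build()?;
let response = retry.run_in_worker(/* ... */)?;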
run_blocking_with_timeout() remains available as a compatibility alias for run_in_worker().
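The timeout, cancellation, and grace-period sequence described in this section can be approximated with the standard library alone. The sketch below is not how qubit-retry implements run_in_worker(); `WorkerOutcome`, `run_with_timeout`, and the use of an `AtomicBool` as the cancellation token are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical outcome type for this sketch only.
#[derive(Debug, PartialEq)]
enum WorkerOutcome<T> {
    Finished(T),
    Panicked,
    TimedOut { worker_exited: bool },
}

/// Runs `work` on a worker thread, waits up to `timeout`, then sets the
/// cancellation flag and waits up to `grace` for the worker to exit.
fn run_with_timeout<T: Send + 'static>(
    timeout: Duration,
    grace: Duration,
    work: impl FnOnce(Arc<AtomicBool>) -> T + Send + 'static,
) -> WorkerOutcome<T> {
    let cancelled = Arc::new(AtomicBool::new(false));
    let token = Arc::clone(&cancelled);
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // If `work` panics, `sender` is dropped without sending anything.
        let _ = sender.send(work(token));
    });
    match receiver.recv_timeout(timeout) {
        Ok(value) => WorkerOutcome::Finished(value),
        Err(mpsc::RecvTimeoutError::Disconnected) => WorkerOutcome::Panicked,
        Err(mpsc::RecvTimeoutError::Timeout) => {
            // The thread cannot be killed; it can only be asked to stop.
            cancelled.store(true, Ordering::SeqCst);
            // A late result or a disconnect within the grace period both
            // mean the worker has exited.
            let worker_exited = !matches!(
                receiver.recv_timeout(grace),
                Err(mpsc::RecvTimeoutError::Timeout)
            );
            WorkerOutcome::TimedOut { worker_exited }
        }
    }
}
```

A worker that never checks the flag would yield `worker_exited: false`, which corresponds to the situation where the retry flow stops instead of starting another worker.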
Retry-After Hints
If an attempt failure carries retry-after information, register a hint extractor with retry_after_hint. The extractor returns Option<Duration>: Some(delay) means "wait this long before the next retry", while None means "no hint is available". When all failure listeners return UseDefault, the default policy waits for the hinted delay if the extractor returned Some(delay), and falls back to the configured delay strategy if it returned None.
use qubit_retry::{Retry, /* ... */};
use std::time::Duration;
let retry = Retry::<E>::builder()
    .max_attempts(/* ... */)
    .fixed_delay(/* ... */)
    .retry_after_hint(/* ... */)
    .build()?;
When the hint depends only on the operation error, retry_after_from_error provides a shorter wrapper around retry_after_hint:
let retry = Retry::<E>::builder()
    .max_attempts(/* ... */)
    .fixed_delay(/* ... */)
    .retry_after_from_error(/* ... */)
    .build()?;
Listeners can also read the extracted value from RetryContext::retry_after_hint().
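As a standalone illustration of the Some/None contract, the sketch below extracts a hint from an error and falls back to a configured delay. `ServiceError`, `retry_after_hint`, and `next_delay` are hypothetical names defined here; they are not the crate's types or signatures.

```rust
use std::time::Duration;

/// Hypothetical error type that may carry a server-provided delay.
#[derive(Debug)]
enum ServiceError {
    RateLimited { retry_after: Option<Duration> },
    Unavailable,
}

/// Hint extractor: `Some(delay)` when the failure carries a hint, else `None`.
fn retry_after_hint(error: &ServiceError) -> Option<Duration> {
    match error {
        ServiceError::RateLimited { retry_after } => *retry_after,
        ServiceError::Unavailable => None,
    }
}

/// Default-policy choice: prefer the hint, otherwise use the configured delay.
fn next_delay(error: &ServiceError, configured: Duration) -> Duration {
    retry_after_hint(error).unwrap_or(configured)
}
```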
Listeners
Listeners are lifecycle hooks, not a separate policy system:
- before_attempt: invoked before the operation runs for each attempt, including the first. Use it to mark the start of attempt N; the attempt has not started yet, so this is not the “we failed and are about to back off” moment.
- on_success: invoked after each successful attempt.
- on_failure: invoked after each AttemptFailure and returns an AttemptFailureDecision. It runs before the inter-attempt delay is chosen and before on_retry, and can influence abort vs retry and how the policy picks the next delay.
- on_retry: invoked only when a failed attempt will be retried, after the on_failure decisions have been merged and the delay has been selected, but before the executor sleeps and before the next before_attempt. It is observational and cannot change the backoff or the retry decision; RetryContext::next_delay() returns the sleep duration. If the flow will not retry (attempts or time budget exhausted, listener abort, etc.), on_retry is not called.
- on_error: invoked once when the retry flow returns a terminal RetryError.
before_attempt vs on_retry in one line: before_attempt fires at the start of an attempt; on_retry fires right after a failure once a retry is scheduled and the next delay is known, but before the sleep and the next attempt.
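use qubit_retry::{Retry, /* ... */};
let retry = Retry::<E>::builder()
    .max_attempts(/* ... */)
    .before_attempt(/* ... */)
    .on_success(/* ... */)
    .on_failure(/* ... */)
    .on_retry(/* ... */)
    .on_error(/* ... */)
    .build()?;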
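The hook ordering described in this section can be traced with a small standalone model. This sketch records hook names in a log instead of calling real listeners and does not sleep between attempts; `trace_hooks` is a hypothetical function, and only attempt exhaustion is modelled as a terminal condition.

```rust
/// Runs `op` up to `max_attempts` times and records which hook would fire.
/// Hypothetical sketch: hooks are modelled as log entries, delays are skipped.
fn trace_hooks(
    max_attempts: u32,
    mut op: impl FnMut(u32) -> Result<(), String>,
) -> Vec<String> {
    let mut log = Vec::new();
    for attempt in 1..=max_attempts {
        log.push(format!("before_attempt {attempt}"));
        match op(attempt) {
            Ok(()) => {
                log.push(format!("on_success {attempt}"));
                return log;
            }
            Err(_) => {
                log.push(format!("on_failure {attempt}"));
                if attempt < max_attempts {
                    // A retry is scheduled, so on_retry fires before the sleep.
                    log.push(format!("on_retry {attempt}"));
                }
            }
        }
    }
    // Attempts are exhausted: the flow ends with a terminal error.
    log.push("on_error".to_string());
    log
}
```

In this model the final failed attempt produces on_failure followed by on_error, with no on_retry in between, because no retry is scheduled.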
Configuration
RetryOptions is an immutable configuration snapshot. Reading from qubit-config requires the config feature and happens during construction.
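use qubit_config::Config;
use qubit_retry::{Retry, RetryOptions};
let mut config = Config::new();
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
config.set(/* ... */)?;
let options = RetryOptions::from_config(/* ... */)?;
let retry = Retry::<E>::from_options(/* ... */)?;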
Supported relative keys:
- max_attempts
- max_operation_elapsed_millis
- max_operation_elapsed_unlimited
- max_total_elapsed_millis
- max_total_elapsed_unlimited
- attempt_timeout_millis
- attempt_timeout_policy: retry or abort
- worker_cancel_grace_millis
- delay: none, fixed, random, exponential, or exponential_backoff
- fixed_delay_millis
- random_min_delay_millis
- random_max_delay_millis
- exponential_initial_delay_millis
- exponential_max_delay_millis
- exponential_multiplier
- jitter_factor
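To show how the three exponential keys might interact, here is a standalone sketch of a capped exponential delay. The formula (initial delay times multiplier raised to the number of prior failures, capped at the maximum) is a common convention and an assumption here, not a statement of the crate's exact behavior; `exponential_delay` is a hypothetical function and jitter is left out.

```rust
use std::time::Duration;

/// Hypothetical helper: delay before the retry that follows failed attempt
/// `attempt` (1-based), growing by `multiplier` and capped at `max`.
/// Assumes `multiplier` is a finite value of at least 1.0.
fn exponential_delay(initial: Duration, max: Duration, multiplier: f64, attempt: u32) -> Duration {
    let exponent = attempt.saturating_sub(1) as i32;
    let scaled = initial.as_secs_f64() * multiplier.powi(exponent);
    // Cap in floating point first so very large values never reach Duration.
    Duration::from_secs_f64(scaled.min(max.as_secs_f64()))
}
```

With an initial delay of 100 ms, a multiplier of 2.0, and a 2 s cap, the first three delays under this formula are 100 ms, 200 ms, and 400 ms, and later delays stay at 2 s.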
Error Handling
Use RetryError::reason(), RetryError::last_failure(), and RetryError::context() to distinguish the terminal cause from the last failed attempt:
use qubit_retry::{Retry, /* ... */};
let retry = Retry::<E>::builder()
    .max_attempts(/* ... */)
    .build()?;
match retry.run(/* ... */) { /* ... */ }
Documentation
- API documentation: docs.rs/qubit-retry
- Crate package: crates.io/crates/qubit-retry
- Source repository: github.com/qubit-ltd/rs-retry
- Coverage guide: COVERAGE.md
Testing
A minimal local run:
cargo test
To mirror what continuous integration enforces, run the repository scripts from the project root:
./align-ci.sh formats code and applies local Clippy fixes so the branch follows CI rules. ./ci-check.sh runs the CI-equivalent pipeline, including formatting checks, Clippy with warnings denied, debug and release builds, all-feature tests, rustdoc with warnings denied, JSON coverage threshold checks, and the security audit. ./coverage.sh generates coverage reports; use ./coverage.sh help for output formats such as HTML, text, LCOV, JSON, Cobertura, or all formats.
Contributing
Issues and pull requests are welcome.
- Open an issue for bug reports, design questions, or larger feature proposals when it helps align on direction.
- Keep pull requests scoped to one behavior change, fix, or documentation update when practical.
- Code contributions must run ./align-ci.sh, pass ./ci-check.sh, and review coverage with ./coverage.sh before submission.
- Add or update tests when you change runtime behavior.
- Update this README or public rustdoc when user-visible API behavior changes.
By contributing, you agree to license your contributions under the Apache License, Version 2.0, the same license as this project.
License
Copyright © 2026 Haixing Hu, Qubit Co. Ltd.
This project is licensed under the Apache License, Version 2.0. See the LICENSE file in the repository for the full text.
Author
Haixing Hu — Qubit Co. Ltd.
| Repository | github.com/qubit-ltd/rs-retry |
| Documentation | docs.rs/qubit-retry |
| Crate | crates.io/crates/qubit-retry |