# tower-resilience
[](https://crates.io/crates/tower-resilience)
[](https://docs.rs/tower-resilience)
[](LICENSE-MIT)
[](https://www.rust-lang.org)
A comprehensive resilience and fault-tolerance toolkit for [Tower](https://github.com/tower-rs/tower) services, inspired by [Resilience4j](https://resilience4j.readme.io/).
## About
Tower-resilience provides composable middleware for building robust distributed systems in Rust. [Tower](https://docs.rs/tower) is a library of modular and reusable components for building robust networking clients and servers. This crate extends Tower with resilience patterns commonly needed in production systems.
Inspired by [Resilience4j](https://resilience4j.readme.io/), a fault tolerance library for Java, tower-resilience adapts these battle-tested patterns to Rust's async ecosystem and Tower's middleware model.
## Resilience Patterns
- **Circuit Breaker** - Prevents cascading failures by stopping calls to failing services
- **Bulkhead** - Isolates resources to prevent system-wide failures
- **Time Limiter** - Advanced timeout handling with cancellation support
- **Retry** - Intelligent retry with exponential backoff and jitter
- **Rate Limiter** - Controls request rate to protect services
- **Cache** - Response memoization to reduce load
## Features
- **Composable** - Stack multiple resilience patterns using Tower's ServiceBuilder
- **Observable** - Event system for monitoring pattern behavior (retries, state changes, etc.)
- **Configurable** - Builder APIs with sensible defaults
- **Async-first** - Built on tokio for async Rust applications
- **Zero-cost abstractions** - Minimal overhead when patterns aren't triggered
## Quick Start
```toml
[dependencies]
tower-resilience = "0.1"
tower = "0.5"
```
```rust
use tower::ServiceBuilder;
use tower_resilience::prelude::*;
let service = ServiceBuilder::new()
.layer(CircuitBreakerLayer::builder()
.failure_rate_threshold(0.5)
.build())
.layer(BulkheadLayer::builder()
.max_concurrent_calls(10)
.build())
.service(my_service);
```
## Examples
### Circuit Breaker
Prevent cascading failures by opening the circuit when error rate exceeds threshold:
```rust
use tower_resilience_circuitbreaker::CircuitBreakerLayer;
let layer = CircuitBreakerLayer::builder()
.failure_rate_threshold(0.5) // Open at 50% failure rate
.sliding_window_size(100) // Track last 100 calls
.build();
```
See [examples/circuitbreaker.rs](examples/circuitbreaker.rs) for a complete example.
### Bulkhead
Limit concurrent requests to prevent resource exhaustion:
```rust
use tower_resilience_bulkhead::BulkheadLayer;
let layer = BulkheadLayer::builder()
.max_concurrent_calls(10)
.wait_timeout(Duration::from_secs(5))
.build();
```
See [examples/bulkhead.rs](examples/bulkhead.rs) for a complete example.
### Time Limiter
Enforce timeouts on operations:
```rust
use tower_resilience_timelimiter::TimeLimiterConfig;
let layer = TimeLimiterConfig::builder()
.timeout_duration(Duration::from_secs(30))
.cancel_running_future(true)
.build();
```
### Retry
Retry failed requests with exponential backoff:
```rust
use tower_resilience_retry::RetryConfig;
let layer = RetryConfig::<MyError>::builder()
.max_attempts(5)
.exponential_backoff(Duration::from_millis(100))
.build();
```
### Rate Limiter
Control request rate to protect downstream services:
```rust
use tower_resilience_ratelimiter::RateLimiterConfig;
let layer = RateLimiterConfig::builder()
.max_permits(100)
.refresh_period(Duration::from_secs(1))
.build();
```
### Cache
Cache responses to reduce load on expensive operations:
```rust
use tower_resilience_cache::CacheConfig;
let layer = CacheConfig::builder()
.max_size(1000)
.ttl(Duration::from_secs(300))
.key_extractor(|req: &Request| req.id.clone())
.build();
```
## Error Handling
### Zero-Boilerplate with ResilienceError
When composing multiple resilience layers, use `ResilienceError<E>` to eliminate manual error conversion code:
```rust
use tower_resilience_core::ResilienceError;
// Your application error
#[derive(Debug)]
enum AppError {
DatabaseDown,
InvalidRequest,
}
// That's it! No From implementations needed
type ServiceError = ResilienceError<AppError>;
// All resilience layer errors automatically convert
let service = ServiceBuilder::new()
.layer(timeout_layer)
.layer(circuit_breaker)
.layer(bulkhead)
.service(my_service);
```
**Benefits:**
- Zero boilerplate - no `From` trait implementations
- Rich error context (layer names, counts, durations)
- Convenient helpers: `is_timeout()`, `is_rate_limited()`, etc.
See the [Layer Composition Guide](https://docs.rs/tower-resilience) for details.
### Manual Error Handling
For specific use cases, you can still implement custom error types with manual `From` conversions. See examples for both approaches.
## Pattern Composition
Stack multiple patterns for comprehensive resilience:
```rust
use tower::ServiceBuilder;
// Client-side: timeout -> circuit breaker -> retry
let client = ServiceBuilder::new()
.layer(timeout_layer)
.layer(circuit_breaker_layer)
.layer(retry_layer)
.service(http_client);
// Server-side: rate limit -> bulkhead -> timeout
let server = ServiceBuilder::new()
.layer(rate_limiter_layer)
.layer(bulkhead_layer)
.layer(timeout_layer)
.service(handler);
```
## Performance
Benchmarks measure the overhead of each pattern in the happy path (no failures, circuit closed, permits available):
| Baseline (no middleware) | ~10 ns | 1.0x |
| Retry (no retries) | ~80-100 ns | ~8-10x |
| Time Limiter | ~107 ns | ~10x |
| Rate Limiter | ~124 ns | ~12x |
| Bulkhead | ~162 ns | ~16x |
| Cache (hit) | ~250 ns | ~25x |
| Circuit Breaker (closed) | ~298 ns | ~29x |
| Circuit Breaker + Bulkhead | ~413 ns | ~40x |
**Key Takeaways:**
- All patterns add < 300ns overhead individually
- Overhead is additive when composing patterns
- Even the heaviest pattern (circuit breaker) is negligible for most use cases
- Retry and time limiter are the lightest weight options
Run benchmarks yourself:
```bash
cargo bench --bench happy_path_overhead
```
## Documentation
- [API Documentation](https://docs.rs/tower-resilience)
- [Pattern Guides](https://docs.rs/tower-resilience) - In-depth guides on when and how to use each pattern
### Examples
Two sets of examples are provided:
- **[Top-level examples](examples/)** - Simple, getting-started examples matching this README (one per pattern)
- **Module examples** - Detailed examples in each crate's `examples/` directory showing advanced features
Run top-level examples with:
```bash
cargo run --example circuitbreaker
cargo run --example bulkhead
cargo run --example retry
# etc.
```
## Why tower-resilience?
Tower provides some built-in resilience (timeout, retry, rate limiting), but tower-resilience offers:
- **Circuit Breaker** - Not available in Tower
- **Advanced retry** - More backoff strategies and better control
- **Bulkhead** - True resource isolation with async-aware semaphores
- **Unified events** - Consistent observability across all patterns
- **Builder APIs** - Ergonomic configuration with sensible defaults
- **Production-ready** - Patterns inspired by battle-tested Resilience4j
## Minimum Supported Rust Version (MSRV)
This crate's MSRV is **1.64.0**, matching [Tower's MSRV policy](https://github.com/tower-rs/tower).
We follow Tower's approach:
- MSRV bumps are not considered breaking changes
- When increasing MSRV, the new version must have been released at least 6 months ago
- MSRV is tested in CI to prevent unintentional increases
## License
Licensed under either of:
- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
## Contributing
Contributions are welcome! Please see the [contributing guidelines](CONTRIBUTING.md) for more information.