tower-resilience-core 0.3.0

Core infrastructure for tower-resilience: events, metrics, and shared utilities

tower-resilience

A comprehensive resilience and fault-tolerance toolkit for Tower services, inspired by Resilience4j.

About

tower-resilience provides composable middleware for building robust distributed systems in Rust. It builds on Tower, a library of modular and reusable components for building robust networking clients and servers, and adds the resilience patterns commonly needed in production systems.

Inspired by Resilience4j, a fault tolerance library for Java, tower-resilience adapts these battle-tested patterns to Rust's async ecosystem and Tower's middleware model.

Resilience Patterns

  • Circuit Breaker - Prevents cascading failures by stopping calls to failing services
  • Bulkhead - Isolates resources to prevent system-wide failures
  • Time Limiter - Advanced timeout handling with cancellation support
  • Retry - Intelligent retry with exponential backoff and jitter
  • Rate Limiter - Controls request rate to protect services
  • Cache - Response memoization to reduce load

Features

  • Composable - Stack multiple resilience patterns using Tower's ServiceBuilder
  • Observable - Event system for monitoring pattern behavior (retries, state changes, etc.)
  • Configurable - Builder APIs with sensible defaults
  • Async-first - Built on tokio for async Rust applications
  • Zero-cost abstractions - Minimal overhead when patterns aren't triggered

Quick Start

Add the dependencies to Cargo.toml:

[dependencies]
tower-resilience = "0.1"
tower = "0.5"

Then compose layers around a service:

use tower::ServiceBuilder;
use tower_resilience::prelude::*;

let service = ServiceBuilder::new()
    .layer(CircuitBreakerLayer::builder()
        .failure_rate_threshold(0.5)
        .build())
    .layer(BulkheadLayer::builder()
        .max_concurrent_calls(10)
        .build())
    .service(my_service);

Examples

Circuit Breaker

Prevent cascading failures by opening the circuit when error rate exceeds threshold:

use tower_resilience_circuitbreaker::CircuitBreakerLayer;

let layer = CircuitBreakerLayer::builder()
    .failure_rate_threshold(0.5)  // Open at 50% failure rate
    .sliding_window_size(100)      // Track last 100 calls
    .build();

See examples/circuitbreaker.rs for a complete example.
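
To make the two settings above concrete, here is a minimal std-only sketch of a count-based sliding window. It is not the crate's implementation: the `FailureWindow` type and its methods are hypothetical, and it assumes the circuit opens only once the window is full and the failure rate is at or above the threshold.

```rust
use std::collections::VecDeque;

/// Hypothetical count-based window over the most recent call outcomes.
struct FailureWindow {
    outcomes: VecDeque<bool>, // true = the call failed
    capacity: usize,          // plays the role of sliding_window_size
    threshold: f64,           // plays the role of failure_rate_threshold
}

impl FailureWindow {
    fn new(capacity: usize, threshold: f64) -> Self {
        Self { outcomes: VecDeque::with_capacity(capacity), capacity, threshold }
    }

    /// Record one outcome, evicting the oldest once the window is full.
    fn record(&mut self, failed: bool) {
        if self.outcomes.len() == self.capacity {
            self.outcomes.pop_front();
        }
        self.outcomes.push_back(failed);
    }

    /// Fraction of recorded calls that failed (0.0 for an empty window).
    fn failure_rate(&self) -> f64 {
        if self.outcomes.is_empty() {
            return 0.0;
        }
        let failures = self.outcomes.iter().filter(|&&f| f).count();
        failures as f64 / self.outcomes.len() as f64
    }

    /// Assumption: open only when the window is full and the rate reaches the threshold.
    fn should_open(&self) -> bool {
        self.outcomes.len() == self.capacity && self.failure_rate() >= self.threshold
    }
}
```

With a window of 4 and a threshold of 0.5, two failures out of four calls trip the breaker; once an old failure slides out of the window, the rate drops and `should_open` returns false again.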

Bulkhead

Limit concurrent requests to prevent resource exhaustion:

use std::time::Duration;
use tower_resilience_bulkhead::BulkheadLayer;

let layer = BulkheadLayer::builder()
    .max_concurrent_calls(10)
    .wait_timeout(Duration::from_secs(5))
    .build();

See examples/bulkhead.rs for a complete example.
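
The idea behind a concurrency limit can be sketched with a single atomic counter. This is an illustration only, not how the crate is implemented: `Permits` is a hypothetical type, and unlike the `wait_timeout` setting above it rejects immediately instead of waiting for a free slot.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical permit counter that caps the number of in-flight calls.
struct Permits {
    in_flight: AtomicUsize,
    max: usize, // plays the role of max_concurrent_calls
}

impl Permits {
    fn new(max: usize) -> Self {
        Self { in_flight: AtomicUsize::new(0), max }
    }

    /// Try to take a permit; returns false when the limit is already reached.
    fn try_acquire(&self) -> bool {
        let mut current = self.in_flight.load(Ordering::Acquire);
        loop {
            if current >= self.max {
                return false;
            }
            // Retry the increment if another thread changed the counter first.
            match self.in_flight.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => current = actual,
            }
        }
    }

    /// Give a permit back; call once for every successful try_acquire.
    fn release(&self) {
        self.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}
```

A real bulkhead would normally tie `release` to a guard's `Drop` so a permit cannot leak when a call fails or is cancelled.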

Time Limiter

Enforce timeouts on operations:

use std::time::Duration;
use tower_resilience_timelimiter::TimeLimiterConfig;

let layer = TimeLimiterConfig::builder()
    .timeout_duration(Duration::from_secs(30))
    .cancel_running_future(true)
    .build();
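
As a rough analogy for what a time limit does, the following std-only sketch runs blocking work on a thread and gives up after a deadline. The function `run_with_timeout` is hypothetical and is not part of the crate. A thread cannot be cancelled from outside, so here the work keeps running after the deadline; an async implementation can instead drop the pending future.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `work` on a separate thread and wait at most `timeout` for its result.
/// Returns None if the deadline passes first.
fn run_with_timeout<T, F>(work: F, timeout: Duration) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The send fails if the caller has already given up; that is fine.
        let _ = tx.send(work());
    });
    rx.recv_timeout(timeout).ok()
}
```

The caller only learns whether a result arrived in time; handling the late result, or stopping the work early, is left out of this sketch.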

Retry

Retry failed requests with exponential backoff:

use std::time::Duration;
use tower_resilience_retry::RetryConfig;

let layer = RetryConfig::<MyError>::builder()
    .max_attempts(5)
    .exponential_backoff(Duration::from_millis(100))
    .build();
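
The delay schedule that exponential backoff produces can be worked out by hand. The sketch below assumes a multiplier of 2, a 1-based retry number, and a caller-chosen cap, and it leaves out jitter; `backoff_delay` is a hypothetical helper, not the crate's API.

```rust
use std::time::Duration;

/// Delay before retry number `retry` (1-based), doubling each time and
/// never exceeding `max`.
fn backoff_delay(initial: Duration, retry: u32, max: Duration) -> Duration {
    // 2^(retry - 1), saturating instead of overflowing for large retry counts.
    let factor = 2u32.saturating_pow(retry.saturating_sub(1));
    initial.checked_mul(factor).unwrap_or(max).min(max)
}
```

Starting from 100 ms, the first four retries wait 100 ms, 200 ms, 400 ms, and 800 ms. With a cap of one second, every later retry waits exactly one second.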

Rate Limiter

Control request rate to protect downstream services:

use std::time::Duration;
use tower_resilience_ratelimiter::RateLimiterConfig;

let layer = RateLimiterConfig::builder()
    .max_permits(100)
    .refresh_period(Duration::from_secs(1))
    .build();
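
One way to read "100 permits per 1-second refresh period" is as a fixed-window counter, sketched below with std only. Whether the crate uses a fixed window, a token bucket, or something else is not stated here, so treat `FixedWindowLimiter` as a hypothetical illustration. The current time is passed in explicitly to keep the logic easy to test.

```rust
use std::time::{Duration, Instant};

/// Hypothetical fixed-window limiter: at most `max_permits` per `refresh_period`.
struct FixedWindowLimiter {
    max_permits: u32,
    refresh_period: Duration,
    window_start: Instant,
    used: u32,
}

impl FixedWindowLimiter {
    fn new(max_permits: u32, refresh_period: Duration, now: Instant) -> Self {
        Self { max_permits, refresh_period, window_start: now, used: 0 }
    }

    /// Take one permit if any are left in the current window.
    fn try_acquire(&mut self, now: Instant) -> bool {
        // Start a fresh window once the refresh period has elapsed.
        if now.duration_since(self.window_start) >= self.refresh_period {
            self.window_start = now;
            self.used = 0;
        }
        if self.used < self.max_permits {
            self.used += 1;
            true
        } else {
            false
        }
    }
}
```

A fixed window allows short bursts at window boundaries (up to twice the limit across two adjacent windows), which is one reason other designs exist.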

Cache

Cache responses to reduce load on expensive operations:

use std::time::Duration;
use tower_resilience_cache::CacheConfig;

let layer = CacheConfig::builder()
    .max_size(1000)
    .ttl(Duration::from_secs(300))
    .key_extractor(|req: &Request| req.id.clone())
    .build();
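
The effect of a TTL on cached responses can be sketched with a plain `HashMap`. This is a simplified stand-in, not the crate's cache: `TtlCache` is a hypothetical type, keys are plain strings (standing in for whatever the key extractor returns), and the `max_size` eviction from the configuration above is left out.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical cache whose entries expire `ttl` after insertion.
struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (Instant, V)>,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Store a value together with the time it was inserted.
    fn insert(&mut self, key: &str, value: V, now: Instant) {
        self.entries.insert(key.to_string(), (now, value));
    }

    /// Return a fresh value, or None (removing the entry) if it is missing or expired.
    fn get(&mut self, key: &str, now: Instant) -> Option<V> {
        let fresh = match self.entries.get(key) {
            Some((stored_at, value)) if now.duration_since(*stored_at) < self.ttl => {
                Some(value.clone())
            }
            _ => None,
        };
        if fresh.is_none() {
            self.entries.remove(key);
        }
        fresh
    }
}
```

On a miss, the caller would invoke the wrapped service and `insert` the response; that step is omitted here.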

Error Handling

Zero-Boilerplate with ResilienceError

When composing multiple resilience layers, use ResilienceError<E> to eliminate manual error conversion code:

use tower::ServiceBuilder;
use tower_resilience_core::ResilienceError;

// Your application error
#[derive(Debug)]
enum AppError {
    DatabaseDown,
    InvalidRequest,
}

// That's it! No From implementations needed
type ServiceError = ResilienceError<AppError>;

// All resilience layer errors automatically convert
let service = ServiceBuilder::new()
    .layer(timeout_layer)
    .layer(circuit_breaker)
    .layer(bulkhead)
    .service(my_service);

Benefits:

  • Zero boilerplate - no From trait implementations
  • Rich error context (layer names, counts, durations)
  • Convenient helpers: is_timeout(), is_rate_limited(), etc.

See the Layer Composition Guide for details.
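
The general shape of such a wrapper can be sketched as a generic enum with one variant per layer failure and one for the application error. The `LayeredError` type below is hypothetical: its variants, fields, and the `application` helper are assumptions made for illustration and do not describe the real `ResilienceError<E>` definition.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a unified error type; not the crate's definition.
#[derive(Debug, PartialEq)]
enum LayeredError<E> {
    Timeout { layer: &'static str, after: Duration },
    CircuitOpen { layer: &'static str },
    RateLimited { layer: &'static str },
    Application(E),
}

impl<E> LayeredError<E> {
    /// True when a time limit fired.
    fn is_timeout(&self) -> bool {
        matches!(self, LayeredError::Timeout { .. })
    }

    /// True when a rate limiter rejected the call.
    fn is_rate_limited(&self) -> bool {
        matches!(self, LayeredError::RateLimited { .. })
    }

    /// The wrapped application error, if that is what failed.
    fn application(&self) -> Option<&E> {
        match self {
            LayeredError::Application(e) => Some(e),
            _ => None,
        }
    }
}
```

Because the application error is a type parameter, one definition serves every service, which is where the saving over hand-written conversions comes from.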

Manual Error Handling

For specific use cases, you can still implement custom error types with manual From conversions. See examples for both approaches.
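
For comparison, the manual approach looks roughly like the following. The layer error types `TimeoutElapsed` and `CircuitOpen` and the function names are hypothetical placeholders, not types exported by the crate. The point is that each layer error needs its own `From` implementation before `?` can convert it.

```rust
// Hypothetical per-layer error types, standing in for whatever each layer returns.
#[derive(Debug)]
struct TimeoutElapsed;
#[derive(Debug)]
struct CircuitOpen;

/// Application-defined error that every layer error must convert into.
#[derive(Debug, PartialEq)]
enum ServiceError {
    Timeout,
    CircuitOpen,
    DatabaseDown,
}

impl From<TimeoutElapsed> for ServiceError {
    fn from(_: TimeoutElapsed) -> Self {
        ServiceError::Timeout
    }
}

impl From<CircuitOpen> for ServiceError {
    fn from(_: CircuitOpen) -> Self {
        ServiceError::CircuitOpen
    }
}

/// Pretend inner call that may hit the time limit.
fn call_with_time_limit(simulate_timeout: bool) -> Result<u32, TimeoutElapsed> {
    if simulate_timeout { Err(TimeoutElapsed) } else { Ok(7) }
}

/// The `?` operator relies on the From impls above to convert the error.
fn handle(simulate_timeout: bool) -> Result<u32, ServiceError> {
    let value = call_with_time_limit(simulate_timeout)?;
    Ok(value)
}
```

Every additional layer adds another `From` implementation, which is the boilerplate the unified error type is meant to remove.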

Pattern Composition

Stack multiple patterns for comprehensive resilience:

use tower::ServiceBuilder;

// Client-side: timeout -> circuit breaker -> retry
let client = ServiceBuilder::new()
    .layer(timeout_layer)
    .layer(circuit_breaker_layer)
    .layer(retry_layer)
    .service(http_client);

// Server-side: rate limit -> bulkhead -> timeout
let server = ServiceBuilder::new()
    .layer(rate_limiter_layer)
    .layer(bulkhead_layer)
    .layer(timeout_layer)
    .service(handler);
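
With Tower's ServiceBuilder, the first layer added is the outermost: it sees the request first and the response last. The std-only model below mimics that ordering with boxed closures. `Handler`, `wrap`, and `build` are hypothetical names, and nothing here uses Tower's actual traits.

```rust
/// A handler takes a request and appends to a log showing who ran when.
type Handler = Box<dyn Fn(&str, &mut Vec<String>) -> String>;

/// Wrap `inner` in a named layer that logs before and after the inner call.
fn wrap(name: &'static str, inner: Handler) -> Handler {
    Box::new(move |req: &str, log: &mut Vec<String>| -> String {
        log.push(format!("{name}: request"));
        let resp = inner(req, log);
        log.push(format!("{name}: response"));
        resp
    })
}

/// Stack layers so that the first one listed ends up outermost.
fn build(layers: &[&'static str], service: Handler) -> Handler {
    // Wrap from the last layer inward so the first layer is applied last.
    layers.iter().rev().fold(service, |inner, name| wrap(*name, inner))
}
```

Building with `["timeout", "circuit_breaker", "retry"]` logs the request passing through timeout, then circuit breaker, then retry, then the service, with responses unwinding in reverse. In the client stack above this means the timeout bounds the whole call, including every retry attempt.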

Performance

Benchmarks measure the overhead of each pattern in the happy path (no failures, circuit closed, permits available):

Pattern                       Overhead      vs Baseline
Baseline (no middleware)      ~10 ns        1.0x
Retry (no retries)            ~80-100 ns    ~8-10x
Time Limiter                  ~107 ns       ~10x
Rate Limiter                  ~124 ns       ~12x
Bulkhead                      ~162 ns       ~16x
Cache (hit)                   ~250 ns       ~25x
Circuit Breaker (closed)      ~298 ns       ~29x
Circuit Breaker + Bulkhead    ~413 ns       ~40x

Key Takeaways:

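  • Each pattern on its own measures under 300 ns per call
  • Overhead is roughly additive when composing patterns: circuit breaker + bulkhead measures ~413 ns, against ~460 ns for the two individual figures summed
  • Even the heaviest pattern (circuit breaker, ~298 ns) stays well under a microsecond per call
  • Retry and time limiter are the lightest-weight options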
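
To put these figures in proportion, the arithmetic below compares a measured overhead with the total time of a request. The 1 ms request duration is an assumed example, not a benchmark result, and both helper functions are hypothetical.

```rust
/// Middleware overhead as a percentage of the total request time.
fn overhead_share_percent(overhead_ns: f64, request_ns: f64) -> f64 {
    overhead_ns / request_ns * 100.0
}

/// How far the sum of individual measurements is from a combined measurement.
fn additivity_gap_ns(parts_ns: &[f64], measured_ns: f64) -> f64 {
    parts_ns.iter().sum::<f64>() - measured_ns
}
```

For a request that takes 1 ms (1,000,000 ns), the ~298 ns circuit breaker figure is about 0.03% of the total. The gap between the summed circuit breaker and bulkhead figures (460 ns) and the combined measurement (413 ns) is 47 ns.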

Run benchmarks yourself:

cargo bench --bench happy_path_overhead

Documentation

Examples

Two sets of examples are provided:

  • Top-level examples - Simple, getting-started examples matching this README (one per pattern)
  • Module examples - Detailed examples in each crate's examples/ directory showing advanced features

Run top-level examples with:

cargo run --example circuitbreaker
cargo run --example bulkhead
cargo run --example retry
# etc.

Why tower-resilience?

Tower provides some built-in resilience (timeout, retry, rate limiting), but tower-resilience offers:

  • Circuit Breaker - Not available in Tower
  • Advanced retry - More backoff strategies and better control
  • Bulkhead - True resource isolation with async-aware semaphores
  • Unified events - Consistent observability across all patterns
  • Builder APIs - Ergonomic configuration with sensible defaults
  • Production-ready - Patterns inspired by battle-tested Resilience4j

Minimum Supported Rust Version (MSRV)

This crate's MSRV is 1.64.0, matching Tower's MSRV policy.

We follow Tower's approach:

  • MSRV bumps are not considered breaking changes
  • When increasing MSRV, the new version must have been released at least 6 months ago
  • MSRV is tested in CI to prevent unintentional increases

License

Licensed under either of:

at your option.

Contributing

Contributions are welcome! Please see the contributing guidelines for more information.