async_openai/middleware/mod.rs
//! Tower based middlewares
//!
//! Enable the `middleware` feature to customize the HTTP execution path with
//! Tower services and layers.
//!
//! The middleware boundary is intentionally below the API groups and above the
//! concrete HTTP transport. An example middleware stack:
//!
//! ```text
//!      async-openai API groups
//!  responses(), chat(), files(), ...
//!                |
//!                v
//!        HttpRequestFactory
//!                |
//!                v
//! +----- concurrency_limit ------+
//! | +------- timeout ----------+ |
//! | | +-- OpenAIRetryLayer --+ | |
//! | | |                      | | |
//! | | |  ReqwestService or   | | |
//! | | |    custom service    | | |
//! | | |                      | | |
//! | | +-- OpenAIRetryLayer --+ | |
//! | +------- timeout ----------+ |
//! +----- concurrency_limit ------+
//!                |
//!                v
//!        reqwest::Response
//! ```
//!
//! The request value passed through Tower is [`HttpRequestFactory`], not
//! `reqwest::Request`. This is deliberate: `reqwest::Request` is not generally
//! cloneable once it contains a streaming body, but retry middleware needs a
//! way to replay a request. The factory is cheap to clone and rebuilds a fresh
//! `reqwest::Request` for each attempt.
//!
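//! A minimal sketch of why this matters for replay. It assumes only what the
//! examples in this module show: that the factory is `Clone` and that
//! `build()` produces a fresh `reqwest::Request` each time:
//!
//! ```no_run
//! # use async_openai::{error::OpenAIError, middleware::HttpRequestFactory};
//! async fn send_with_one_retry(
//!     factory: HttpRequestFactory,
//! ) -> Result<reqwest::Response, OpenAIError> {
//!     let client = reqwest::Client::new();
//!     // Each attempt rebuilds a fresh reqwest::Request from a factory clone,
//!     // so streaming bodies never need to be cloned.
//!     match client.execute(factory.clone().build().await?).await {
//!         Ok(response) => Ok(response),
//!         Err(_) => client
//!             .execute(factory.build().await?)
//!             .await
//!             .map_err(OpenAIError::Reqwest),
//!     }
//! }
//! ```
//!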
//! ## Use the Default `ReqwestService`
//!
//! [`ReqwestService`] is a Tower service backed by `reqwest::Client`.
//! It is used by default to make outbound HTTP requests.
//!
//! ```no_run
//! # use async_openai::{Client, config::OpenAIConfig};
//! # use async_openai::middleware::{retry::OpenAIRetryLayer, ReqwestService};
//! # use std::time::Duration;
//! let service = tower::ServiceBuilder::new()
//!     .concurrency_limit(8)
//!     .timeout(Duration::from_secs(30))
//!     .layer(OpenAIRetryLayer::default())
//!     .service(ReqwestService::new(reqwest::Client::new()));
//!
//! let client = Client::with_config(OpenAIConfig::default())
//!     .with_http_service(service);
//! ```
//!
//! ## Use a Custom Service
//!
//! You can replace [`ReqwestService`] entirely. This is useful for logging,
//! metrics, tests, mocks, alternate transports, or policy layers that want to
//! inspect the generated request before sending it.
//!
//! ```no_run
//! # use async_openai::{Client, config::OpenAIConfig, error::OpenAIError};
//! # use async_openai::middleware::HttpRequestFactory;
//! # use tower::service_fn;
//! let service = service_fn(|factory: HttpRequestFactory| async move {
//!     let request = factory.build().await?;
//!
//!     // Here you can inspect, modify, or log the request, route it
//!     // somewhere else, or return a synthetic response for testing.
//!     println!("sending {} {}", request.method(), request.url());
//!
//!     reqwest::Client::new()
//!         .execute(request)
//!         .await
//!         .map_err(OpenAIError::Reqwest)
//! });
//!
//! let client = Client::with_config(OpenAIConfig::default())
//!     .with_http_service(service);
//! ```
//!
//! ## Retry Layer
//!
//! [`retry::OpenAIRetryLayer`] is a Tower layer and [`retry::SimpleRetryPolicy`]
//! is a Tower retry policy.
//!
//! Both retry with exponential backoff on `429`, `5xx`, and connection errors,
//! and both respect the `Retry-After` header.
//!
//! The difference is that, upon seeing a `429`, `OpenAIRetryLayer` consumes the
//! response body to check whether it is a rate limit (a retryable error) or
//! insufficient quota (a permanent error). The default async-openai client uses
//! this layer internally for the library's default retry behavior.
//!
//! The retry boundary is [`HttpRequestFactory`]. Retrying clones the factory
//! and rebuilds a fresh `reqwest::Request` for each attempt instead of cloning
//! a built request. That matters because `reqwest::Request` is not `Clone`.
//!
//! [`retry::SimpleRetryPolicy`] uses [`retry::should_retry`] to determine
//! whether a request should be retried.
//!
//! Custom Tower retry policies can call [`retry::should_retry`] to reuse the
//! same retry classification while changing delay behavior.
//!
//! On native targets retries wait using `tokio::time::sleep`. On WASM retries
//! are immediate.
//!
//! ## Error Handling
//!
//! `OpenAIError::Boxed` is available only when the `middleware` feature is
//! enabled.
//!
//! Custom middleware services installed with `Client::with_http_service` may
//! use any error type that implements `Into<OpenAIError>`. This lets middleware
//! preserve structured errors when it has a dedicated `OpenAIError` conversion.
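//!
//! For example, a hypothetical middleware error type (the name and the variant
//! chosen for the conversion are illustrative, not prescribed by the library)
//! can opt into structured conversion like this:
//!
//! ```no_run
//! # use async_openai::error::OpenAIError;
//! // Hypothetical error a custom middleware service might produce.
//! #[derive(Debug)]
//! struct QuotaExceeded;
//!
//! impl From<QuotaExceeded> for OpenAIError {
//!     fn from(_err: QuotaExceeded) -> Self {
//!         // Map the custom error onto an existing OpenAIError variant.
//!         OpenAIError::InvalidArgument("quota exceeded".to_string())
//!     }
//! }
//! ```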
//!
//! Tower's `BoxError` converts into `OpenAIError::Boxed`, which is useful for
//! generic Tower layers whose concrete error type is erased. Callers can still
//! downcast the boxed error when they know the original error type.
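//!
//! For example, assuming `Boxed` carries Tower's `BoxError`, a timeout raised
//! by a `tower` timeout layer can be recovered by downcasting to
//! `tower::timeout::error::Elapsed`:
//!
//! ```no_run
//! # use async_openai::error::OpenAIError;
//! fn is_timeout(err: &OpenAIError) -> bool {
//!     match err {
//!         // Boxed holds an erased error; downcast to the concrete type.
//!         OpenAIError::Boxed(boxed) => boxed
//!             .downcast_ref::<tower::timeout::error::Elapsed>()
//!             .is_some(),
//!         _ => false,
//!     }
//! }
//! ```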
//!
//! ## Bring Your Own Types Interaction
//!
//! With the `byot` feature, generated `*_byot` methods keep the same minimal
//! trait bounds with or without middleware. JSON request bodies are serialized
//! before they enter the replayable middleware request factory; multipart
//! request bodies use the client-level replay bounds required by form handling.

/// Retry layers and policies for middleware.
pub mod retry {
    #[doc(inline)]
    pub use crate::retry::*;
}

pub use crate::executor::{HttpExecutor, HttpRequestFactory, ReqwestService};