1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
//! Tower based middlewares
//!
//! Enable the `middleware` feature to customize the HTTP execution path with
//! Tower services and layers.
//!
//! The middleware boundary is intentionally below the API groups and above the
//! concrete HTTP transport, an example middleware stack:
//!
//! ```text
//! async-openai API groups
//! responses(), chat(), files(), ...
//! |
//! v
//! HttpRequestFactory
//! |
//! v
//! +----- concurrency_limit ------+
//! | +------- timeout ----------+ |
//! | | +-- OpenAIRetryLayer --+ | |
//! | | | | | |
//! | | | ReqwestService or | | |
//! | | | custom service | | |
//! | | | | | |
//! | | +-- OpenAIRetryLayer --+ | |
//! | +------- timeout ----------+ |
//! +----- concurrency_limit ------+
//! |
//! v
//! reqwest::Response
//! ```
//!
//! The request value passed through tower is [`HttpRequestFactory`], not
//! `reqwest::Request`. This is deliberate: `reqwest::Request` is not generally
//! cloneable once it contains a streaming body, but retry middleware needs a
//! way to replay a request. The factory is cheap to clone and rebuilds a fresh
//! `reqwest::Request` for each attempt.
//!
//! ## Use the Default `ReqwestService`
//!
//! [`ReqwestService`] is a tower service backed by `reqwest::Client`.
//! It is used by default to make outbound HTTP requests.
//!
//! ```no_run
//! # use async_openai::{Client, config::OpenAIConfig};
//! # use async_openai::middleware::{retry::OpenAIRetryLayer, ReqwestService};
//! # use std::time::Duration;
//! let service = tower::ServiceBuilder::new()
//! .concurrency_limit(8)
//! .timeout(Duration::from_secs(30))
//! .layer(OpenAIRetryLayer::default())
//! .service(ReqwestService::new(reqwest::Client::new()));
//!
//! let client = Client::with_config(OpenAIConfig::default())
//! .with_http_service(service);
//! ```
//!
//! ## Use a Custom Service
//!
//! You can replace [`ReqwestService`] entirely. This is useful for logging,
//! metrics, tests, mocks, alternate transports, or policy layers that want to
//! inspect the generated request before sending it.
//!
//! ```no_run
//! # use async_openai::{Client, config::OpenAIConfig, error::OpenAIError};
//! # use async_openai::middleware::HttpRequestFactory;
//! # use tower::service_fn;
//! let service = service_fn(|factory: HttpRequestFactory| async move {
//! let request = factory.build().await?;
//!
//! // here you can inspect, modify, or log the request, route it somewhere else,
//! // or return a synthetic response for testing.
//!
//! println!("sending {} {}", request.method(), request.url());
//!
//! reqwest::Client::new()
//! .execute(request)
//! .await
//! .map_err(OpenAIError::Reqwest)
//! });
//!
//! let client = Client::with_config(OpenAIConfig::default())
//! .with_http_service(service);
//! ```
//!
//! ## Retry layer
//!
//!
//! [`retry::OpenAIRetryLayer`] is a Tower layer and [`retry::SimpleRetryPolicy`] is a Tower retry policy.
//!
//! Both attempt retries with exponential backoff on `429`, `5xx` and connection errors and respects `Retry-After` header.
//!
//! The difference is that upon seeing 429, `OpenAIRetryLayer` consumes response body to check if it is a rate
//! limit (retryable error) or insufficient quota (permanent error). The default async-openai client uses this layer internally
//! for library's default retry behavior.
//!
//!
//! The retry boundary is [`HttpRequestFactory`]. Retrying
//! clones the factory and rebuilds a fresh `reqwest::Request` for each attempt
//! instead of cloning a built request. That matters because `reqwest::Request` is not Clone.
//!
//! [`retry::SimpleRetryPolicy`] uses [`retry::should_retry`] to determine if a request should be retried.
//!
//! Custom tower retry policies can call [`retry::should_retry`] to reuse the same
//! retry classification while changing delay behavior.
//!
//! On native targets retries wait using `tokio::time::sleep`. On WASM retries are immediate.
//!
//! ## Native and WASM bounds
//!
//! The conceptual middleware boundary stays the same; only
//! the platform thread-safety bounds differ.
//!
//! On native targets, middleware services installed
//! with `Client::with_http_service` must be `Send + Sync + 'static` and return
//! `Send + 'static` futures.
//!
//! On WASM targets, middleware services and futures must be `'static`.
//!
//! ## Bring Your Own Types Interaction
//!
//! With the `byot` feature, generated `*_byot` methods keep minimal trait bounds.
//! When `middleware` feature is enabled additional [`MiddlewareInput`] bounds are added
//! based on native or WASM targets so the input can be stored long enough to
//! rebuild a fresh request for retries.
//!
//! ## Error Handling
//!
//! `OpenAIError::Boxed` is available only when the `middleware` feature is enabled.
//!
//! Custom middleware services installed with `Client::with_http_service` may use any error type that implements `Into<OpenAIError>`. This lets middleware preserve structured errors when it has a dedicated `OpenAIError` conversion.
//!
//! Tower's `BoxError` converts into `OpenAIError::Boxed`, which is useful for generic tower layers whose concrete error type is erased. Callers can still downcast the boxed error when they know the original error type.
//!
/// Retry layers and policies for middleware.
pub use crate;