//! Tower-based middleware
//!
//! Enable the `middleware` feature to customize the HTTP execution path with
//! Tower services and layers.
//!
//! The middleware boundary sits intentionally below the API groups and above
//! the concrete HTTP transport. An example middleware stack:
//!
//! ```text
//!      async-openai API groups
//! responses(), chat(), files(), ...
//!                |
//!                v
//!       HttpRequestFactory
//!                |
//!                v
//! +----- concurrency_limit ------+
//! | +------- timeout ----------+ |
//! | | +-- OpenAIRetryLayer --+ | |
//! | | |                      | | |
//! | | |  ReqwestService or   | | |
//! | | |  custom service      | | |
//! | | |                      | | |
//! | | +-- OpenAIRetryLayer --+ | |
//! | +------- timeout ----------+ |
//! +----- concurrency_limit ------+
//!                |
//!                v
//!        reqwest::Response
//! ```
//!
//! The request value passed through tower is [`HttpRequestFactory`], not
//! `reqwest::Request`. This is deliberate: `reqwest::Request` is not generally
//! cloneable once it contains a streaming body, but retry middleware needs a
//! way to replay a request. The factory is cheap to clone and rebuilds a fresh
//! `reqwest::Request` for each attempt.
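//!
//! A std-only sketch (hypothetical types, not this crate's API) of why a
//! cloneable factory is retry-friendly while a one-shot request is not:
//!
//! ```
//! #[derive(Clone)]
//! struct Factory {
//!     body: Vec<u8>,
//! }
//!
//! // A built request; a streaming body here would make it non-cloneable.
//! struct Built {
//!     body: Vec<u8>,
//! }
//!
//! impl Factory {
//!     // Each call produces a fresh, independent request.
//!     fn build(&self) -> Built {
//!         Built { body: self.body.clone() }
//!     }
//! }
//!
//! let factory = Factory { body: b"{}".to_vec() };
//! for _attempt in 1..=3 {
//!     let request = factory.build(); // rebuilt for every attempt
//!     assert_eq!(request.body, b"{}");
//! }
//! ```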
//!
//! ## Use the Default `ReqwestService`
//!
//! [`ReqwestService`] is a tower service backed by `reqwest::Client`.
//! It is used by default to make outbound HTTP requests.
//!
//! ```no_run
//! # use async_openai::{Client, config::OpenAIConfig};
//! # use async_openai::middleware::{retry::OpenAIRetryLayer, ReqwestService};
//! # use std::time::Duration;
//! let service = tower::ServiceBuilder::new()
//!     .concurrency_limit(8)
//!     .timeout(Duration::from_secs(30))
//!     .layer(OpenAIRetryLayer::default())
//!     .service(ReqwestService::new(reqwest::Client::new()));
//!
//! let client = Client::with_config(OpenAIConfig::default())
//!     .with_http_service(service);
//! ```
//!
//! ## Use a Custom Service
//!
//! You can replace [`ReqwestService`] entirely. This is useful for logging,
//! metrics, tests, mocks, alternate transports, or policy layers that want to
//! inspect the generated request before sending it.
//!
//! ```no_run
//! # use async_openai::{Client, config::OpenAIConfig, error::OpenAIError};
//! # use async_openai::middleware::HttpRequestFactory;
//! # use tower::service_fn;
//! let service = service_fn(|factory: HttpRequestFactory| async move {
//!     let request = factory.build().await?;
//!
//!     // Here you can inspect, modify, or log the request, route it
//!     // somewhere else, or return a synthetic response for testing.
//!     println!("sending {} {}", request.method(), request.url());
//!
//!     reqwest::Client::new()
//!         .execute(request)
//!         .await
//!         .map_err(OpenAIError::Reqwest)
//! });
//!
//! let client = Client::with_config(OpenAIConfig::default())
//!     .with_http_service(service);
//! ```
//!
//! ## Retry Layer
//!
//! [`retry::OpenAIRetryLayer`] is a Tower layer and [`retry::SimpleRetryPolicy`]
//! is a Tower retry policy.
//!
//! Both retry with exponential backoff on `429`, `5xx`, and connection errors,
//! and both respect the `Retry-After` header.
//!
//! The difference is that upon seeing a `429`, `OpenAIRetryLayer` consumes the
//! response body to check whether it indicates a rate limit (a retryable error)
//! or insufficient quota (a permanent error). The default async-openai client
//! uses this layer internally for the library's default retry behavior.
//!
//! The retry boundary is [`HttpRequestFactory`]: each attempt clones the
//! factory and rebuilds a fresh `reqwest::Request` instead of cloning a built
//! request, which matters because `reqwest::Request` is not `Clone`.
//!
//! [`retry::SimpleRetryPolicy`] uses [`retry::should_retry`] to determine
//! whether a request should be retried. Custom Tower retry policies can call
//! [`retry::should_retry`] to reuse the same retry classification while
//! changing the delay behavior.
//!
//! On native targets retries wait using `tokio::time::sleep`; on WASM retries
//! are immediate.
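//!
//! The retry classification and backoff described above can be sketched
//! std-only (hypothetical helpers; not this crate's actual `should_retry`):
//!
//! ```
//! use std::time::Duration;
//!
//! // Status-based classification: 429 and 5xx are retryable.
//! fn is_retryable_status(status: u16) -> bool {
//!     status == 429 || (500..600).contains(&status)
//! }
//!
//! // Exponential backoff; a server-provided Retry-After takes precedence.
//! fn retry_delay(attempt: u32, retry_after: Option<Duration>) -> Duration {
//!     if let Some(after) = retry_after {
//!         return after;
//!     }
//!     Duration::from_millis(500)
//!         .saturating_mul(1u32 << attempt.min(6))
//!         .min(Duration::from_secs(30))
//! }
//!
//! assert!(is_retryable_status(429) && is_retryable_status(503));
//! assert!(!is_retryable_status(404));
//! assert_eq!(retry_delay(2, None), Duration::from_secs(2));
//! assert_eq!(retry_delay(9, Some(Duration::from_secs(7))), Duration::from_secs(7));
//! ```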
//!
//! ## Error Handling
//!
//! `OpenAIError::Boxed` is available only when the `middleware` feature is
//! enabled.
//!
//! Custom middleware services installed with `Client::with_http_service` may
//! use any error type that implements `Into<OpenAIError>`. This lets
//! middleware preserve structured errors when it has a dedicated
//! `OpenAIError` conversion.
//!
//! Tower's `BoxError` converts into `OpenAIError::Boxed`, which is useful for
//! generic Tower layers whose concrete error type is erased. Callers can
//! still downcast the boxed error when they know the original error type.
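//!
//! A std-only sketch of downcasting a boxed error back to a concrete type
//! (`MyLayerError` is a hypothetical middleware error; Tower's `BoxError` is
//! `Box<dyn Error + Send + Sync>`):
//!
//! ```
//! use std::error::Error;
//! use std::fmt;
//!
//! #[derive(Debug)]
//! struct MyLayerError(&'static str);
//!
//! impl fmt::Display for MyLayerError {
//!     fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
//!         write!(f, "my layer error: {}", self.0)
//!     }
//! }
//!
//! impl Error for MyLayerError {}
//!
//! let boxed: Box<dyn Error + Send + Sync> = Box::new(MyLayerError("timeout"));
//!
//! // Callers that know the original type can recover it from the box.
//! let concrete = boxed.downcast_ref::<MyLayerError>().expect("known type");
//! assert_eq!(concrete.0, "timeout");
//! ```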
//!
//! ## Bring Your Own Types Interaction
//!
//! With the `byot` feature, generated `*_byot` methods keep the same minimal
//! trait bounds with or without middleware. JSON request bodies are serialized
//! before they enter the replayable middleware request factory.
/// Retry layers and policies for middleware.
pub mod retry;