inference-runtime-openai
OpenAI Chat Completions runtime + Azure OpenAI variant. Doc §10.3.
Implements the inference_core::ModelRunner contract over HTTP/2 (via reqwest).
SSE chunks are parsed by inference_remote_core::sse and lifted into
provider-typed deltas by the wire module.
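A minimal, self-contained sketch of the "SSE chunk to typed delta" step described above. The struct names (ChatChunk, ChunkChoice, ChunkDelta) are hypothetical stand-ins for the wire module's types, not the crate's actual definitions, and only the content field of a delta is modelled.

```rust
// Sketch only: hypothetical stand-ins for the `wire` types, modelling just the
// piece of a Chat Completions SSE chunk needed to extract a text delta.
use serde::Deserialize;

#[derive(Deserialize)]
struct ChatChunk {
    choices: Vec<ChunkChoice>,
}

#[derive(Deserialize)]
struct ChunkChoice {
    delta: ChunkDelta,
}

#[derive(Deserialize)]
struct ChunkDelta {
    #[serde(default)]
    content: Option<String>,
}

fn main() -> Result<(), serde_json::Error> {
    // One `data:` payload from a Chat Completions SSE stream, already split
    // out of the event framing by the SSE parser.
    let data = r#"{"choices":[{"delta":{"content":"Hel"},"index":0}]}"#;
    let chunk: ChatChunk = serde_json::from_str(data)?;
    let text: String = chunk
        .choices
        .iter()
        .filter_map(|c| c.delta.content.clone())
        .collect();
    assert_eq!(text, "Hel");
    Ok(())
}
```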
Re-exports
pub use config::OpenAiConfig;
pub use config::OpenAiVariant;
pub use cost::OpenAiPricing;
pub use error::classify_openai_error;
pub use runner::OpenAiRunner;
Modules
- config
  OpenAiConfig — parameters for the OpenAI / Azure OpenAI runtime.
- cost
  Per-million-token pricing for the OpenAI catalog. Values are public list
  pricing as of the doc snapshot; operators override via inference-cli flags.
  (A pricing sketch follows this list.)
- error
  OpenAI-specific error refinement. Layers on top of the generic
  classify_http_status from inference-remote-core to recognise
  provider-specific shapes that should not be retried (content filter
  refusals, context-length-exceeded). (A classification sketch follows this
  list.)
- runner
  OpenAiRunner — ModelRunner impl for OpenAI Chat Completions / Azure OpenAI.
- wire
  OpenAI Chat Completions wire types — request envelope + SSE chunk shape.
  Kept minimal: only what’s needed to round-trip the prompts / deltas / usage
  info that the actor system needs.
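To make the cost module's per-million-token convention concrete, here is a small arithmetic sketch under assumed prices; the Pricing struct, its fields, and the dollar values are placeholders, not the shipped OpenAiPricing catalog.

```rust
// Per-million-token prices in USD (hypothetical example values, not the
// crate's catalog; operators override the real catalog via the CLI).
struct Pricing {
    input_per_million: f64,
    output_per_million: f64,
}

fn request_cost_usd(pricing: &Pricing, input_tokens: u64, output_tokens: u64) -> f64 {
    input_tokens as f64 / 1_000_000.0 * pricing.input_per_million
        + output_tokens as f64 / 1_000_000.0 * pricing.output_per_million
}

fn main() {
    let pricing = Pricing { input_per_million: 2.50, output_per_million: 10.00 };
    // 12_000 prompt tokens and 800 completion tokens:
    // 0.012 * 2.50 + 0.0008 * 10.00 = 0.030 + 0.008 = 0.038 USD.
    let cost = request_cost_usd(&pricing, 12_000, 800);
    assert!((cost - 0.038).abs() < 1e-9);
}
```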
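The error module's layering can be pictured with the following self-contained sketch. Both functions are hypothetical stand-ins for classify_http_status (from inference-remote-core) and classify_openai_error, and the Classification enum is invented for the illustration; the real signatures and result types may differ. The point is only that provider-specific shapes override the generic, status-based verdict.

```rust
// Stand-in types and functions: a generic HTTP-status classification refined
// by provider-specific checks for shapes that must never be retried.
#[derive(Debug, PartialEq)]
enum Classification {
    Retryable, // e.g. 429 / 5xx from the generic classifier
    Permanent, // do not retry
}

// Stand-in for the generic `classify_http_status` in inference-remote-core.
fn classify_http_status(status: u16) -> Classification {
    match status {
        429 | 500..=599 => Classification::Retryable,
        _ => Classification::Permanent,
    }
}

// Stand-in for the provider-specific refinement in the `error` module.
fn classify_openai_error(status: u16, body: &str) -> Classification {
    // Content-filter refusals and context-length-exceeded are permanent
    // regardless of the HTTP status the provider happened to use.
    if body.contains("content_filter") || body.contains("context_length_exceeded") {
        return Classification::Permanent;
    }
    classify_http_status(status)
}

fn main() {
    // A 500 carrying a content-filter refusal is not retried.
    assert_eq!(
        classify_openai_error(500, r#"{"error":{"code":"content_filter"}}"#),
        Classification::Permanent
    );
    // A plain 429 remains retryable.
    assert_eq!(classify_openai_error(429, "{}"), Classification::Retryable);
}
```

Treating these shapes as permanent keeps the actor system from spending retries on requests that can never succeed.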