Crate inference_runtime_openai

§inference-runtime-openai

OpenAI Chat Completions runtime + Azure OpenAI variant. Doc §10.3.

Implements the inference_core::ModelRunner contract over HTTP/2 (via reqwest). SSE chunks are parsed by inference_remote_core::sse and lifted into provider-typed deltas by the wire module.
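
The transport path is standard enough to sketch in isolation. The snippet below is not the crate's code, only a minimal illustration of the HTTP + SSE flow it wraps: it assumes reqwest with the stream feature, plus serde_json and futures-util, and hand-rolls the event splitting that inference_remote_core::sse handles for real.

```rust
use futures_util::StreamExt;

// Minimal sketch of a streaming Chat Completions call. The real runner
// routes the response through inference_remote_core::sse and the wire
// types instead of this hand-rolled loop.
async fn stream_chat(api_key: &str) -> Result<(), reqwest::Error> {
    let body = serde_json::json!({
        "model": "gpt-4o-mini",
        "stream": true,
        "messages": [{ "role": "user", "content": "hello" }],
    });
    let resp = reqwest::Client::new()
        .post("https://api.openai.com/v1/chat/completions")
        .bearer_auth(api_key)
        .json(&body)
        .send()
        .await?;

    let mut buf: Vec<u8> = Vec::new();
    let mut stream = resp.bytes_stream(); // requires reqwest's `stream` feature
    while let Some(chunk) = stream.next().await {
        buf.extend_from_slice(&chunk?);
        // SSE events are separated by a blank line; each event carries one
        // `data: <json>` payload, and the stream ends with `data: [DONE]`.
        while let Some(pos) = buf.windows(2).position(|w| w == b"\n\n") {
            let event: Vec<u8> = buf.drain(..pos + 2).collect();
            for line in String::from_utf8_lossy(&event).lines() {
                match line.strip_prefix("data: ") {
                    Some("[DONE]") => return Ok(()),
                    Some(payload) => println!("chunk: {payload}"),
                    None => {}
                }
            }
        }
    }
    Ok(())
}
```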

Re-exports§

pub use config::OpenAiConfig;
pub use config::OpenAiVariant;
pub use cost::OpenAiPricing;
pub use error::classify_openai_error;
pub use runner::OpenAiRunner;
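
The re-exports cover typical call-site needs. The construction below is a hypothetical wiring: the field and constructor names are illustrative assumptions, not the crate's documented API.

```rust
use inference_runtime_openai::{OpenAiConfig, OpenAiRunner, OpenAiVariant};

// Hypothetical wiring of the re-exported types; field and constructor
// names are assumptions for illustration only.
fn build_runner(api_key: String) -> OpenAiRunner {
    let config = OpenAiConfig {
        variant: OpenAiVariant::OpenAi, // or an Azure variant
        api_key,
        model: "gpt-4o-mini".to_owned(),
        ..Default::default() // assumes OpenAiConfig: Default
    };
    OpenAiRunner::new(config) // assumed constructor
}
```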

Modules§

config
OpenAiConfig — parameters for the OpenAI / Azure OpenAI runtime.
cost
Per-million-token pricing for the OpenAI catalog. Values are public list pricing as of the doc snapshot; operators override via inference-cli flags. A worked cost calculation is sketched after this list.
error
OpenAI-specific error refinement. Layers on top of the generic classify_http_status from inference-remote-core to recognise provider-specific shapes that should not be retried (content filter refusals, context-length-exceeded). The layering is sketched after this list.
runner
OpenAiRunner — ModelRunner impl for OpenAI Chat Completions / Azure OpenAI.
wire
OpenAI Chat Completions wire types — request envelope + SSE chunk shape. Kept minimal: only what’s needed to round-trip the prompts / deltas / usage info that the actor system needs. A minimal serde sketch of these shapes follows the list.
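
To make the cost module's pricing model concrete, here is the per-million-token arithmetic; the struct is a stand-in, since OpenAiPricing's actual field names are not shown on this page.

```rust
// Stand-in for OpenAiPricing: USD list prices per one million tokens.
struct Pricing {
    input_per_mtok: f64,
    output_per_mtok: f64,
}

// cost = tokens / 1_000_000 * price, summed over input and output.
fn request_cost(p: &Pricing, input_tokens: u64, output_tokens: u64) -> f64 {
    input_tokens as f64 / 1_000_000.0 * p.input_per_mtok
        + output_tokens as f64 / 1_000_000.0 * p.output_per_mtok
}

// e.g. 12_000 input + 800 output tokens at $2.50 / $10.00 per MTok:
// 0.012 * 2.50 + 0.0008 * 10.00 = $0.038
```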
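
The error module's layering — refining a generic HTTP classification with provider shapes that must never be retried — can be sketched as follows. The enum and the generic classifier below stand in for inference-remote-core's real types, and refine_openai is only a hypothetical analogue of classify_openai_error.

```rust
#[derive(Debug, PartialEq)]
enum Classification {
    Retryable,
    Fatal,
}

// Stand-in for the generic classifier: retry on rate limits and server
// errors, fail fast otherwise.
fn classify_http_status(status: u16) -> Classification {
    match status {
        429 | 500..=599 => Classification::Retryable,
        _ => Classification::Fatal,
    }
}

// OpenAI-specific refinement: some response bodies mark permanent
// failures even when the bare status code would look retryable.
fn refine_openai(status: u16, body: &str) -> Classification {
    if body.contains("content_filter") || body.contains("context_length_exceeded") {
        return Classification::Fatal;
    }
    classify_http_status(status)
}
```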
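
Finally, the wire module's request envelope and SSE chunk shape can be pictured with minimal serde types; the fields below mirror the public Chat Completions schema rather than the crate's actual definitions.

```rust
use serde::{Deserialize, Serialize};

// Request envelope: just enough to ask for a streamed completion.
#[derive(Serialize)]
struct ChatRequest {
    model: String,
    stream: bool,
    messages: Vec<Message>,
}

#[derive(Serialize)]
struct Message {
    role: String, // "system" | "user" | "assistant"
    content: String,
}

// SSE chunk shape: incremental deltas plus optional usage totals.
#[derive(Deserialize)]
struct ChatChunk {
    choices: Vec<ChunkChoice>,
    usage: Option<Usage>,
}

#[derive(Deserialize)]
struct ChunkChoice {
    delta: Delta,
    finish_reason: Option<String>,
}

#[derive(Deserialize)]
struct Delta {
    content: Option<String>,
}

#[derive(Deserialize)]
struct Usage {
    prompt_tokens: u64,
    completion_tokens: u64,
}
```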