chat-completions
Generic OpenAI Chat Completions wire client for chat-rs. Targets the /v1/chat/completions wire spec — point it at any OAI-compatible server (vLLM, llama.cpp, LiteLLM, Together, Fireworks, your own gateway).
Install
[]
= "0.4.0"
= "0.2.4"
= { = "1", = ["macros", "rt-multi-thread"] }
Or via the umbrella crate: chat-rs = { version = "0.5.0", features = ["completions"] }.
Usage
use ChatCompletionsBuilder;
use ;
let client = new
.with_base_url
.with_model
.with_api_key // optional — omit for servers that don't require auth
.build;
let mut chat = new.with_model.build;
let mut msgs = from_user;
let response = chat.complete.await?;
Bring your own base URL, model name, and (optional) API key.
Capabilities
- Completions — text generation with tool calling and structured output
- Streaming — token-by-token output (requires
streamfeature) - Embeddings — vector embeddings (where the server supports
/embeddings)
When to Use This vs a Dedicated Wrapper
Use chat-completions directly when you control the endpoint or want maximum flexibility. Use a dedicated wrapper (chat-ollama, chat-huggingface, chat-cerebras, chat-deepseek) when you want preset URLs, env var conventions, and provider-specific niceties.
Custom Transport
Supply a custom transport via .with_transport() to use something other than the default HTTP (reqwest):
let client = new
.with_base_url
.with_model
.with_transport
.build;
Feature Flags
Streaming is gated on the stream feature:
= { = "0.2.3", = ["stream"] }