agentkit-provider-vllm
vLLM model adapter for the agentkit agent loop.
This crate provides `VllmAdapter` and `VllmConfig` for connecting the agent
loop to a vLLM server via its OpenAI-compatible chat completions endpoint.
It handles request translation and response normalization for vLLM-backed
sessions. Streaming is enabled by default; use `.with_streaming(false)` to
force the buffered response path.
An API key is optional: vLLM servers can run with or without authentication
(controlled by the `--api-key` flag when starting the server).
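A minimal sketch of what "optional" can mean on the client side, assuming the key is sent as a standard `Authorization: Bearer` header, as OpenAI-compatible servers expect. The `auth_header` helper is hypothetical and is not part of this crate's API; it only illustrates omitting the header when no key is configured.

```rust
/// Build the optional `Authorization` header for a request.
/// Hypothetical helper for illustration; not part of this crate's API.
fn auth_header(api_key: Option<&str>) -> Option<(&'static str, String)> {
    // Treat a missing or blank key as "no authentication": send no header.
    let key = api_key.map(str::trim).filter(|k| !k.is_empty())?;
    // Otherwise send the key as a bearer token.
    Some(("Authorization", format!("Bearer {key}")))
}
```

Returning `Option` keeps the unauthenticated path explicit: the caller adds the header only when one is produced.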
Configuration
Create a config with `VllmConfig::new(model)` and chain `.with_*()` builders for optional parameters. Alternatively, `VllmConfig::from_env()` reads from environment variables:
| Variable | Required | Default |
|---|---|---|
| `VLLM_MODEL` | yes | -- |
| `VLLM_BASE_URL` | no | `http://localhost:8000/v1/chat/completions` |
| `VLLM_API_KEY` | no | -- |
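The resolution rules in the table can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's implementation: `EnvSettings`, `resolve`, and `resolve_from_process_env` are hypothetical names, and the real `VllmConfig::from_env()` may differ in its error type and field layout.

```rust
use std::env;

/// Default endpoint, taken from the table above.
const DEFAULT_BASE_URL: &str = "http://localhost:8000/v1/chat/completions";

/// Hypothetical stand-in for the crate's config type; field names are assumed.
#[derive(Debug, PartialEq)]
struct EnvSettings {
    model: String,
    base_url: String,
    api_key: Option<String>,
}

/// Resolve settings through a lookup function so the logic can be exercised
/// without touching the real process environment.
fn resolve(lookup: impl Fn(&str) -> Option<String>) -> Result<EnvSettings, String> {
    // VLLM_MODEL is required: fail if it is absent.
    let model = lookup("VLLM_MODEL").ok_or_else(|| "VLLM_MODEL is not set".to_string())?;
    // VLLM_BASE_URL is optional and falls back to the documented default.
    let base_url = lookup("VLLM_BASE_URL").unwrap_or_else(|| DEFAULT_BASE_URL.to_string());
    // VLLM_API_KEY is optional and has no default.
    let api_key = lookup("VLLM_API_KEY");
    Ok(EnvSettings { model, base_url, api_key })
}

/// Read the same three variables from the real process environment.
fn resolve_from_process_env() -> Result<EnvSettings, String> {
    resolve(|name| env::var(name).ok())
}
```

Passing the lookup as a closure is a design choice for testability; a direct `env::var` call per variable would behave the same at runtime.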
Examples
Minimal chat agent
use ;
use ;
#
# async
Authenticated vLLM server
use ;
#
Remote server with environment-based configuration
use ;
#