llmposter
A Rust crate + CLI for mocking LLM API endpoints. Fixture-driven, deterministic responses for testing.
Speaks 4 LLM API formats — OpenAI Chat Completions, Anthropic Messages, Gemini generateContent, and OpenAI Responses API — with SSE streaming and failure simulation.
Inspired by llmock. Built in Rust with zero runtime dependencies for users.
Quick Start (Library)
[]
= "0.4"
= { = "1", = ["macros", "rt-multi-thread"] }
= "0.13"
= "1"
use ;
async
Quick Start (CLI)
# Install via Homebrew
# Or install via Cargo
# Create fixtures
# Run server
# Point your app at http://127.0.0.1:8080
Supported Providers
| Route | Provider |
|---|---|
| POST /v1/chat/completions | OpenAI Chat Completions |
| POST /v1/messages | Anthropic Messages |
| POST /v1/responses | OpenAI Responses API |
| POST /v1beta/models/{model}:generateContent | Gemini |
| POST /v1beta/models/{model}:streamGenerateContent | Gemini (streaming) |
| GET /code/200 (any 100–599) | HTTP status echo (mini-httpbin) |
All providers support streaming and non-streaming. For OpenAI, Anthropic, and Responses API, just swap the base URL — the paths are identical to the real APIs. Gemini uses separate endpoints for streaming (streamGenerateContent) and non-streaming (generateContent).
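As an illustration of the routing described above, here is a minimal client-side sketch that picks the Gemini path for streaming or non-streaming calls and joins it to the mock's base URL. The helper names `gemini_path` and `endpoint_url` and the model name `gemini-pro` are assumptions made for this example; they are not part of llmposter's API.

```rust
/// Hypothetical helper: builds the Gemini request path for a given model.
/// Streaming and non-streaming calls use different endpoint names.
fn gemini_path(model: &str, stream: bool) -> String {
    let action = if stream {
        "streamGenerateContent"
    } else {
        "generateContent"
    };
    format!("/v1beta/models/{model}:{action}")
}

/// Hypothetical helper: joins a base URL and a path with exactly one slash,
/// so swapping the real API's base URL for the mock's is a one-line change.
fn endpoint_url(base_url: &str, path: &str) -> String {
    format!(
        "{}/{}",
        base_url.trim_end_matches('/'),
        path.trim_start_matches('/')
    )
}
```

For the other three formats the path stays fixed, so `endpoint_url("http://127.0.0.1:8080", "/v1/messages")` is all a client would need to target the mock instead of the real API.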
Authentication
Bearer token enforcement on LLM endpoints — off by default, fully backward compatible.
let server = new
.with_bearer_token // valid forever
.with_bearer_token_uses // expires after 1 use
.fixture
.build.await.unwrap;
// Requests must include: Authorization: Bearer test-token-123
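To make the difference between a token that is valid forever and one that expires after a fixed number of uses concrete, the following is a small standalone sketch of that bookkeeping. It is not llmposter's implementation: the `TokenTable` type, its method names, and the `one-shot` token are hypothetical and exist only for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical token table (not llmposter's internals).
/// `None` means unlimited uses; `Some(n)` means `n` uses remain.
struct TokenTable {
    tokens: HashMap<String, Option<u32>>,
}

impl TokenTable {
    fn new() -> Self {
        Self { tokens: HashMap::new() }
    }

    /// Registers a token that never expires.
    fn allow_forever(&mut self, token: &str) {
        self.tokens.insert(token.to_string(), None);
    }

    /// Registers a token that is accepted at most `uses` times.
    fn allow_uses(&mut self, token: &str, uses: u32) {
        self.tokens.insert(token.to_string(), Some(uses));
    }

    /// Checks an `Authorization` header value of the form `Bearer <token>`
    /// and consumes one use when the token is use-limited.
    fn authorize(&mut self, header: &str) -> bool {
        let Some(token) = header.strip_prefix("Bearer ") else {
            return false; // missing or malformed scheme
        };
        match self.tokens.get_mut(token) {
            Some(None) => true,     // unlimited token
            Some(Some(0)) => false, // use-limited token already exhausted
            Some(Some(remaining)) => {
                *remaining -= 1;
                true
            }
            None => false,          // unknown token
        }
    }
}
```

With `test-token-123` registered through `allow_forever` and `one-shot` registered through `allow_uses("one-shot", 1)`, the first token is accepted on every request, while the second is accepted once and rejected afterwards.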
OAuth 2.0 Mock Server
Full OAuth server via oauth-mock integration — PKCE, device code, token refresh, revocation.
let server = new
.with_oauth_defaults // spawns OAuth server on separate port
.fixture
.build.await.unwrap;
let oauth_url = server.oauth_url.unwrap; // e.g. http://127.0.0.1:12345
// Point your client's token_url at oauth_url
// Tokens issued by the OAuth server are automatically valid on LLM endpoints
Documentation
- Getting Started — Installation, first fixture, first test
- Fixtures — YAML format, matching rules, tool calls
- Failure Simulation — Error codes, latency, truncation, disconnect
- CLI Reference — Flags, validate mode, verbose logging
- Library API — Rust ServerBuilder, programmatic fixtures
- Spec Deviations — Known gaps from real APIs
Provider Guides
- OpenAI Chat Completions — Fields, streaming, error shapes
- Anthropic Messages — Fields, streaming, error shapes
- Gemini generateContent — Fields, streaming, camelCase
- OpenAI Responses API — Fields, streaming events, envelopes
License
AGPL-3.0