Mock LLM endpoint (OpenAI/Anthropic-compatible) for agent testing (#912, #915).
Serves OpenAI-compatible (POST /v1/chat/completions, GET /v1/models)
and Anthropic-compatible (POST /v1/messages) endpoints so an agent
(Cursor, Claude Code, ChatGPT clients, custom agents) can point its base
URL at MockForge and receive correctly-shaped completions with realistic
envelopes (ids, usage token counts, finish_reason / stop_reason) and
SSE streaming when the caller sets stream: true.
Four modes ([LlmMockMode]), all opt-in via --llm-mock-mode; the default
stays a pure offline mock:
- mock (default): canned/templated text, never calls out. Deterministic.
- proxy: forward every request to a configured OpenAI/Anthropic-compatible upstream and return the real response (a man-in-the-middle for agent<->LLM traffic; combine with --latency/--failures for chaos on real traffic).
- record: on a cassette miss, forward to upstream and save the response; on a hit, replay from the cassette. Real content, deterministic after warm-up.
- replay: serve only from the cassette (fully offline); a miss falls back to the canned reply.
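The mode semantics above can be sketched as a single dispatch function. This is a minimal illustration, not the crate's implementation: the `Mode` and `Source` enums, `CANNED_REPLY`, `resolve_reply`, the string-keyed `HashMap` cassette, and the closure standing in for the upstream call are all hypothetical names chosen for the example, and the real `LlmMockMode` and cassette types may look different.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the four modes; the real `LlmMockMode` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Mock,
    Proxy,
    Record,
    Replay,
}

/// Where the reply text came from (illustrative only).
#[derive(Debug, PartialEq, Eq)]
pub enum Source {
    Canned,
    Upstream,
    Cassette,
}

/// Assumed canned text; the real default reply is not specified here.
pub const CANNED_REPLY: &str = "This is a mock response.";

/// Resolve the reply text for one request.
/// `key` identifies the request in the cassette; `upstream` performs the
/// (non-streaming) upstream call and is only invoked when the mode needs it.
pub fn resolve_reply<F>(
    mode: Mode,
    cassette: &mut HashMap<String, String>,
    key: &str,
    upstream: F,
) -> (String, Source)
where
    F: FnOnce() -> String,
{
    match mode {
        // Never calls out: always the canned reply.
        Mode::Mock => (CANNED_REPLY.to_string(), Source::Canned),
        // Always forwards; nothing is stored.
        Mode::Proxy => (upstream(), Source::Upstream),
        // Hit: replay. Miss: forward, then save for next time.
        Mode::Record => {
            if let Some(hit) = cassette.get(key) {
                return (hit.clone(), Source::Cassette);
            }
            let text = upstream();
            cassette.insert(key.to_string(), text.clone());
            (text, Source::Upstream)
        }
        // Fully offline: a miss falls back to the canned reply.
        Mode::Replay => match cassette.get(key) {
            Some(hit) => (hit.clone(), Source::Cassette),
            None => (CANNED_REPLY.to_string(), Source::Canned),
        },
    }
}
```

Passing the upstream call as a closure keeps the sketch free of network code and makes it obvious which modes can reach the upstream at all: only proxy, and record on a miss.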
Any request the caller sends is already in the upstream’s wire shape, so upstream calls forward the model + messages verbatim (always non-streaming); streaming clients get the resolved text re-chunked locally.
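To illustrate local re-chunking, here is a hedged sketch that turns already-resolved text into OpenAI-style SSE frames (`chat.completion.chunk` objects followed by a `data: [DONE]` terminator). The function names (`chunk_words`, `sse_frames`), the word-sized chunking strategy, and the hand-rolled JSON escaping are assumptions made to keep the example dependency-free; the actual endpoint may chunk differently and most likely builds its JSON with a serializer. Anthropic's streaming format uses named events and is not covered here.

```rust
/// Escape a string for use inside a JSON string literal (minimal subset).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Split text into word-sized pieces, keeping each trailing space so that
/// concatenating the pieces reproduces the input exactly.
pub fn chunk_words(text: &str) -> Vec<&str> {
    text.split_inclusive(' ').collect()
}

/// Format one SSE frame. `delta` and `finish` must already be valid JSON.
fn frame(id: &str, model: &str, delta: &str, finish: &str) -> String {
    format!(
        "data: {{\"id\":\"{}\",\"object\":\"chat.completion.chunk\",\"model\":\"{}\",\"choices\":[{{\"index\":0,\"delta\":{},\"finish_reason\":{}}}]}}\n\n",
        json_escape(id),
        json_escape(model),
        delta,
        finish
    )
}

/// Re-chunk resolved text into SSE frames: one content frame per piece,
/// a final frame carrying finish_reason "stop", then the [DONE] sentinel.
pub fn sse_frames(id: &str, model: &str, text: &str) -> Vec<String> {
    let mut frames: Vec<String> = chunk_words(text)
        .into_iter()
        .map(|piece| {
            let delta = format!("{{\"content\":\"{}\"}}", json_escape(piece));
            frame(id, model, &delta, "null")
        })
        .collect();
    frames.push(frame(id, model, "{}", "\"stop\""));
    frames.push("data: [DONE]\n\n".to_string());
    frames
}
```

Because the upstream call is always non-streaming, the full text is available before the first frame is sent, so chunking is a pure function of the text and needs no buffering or back-pressure logic.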
Mounted by mockforge serve --llm-mock.
Structs
- LlmMockConfig: Runtime configuration for the mock LLM endpoint.
- LlmMockState: Router state: config plus the shared cassette and an HTTP client for upstream calls. Cheap to clone (Arc + reqwest::Client are handle types).
Enums
- LlmMockMode: How the mock LLM endpoint sources its reply text.
Functions
- router: Build the axum router exposing the mock LLM endpoints.