Module llm_mock

Expand description

Mock LLM endpoint (OpenAI/Anthropic-compatible) for agent testing (#912). Mock LLM endpoint (#912, #915).

Serves OpenAI-compatible (POST /v1/chat/completions, GET /v1/models) and Anthropic-compatible (POST /v1/messages) endpoints so an agent (Cursor, Claude Code, ChatGPT clients, custom agents) can point its base URL at MockForge and receive correctly-shaped completions with realistic envelopes (ids, usage token counts, finish_reason / stop_reason) and SSE streaming when the caller sets stream: true.

Four modes ([LlmMockMode]), all opt-in via --llm-mock-mode; the default stays a pure offline mock:

mock (default): canned/templated text, never calls out. Deterministic.
proxy: forward every request to a configured OpenAI/Anthropic-compatible upstream and return the real response (a man-in-the-middle for agent<->LLM traffic; combine with --latency/--failures for chaos on real traffic).
record: on a cassette miss, forward to upstream and save the response; on a hit, replay from the cassette. Real content, deterministic after warm-up.
replay: serve only from the cassette (fully offline); a miss falls back to the canned reply.

Any request the caller sends is already in the upstream’s wire shape, so upstream calls forward the model + messages verbatim (always non-streaming); streaming clients get the resolved text re-chunked locally.

Mounted by mockforge serve --llm-mock.

Structs§

LlmMockConfig: Runtime configuration for the mock LLM endpoint.
LlmMockState: Router state: config plus the shared cassette and an HTTP client for upstream calls. Cheap to clone (Arc + reqwest::Client are handle types).

Enums§

LlmMockMode: How the mock LLM endpoint sources its reply text.

Functions§

router: Build the axum router exposing the mock LLM endpoints.

Module llm_mock

Module llm_mock Copy item path

Structs§

Enums§

Functions§

Module llm_mock