
Module llm_command


Multi-provider LLM client for orcs.llm().

§Architecture

Two provider families, one unified HTTP transport (reqwest):

Lua: orcs.llm(prompt, opts)
  → Capability::LLM gate (ctx_fns / child)
  → llm_request_impl (Rust/reqwest)
      ├── Ollama:    POST {base_url}/v1/chat/completions  ─┐
      ├── OpenAI:    POST {base_url}/v1/chat/completions  ─┤ WireFormat::OpenAI
      │   (llama.cpp, vLLM, LM Studio also use this)      ─┘
      └── Anthropic: POST {base_url}/v1/messages           ── WireFormat::Anthropic

§Design Decisions

  • Provider vs WireFormat: Provider identifies the server (Ollama, OpenAI, Anthropic) with its own defaults. WireFormat (OpenAI or Anthropic) controls request/response serialization. Multiple providers can share a wire format.
  • reqwest (async): HTTP client bridged into sync Lua context via tokio::task::block_in_place(|| handle.block_on(...)). Enables future streaming support and aligns with async-openai’s internal transport.
  • Anthropic kept separate: distinct wire format (content[] blocks, x-api-key header, anthropic-version header).
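The Provider/WireFormat split can be sketched as two small enums. This is an illustrative sketch only: variant names and endpoint paths are taken from the diagram above, and the module's actual types likely carry extra data (base URL, default model, headers).

```rust
// Illustrative sketch of the Provider vs WireFormat split; the module's
// real types may differ in naming and carry additional fields.

#[derive(Debug, PartialEq)]
enum WireFormat {
    OpenAi,
    Anthropic,
}

enum Provider {
    Ollama,
    OpenAi,
    Anthropic,
}

impl Provider {
    /// Several providers share one wire format (e.g. Ollama speaks the
    /// OpenAI-compatible protocol).
    fn wire_format(&self) -> WireFormat {
        match self {
            Provider::Ollama | Provider::OpenAi => WireFormat::OpenAi,
            Provider::Anthropic => WireFormat::Anthropic,
        }
    }
}

impl WireFormat {
    /// Request path appended to the provider's base_url, per the diagram.
    fn endpoint_path(&self) -> &'static str {
        match self {
            WireFormat::OpenAi => "/v1/chat/completions",
            WireFormat::Anthropic => "/v1/messages",
        }
    }
}
```

The payoff of the split is that adding another OpenAI-compatible server (llama.cpp, vLLM, LM Studio) only touches the `Provider` side; serialization code keyed on `WireFormat` stays untouched.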

§Session Management

Conversation history is stored in-memory per Lua VM via SessionStore (Lua app_data).

  • session_id = nil → create new session (UUID v4), return session_id in response
  • session_id = "existing-id" → append to existing history and continue
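The session lookup rule above can be sketched as follows. This is a hypothetical shape, not the real `SessionStore` (which lives in Lua app_data); the counter stands in for UUID v4 generation to keep the sketch dependency-free.

```rust
use std::collections::HashMap;

/// Hypothetical per-VM store; messages are (role, content) pairs.
#[derive(Default)]
struct SessionStore {
    sessions: HashMap<String, Vec<(String, String)>>,
    counter: u64, // stand-in for UUID v4 generation
}

impl SessionStore {
    /// session_id = None    -> create a new session, return its id.
    /// session_id = Some(_) -> reuse the existing history.
    fn resolve(&mut self, session_id: Option<&str>) -> String {
        match session_id {
            Some(id) => id.to_string(),
            None => {
                self.counter += 1;
                let id = format!("session-{}", self.counter); // real code: a UUID v4
                self.sessions.insert(id.clone(), Vec::new());
                id
            }
        }
    }

    /// Append one turn to a session's history.
    fn append(&mut self, id: &str, role: &str, content: &str) {
        self.sessions
            .entry(id.to_string())
            .or_default()
            .push((role.to_string(), content.to_string()));
    }
}
```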

§Rate Limiting & Retry

Automatic retry with exponential backoff for transient errors:

  • HTTP 429: respects Retry-After header, falls back to exponential backoff
  • HTTP 5xx: exponential backoff (1s, 2s, 4s, capped at 30s)
  • Transport errors (timeout, connection reset): exponential backoff
  • Default: 2 retries (3 total attempts), configurable via opts.max_retries
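The backoff schedule above (1s, 2s, 4s, capped at 30s) is a straightforward shift-and-cap; a minimal sketch, with an illustrative helper name:

```rust
use std::time::Duration;

/// Exponential backoff matching the schedule above: 1s, 2s, 4s, ...
/// capped at 30s. `attempt` is zero-based. For HTTP 429, the real code
/// would prefer a parsed Retry-After value over this schedule.
fn backoff_delay(attempt: u32) -> Duration {
    let secs = 1u64.checked_shl(attempt).unwrap_or(u64::MAX).min(30);
    Duration::from_secs(secs)
}
```

`checked_shl` avoids overflow for large attempt counts; the cap keeps a long retry run from sleeping unboundedly.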

§Session Persistence

  • orcs.llm_dump_sessions() → JSON string of all session histories
  • orcs.llm_load_sessions(json) → restore sessions from JSON

§Technical Debt

  • Streaming not supported (stream: false is hard-coded)
  • Multi-turn tool loops not supported (Phase 6: resolve flow)

Functions§

llm_ping_impl
Executes a lightweight connectivity check against the LLM provider.
llm_request_impl
Executes an LLM chat request. Called from capability-gated context.
register_llm_deny_stub
Registers orcs.llm as a deny-by-default stub.