
Module llm_command


Multi-provider LLM client for orcs.llm().

§Architecture

Two provider families, one unified HTTP transport (reqwest):

Lua: orcs.llm(prompt, opts)
  → Capability::LLM gate (ctx_fns / child)
  → llm_request_impl (Rust/reqwest)
      ├── Ollama:    POST {base_url}/v1/chat/completions  ─┐
      ├── OpenAI:    POST {base_url}/v1/chat/completions  ─┤ WireFormat::OpenAI
      │   (llama.cpp, vLLM, LM Studio also use this)      ─┘
      └── Anthropic: POST {base_url}/v1/messages           ── WireFormat::Anthropic

§Design Decisions

  • Provider vs WireFormat: Provider identifies the server (Ollama, OpenAI, Anthropic) with its own defaults. WireFormat (OpenAI or Anthropic) controls request/response serialization. Multiple providers can share a wire format.
  • reqwest (async): HTTP client bridged into sync Lua context via tokio::task::block_in_place(|| handle.block_on(...)). Enables future streaming support and aligns with async-openai’s internal transport.
  • Anthropic kept separate: distinct wire format (content[] blocks, x-api-key header, anthropic-version header).
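The Provider/WireFormat split can be sketched as two small enums. This is an illustrative sketch only: variant names and endpoint paths are taken from the diagram above, and the module's actual types likely carry extra data (base URL, default model, headers).

```rust
// Illustrative sketch of the Provider vs WireFormat split; the module's
// real types may differ in naming and carry additional fields.

#[derive(Debug, PartialEq)]
enum WireFormat {
    OpenAi,
    Anthropic,
}

enum Provider {
    Ollama,
    OpenAi,
    Anthropic,
}

impl Provider {
    /// Several providers share one wire format (e.g. Ollama speaks the
    /// OpenAI-compatible protocol).
    fn wire_format(&self) -> WireFormat {
        match self {
            Provider::Ollama | Provider::OpenAi => WireFormat::OpenAi,
            Provider::Anthropic => WireFormat::Anthropic,
        }
    }
}

impl WireFormat {
    /// Request path appended to the provider's base_url, per the diagram.
    fn endpoint_path(&self) -> &'static str {
        match self {
            WireFormat::OpenAi => "/v1/chat/completions",
            WireFormat::Anthropic => "/v1/messages",
        }
    }
}
```

The payoff of the split is that adding another OpenAI-compatible server (llama.cpp, vLLM, LM Studio) only touches the `Provider` side; serialization code keyed on `WireFormat` stays untouched.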

§Session Management

Conversation history is stored in-memory per Lua VM via SessionStore (Lua app_data).

  • session_id = nil → create new session (UUID v4), return session_id in response
  • session_id = "existing-id" → append to existing history and continue
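The session lookup rule above can be sketched as follows. This is a hypothetical shape, not the real `SessionStore` (which lives in Lua app_data); the counter stands in for UUID v4 generation to keep the sketch dependency-free.

```rust
use std::collections::HashMap;

/// Hypothetical per-VM store; messages are (role, content) pairs.
#[derive(Default)]
struct SessionStore {
    sessions: HashMap<String, Vec<(String, String)>>,
    counter: u64, // stand-in for UUID v4 generation
}

impl SessionStore {
    /// session_id = None    -> create a new session, return its id.
    /// session_id = Some(_) -> reuse the existing history.
    fn resolve(&mut self, session_id: Option<&str>) -> String {
        match session_id {
            Some(id) => id.to_string(),
            None => {
                self.counter += 1;
                let id = format!("session-{}", self.counter); // real code: a UUID v4
                self.sessions.insert(id.clone(), Vec::new());
                id
            }
        }
    }

    /// Append one turn to a session's history.
    fn append(&mut self, id: &str, role: &str, content: &str) {
        self.sessions
            .entry(id.to_string())
            .or_default()
            .push((role.to_string(), content.to_string()));
    }
}
```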

§Rate Limiting & Retry

Automatic retry with exponential backoff for transient errors:

  • HTTP 429: respects Retry-After header, falls back to exponential backoff
  • HTTP 5xx: exponential backoff (1s, 2s, 4s, capped at 30s)
  • Transport errors (timeout, connection reset): exponential backoff
  • Default: 2 retries (3 total attempts), configurable via opts.max_retries
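The backoff schedule above (1s, 2s, 4s, capped at 30s) is a straightforward shift-and-cap; a minimal sketch, with an illustrative helper name:

```rust
use std::time::Duration;

/// Exponential backoff matching the schedule above: 1s, 2s, 4s, ...
/// capped at 30s. `attempt` is zero-based. For HTTP 429, the real code
/// would prefer a parsed Retry-After value over this schedule.
fn backoff_delay(attempt: u32) -> Duration {
    let secs = 1u64.checked_shl(attempt).unwrap_or(u64::MAX).min(30);
    Duration::from_secs(secs)
}
```

`checked_shl` avoids overflow for large attempt counts; the cap keeps a long retry run from sleeping unboundedly.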

§Session Persistence

  • orcs.llm_dump_sessions() → JSON string of all session histories
  • orcs.llm_load_sessions(json) → restore sessions from JSON

§Technical Debt

  • Streaming not supported (stream: false is hard-coded)
  • Multi-turn tool loops not supported (Phase 6: resolve flow)

Functions§

llm_ping_impl
Executes a lightweight connectivity check against the LLM provider.
llm_request_impl
Executes an LLM chat request. Called from capability-gated context.
register_llm_deny_stub
Registers orcs.llm as a deny-by-default stub.