Multi-provider LLM client for orcs.llm().
§Architecture
Two provider families, one unified HTTP transport (reqwest):
Lua: orcs.llm(prompt, opts)
→ Capability::LLM gate (ctx_fns / child)
→ llm_request_impl (Rust/reqwest)
├── Ollama: POST {base_url}/v1/chat/completions ─┐
├── OpenAI: POST {base_url}/v1/chat/completions ─┤ WireFormat::OpenAI
│ (llama.cpp, vLLM, LM Studio also use this) ─┘
└── Anthropic: POST {base_url}/v1/messages ── WireFormat::Anthropic
§Design Decisions
- Provider vs WireFormat: Provider identifies the server (Ollama, OpenAI, Anthropic) with its own defaults. WireFormat (OpenAI or Anthropic) controls request/response serialization. Multiple providers can share a wire format.
- reqwest (async): HTTP client bridged into sync Lua context via tokio::task::block_in_place(|| handle.block_on(...)). Enables future streaming support and aligns with async-openai's internal transport.
- Anthropic kept separate: distinct wire format (content[] blocks, x-api-key header, anthropic-version header).
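The Provider/WireFormat split above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual definitions: the enum names come from the text, but the methods `wire_format`, `endpoint_path`, and the helper `endpoint_url` are hypothetical names introduced here, and the real types may carry more variants and defaults.

```rust
/// Identifies the server being called (names taken from the text above).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Provider {
    Ollama,
    OpenAI,
    Anthropic,
}

/// Controls how requests and responses are serialized.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WireFormat {
    OpenAI,
    Anthropic,
}

impl Provider {
    /// Hypothetical helper: several providers map onto one wire format.
    pub fn wire_format(self) -> WireFormat {
        match self {
            Provider::Ollama | Provider::OpenAI => WireFormat::OpenAI,
            Provider::Anthropic => WireFormat::Anthropic,
        }
    }
}

impl WireFormat {
    /// Hypothetical helper: request path per wire format, as in the diagram.
    pub fn endpoint_path(self) -> &'static str {
        match self {
            WireFormat::OpenAI => "/v1/chat/completions",
            WireFormat::Anthropic => "/v1/messages",
        }
    }
}

/// Hypothetical helper: joins a base URL and the path for a provider,
/// tolerating a trailing slash on the base URL.
pub fn endpoint_url(base_url: &str, provider: Provider) -> String {
    format!(
        "{}{}",
        base_url.trim_end_matches('/'),
        provider.wire_format().endpoint_path()
    )
}
```

With this shape, supporting another OpenAI-compatible server (llama.cpp, vLLM, LM Studio) would only need a new Provider variant or a different base URL, while the serialization code stays keyed on WireFormat.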
§Session Management
Conversation history is stored in-memory per Lua VM via SessionStore (Lua app_data).
- session_id = nil → create new session (UUID v4), return session_id in response
- session_id = "existing-id" → append to existing history and continue
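The two session cases above can be sketched as a small in-memory store. This is an illustration under stated assumptions, not the real SessionStore: the `Message` struct, the `append` and `history` methods, and the counter-based ids are hypothetical (the actual store uses UUID v4 ids and lives in Lua app_data). Rejecting an unknown id is also an assumption, since the text does not say how that case is handled.

```rust
use std::collections::HashMap;

/// Hypothetical message shape for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: String,
}

/// Minimal in-memory store: session id -> ordered conversation history.
#[derive(Default)]
pub struct SessionStore {
    sessions: HashMap<String, Vec<Message>>,
    next_id: u64, // stand-in for UUID v4 generation, to stay std-only
}

impl SessionStore {
    /// `None` creates a new session; `Some(id)` appends to an existing one.
    /// Returns the session id so the caller can continue the conversation.
    pub fn append(&mut self, session_id: Option<&str>, msg: Message) -> Result<String, String> {
        let id = match session_id {
            Some(id) => {
                // Assumption: an unknown id is an error rather than a new session.
                if !self.sessions.contains_key(id) {
                    return Err(format!("unknown session: {id}"));
                }
                id.to_string()
            }
            None => {
                self.next_id += 1;
                let id = format!("session-{}", self.next_id);
                self.sessions.insert(id.clone(), Vec::new());
                id
            }
        };
        if let Some(history) = self.sessions.get_mut(&id) {
            history.push(msg);
        }
        Ok(id)
    }

    /// Returns the history for a session, if it exists.
    pub fn history(&self, id: &str) -> Option<&[Message]> {
        self.sessions.get(id).map(|v| v.as_slice())
    }
}
```

A caller would pass `None` on the first turn, keep the returned id, and pass it back as `Some(&id)` on later turns so each request is sent with the accumulated history.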
§Rate Limiting & Retry
Automatic retry with exponential backoff for transient errors:
- HTTP 429: respects Retry-After header, falls back to exponential backoff
- HTTP 5xx: exponential backoff (1s, 2s, 4s, capped at 30s)
- Transport errors (timeout, connection reset): exponential backoff
- Default: 2 retries (3 total attempts), configurable via opts.max_retries
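The retry rules above could be expressed as a pure delay calculation along these lines. This is a sketch, not the crate's implementation: `backoff_delay` and `next_delay` are hypothetical names, `status == None` is used here to stand for a transport error, and leaving a server-supplied Retry-After value uncapped is an assumption the text does not confirm.

```rust
use std::time::Duration;

const BASE_DELAY: Duration = Duration::from_secs(1);
const MAX_DELAY: Duration = Duration::from_secs(30);

/// Exponential backoff: attempt 0 -> 1s, 1 -> 2s, 2 -> 4s, capped at 30s.
pub fn backoff_delay(attempt: u32) -> Duration {
    // checked_shl returns None for shifts >= 32; treat that as "very large".
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    BASE_DELAY.saturating_mul(factor).min(MAX_DELAY)
}

/// Decides whether to retry and how long to wait first.
///
/// * `status`: HTTP status, or `None` for a transport error (assumption).
/// * `retry_after`: parsed Retry-After header, if the server sent one.
/// * `attempt`: number of retries already performed (0 for the first failure).
/// * `max_retries`: retry budget, 2 by default per the text.
pub fn next_delay(
    status: Option<u16>,
    retry_after: Option<Duration>,
    attempt: u32,
    max_retries: u32,
) -> Option<Duration> {
    if attempt >= max_retries {
        return None; // budget exhausted: surface the error
    }
    match status {
        // 429: prefer the server's hint, else fall back to backoff.
        Some(429) => Some(retry_after.unwrap_or_else(|| backoff_delay(attempt))),
        // 5xx: plain exponential backoff.
        Some(s) if (500..=599).contains(&s) => Some(backoff_delay(attempt)),
        // Other statuses (e.g. 400, 401, 404) are not transient.
        Some(_) => None,
        // Transport error such as a timeout or connection reset.
        None => Some(backoff_delay(attempt)),
    }
}
```

Keeping the decision as a pure function makes the policy easy to unit-test separately from the HTTP call and the sleep that would surround it.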
§Session Persistence
- orcs.llm_dump_sessions() → JSON string of all session histories
- orcs.llm_load_sessions(json) → restore sessions from JSON
§Technical Debt
- Streaming not supported (stream: false fixed)
- Multi-turn tool loops not supported (Phase 6: resolve flow)
§Functions
- llm_ping_impl: Executes a lightweight connectivity check against the LLM provider.
- llm_request_impl: Executes an LLM chat request. Called from capability-gated context.
- register_llm_deny_stub: Registers orcs.llm as a deny-by-default stub.