Crate anyllm_proxy

Source

Modules§

admin: Admin server: localhost-only config management, request logging, WebSocket live updates.
backend: Backend HTTP clients for OpenAI, Vertex, Gemini, and Anthropic passthrough.
batch: Async batch job submission and management (US3). Batch API HTTP handlers. Types and logic live in anyllm_batch_engine.
cache: Response caching with in-memory (moka) and optional Redis tier (US1). Response caching for non-streaming requests.
callbacks: Webhook callback support for request completion notifications.
config: Environment-based configuration, TLS client cert setup, URL validation.
cost: Per-request cost tracking and model pricing (US4).
env_parser: Pure env-file parser (no I/O, no set_var). Used by startup bootstrap and admin import endpoint. Pure .anyllm.env-format parser. No I/O, no set_var — safe from any context including tests.
fallback: Backend fallback chains for transparent failover (US2).
integrations: Named integration registry (Langfuse, etc.).
metrics: Request count, success/error tracking, exposed via GET /metrics.
openai_tool_policy: Provider/model-specific OpenAI-compatible tool request and response normalization.
ratelimit: Distributed rate limiting via Redis sorted sets (requires redis feature). Distributed rate limiting via Redis sorted sets.
runtime: In-process chat completion runtime without HTTP route ownership. In-process Chat Completions runtime.
server: Axum HTTP server: routes, middleware (auth, request ID, size/concurrency limits), SSE streaming.
tools: Optional built-in server-side tools and registry.

Crate anyllm_proxy

Crate anyllm_proxy Copy item path

Modules§

Crate anyllm_proxy