Skip to main content Crate anyllm_proxy Copy item path Source admin Admin server: localhost-only config management, request logging, WebSocket live updates. backend Backend HTTP clients for OpenAI, Vertex, Gemini, and Anthropic passthrough. batch Async batch job submission and management (US3).
Batch API HTTP handlers. Types and logic live in anyllm_batch_engine. cache Response caching with in-memory (moka) and optional Redis tier (US1).
Response caching for non-streaming requests. callbacks Webhook callback support for request completion notifications. config Environment-based configuration, TLS client cert setup, URL validation. cost Per-request cost tracking and model pricing (US4). env_parser Pure env-file parser (no I/O, no set_var). Used by startup bootstrap and admin import endpoint.
Pure .anyllm.env-format parser. No I/O, no set_var — safe from any context including tests. fallback Backend fallback chains for transparent failover (US2). integrations Named integration registry (Langfuse, etc.). metrics Request count, success/error tracking, exposed via GET /metrics. openai_tool_policy Provider/model-specific OpenAI-compatible tool request and response normalization. ratelimit Distributed rate limiting via Redis sorted sets (requires redis feature).
Distributed rate limiting via Redis sorted sets. runtime In-process chat completion runtime without HTTP route ownership.
In-process Chat Completions runtime. server Axum HTTP server: routes, middleware (auth, request ID, size/concurrency limits), SSE streaming. tools Optional built-in server-side tools and registry.