Skip to main content

Crate anyllm_proxy

Crate anyllm_proxy 

Source

Modules§

admin
Admin server: localhost-only config management, request logging, WebSocket live updates.
backend
Backend HTTP clients for OpenAI, Vertex, Gemini, and Anthropic passthrough.
batch
Async batch job submission and management (US3). Batch API HTTP handlers. Types and logic live in anyllm_batch_engine.
cache
Response caching with in-memory (moka) and optional Redis tier (US1). Response caching for non-streaming requests.
callbacks
Webhook callback support for request completion notifications.
config
Environment-based configuration, TLS client cert setup, URL validation.
cost
Per-request cost tracking and model pricing (US4).
env_parser
Pure env-file parser (no I/O, no set_var). Used by startup bootstrap and admin import endpoint. Pure .anyllm.env-format parser. No I/O, no set_var — safe from any context including tests.
fallback
Backend fallback chains for transparent failover (US2).
integrations
Named integration registry (Langfuse, etc.).
metrics
Request count, success/error tracking, exposed via GET /metrics.
openai_tool_policy
Provider/model-specific OpenAI-compatible tool request and response normalization.
ratelimit
Distributed rate limiting via Redis sorted sets (requires redis feature). Distributed rate limiting via Redis sorted sets.
runtime
In-process chat completion runtime without HTTP route ownership. In-process Chat Completions runtime.
server
Axum HTTP server: routes, middleware (auth, request ID, size/concurrency limits), SSE streaming.
tools
Optional built-in server-side tools and registry.