# EdgeQuake LLM
edgequake-llm is a Rust AI runtime with a single abstraction over cloud APIs,
local gateways, enterprise deployments, and testing backends. It ships
first-class support for chat, streaming, tool calling, embeddings, image
generation, caching, retries, rate limiting, cost tracking, and release-grade
CI/CD.
Python users should use edgequake-litellm, the LiteLLM-compatible package backed by this crate.
## What It Covers
- One trait-based surface for LLMs, embeddings, and image generation.
- Production backends: OpenAI, Azure OpenAI, Anthropic, Gemini, Vertex AI, xAI, OpenRouter, Mistral, AWS Bedrock.
- Local and gateway backends: Ollama, LM Studio, VSCode Copilot proxy, generic OpenAI-compatible APIs.
- Additional embedding backend: Jina.
- Image generation backends in the Rust crate: Gemini image generation, Vertex Imagen, FAL, mock image generation.
- Operational layers: caching, retry, rate limiting, cost tracking, tracing, reranking, mock providers.
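The operational layers compose as wrappers that share the provider trait. A self-contained sketch of that pattern follows; the trait and type names here are illustrative, not the crate's actual API:

```rust
// Illustrative provider trait: each operational layer implements it
// and delegates to an inner provider.
trait ChatProvider {
    fn chat(&self, prompt: &str) -> Result<String, String>;
}

// A mock backend that fails once, then succeeds.
struct FlakyMock {
    fail_first: std::cell::Cell<bool>,
}

impl ChatProvider for FlakyMock {
    fn chat(&self, prompt: &str) -> Result<String, String> {
        if self.fail_first.replace(false) {
            Err("transient error".to_string())
        } else {
            Ok(format!("echo: {prompt}"))
        }
    }
}

// Retry layer: same trait, wraps any inner provider.
struct Retry<P: ChatProvider> {
    inner: P,
    attempts: u32,
}

impl<P: ChatProvider> ChatProvider for Retry<P> {
    fn chat(&self, prompt: &str) -> Result<String, String> {
        let mut last = Err("no attempts".to_string());
        for _ in 0..self.attempts {
            last = self.inner.chat(prompt);
            if last.is_ok() {
                return last;
            }
        }
        last
    }
}

fn main() {
    let provider = Retry {
        inner: FlakyMock { fail_first: std::cell::Cell::new(true) },
        attempts: 3,
    };
    // The first call fails inside the mock; the retry layer recovers.
    println!("{}", provider.chat("hi").unwrap()); // prints "echo: hi"
}
```

Caching, rate limiting, and cost tracking stack the same way: each layer implements the trait and delegates inward.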
## Install

```toml
[dependencies]
edgequake-llm = "0.5.1"
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
```
AWS Bedrock support is feature-gated:

```toml
[dependencies]
edgequake-llm = { version = "0.5.1", features = ["bedrock"] }
```
Note: the base crate declares rust-version = 1.83.0, but AWS Bedrock dependencies currently require a newer toolchain when the bedrock feature is enabled. Use stable Rust for release builds.
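For example, to build and test with the feature enabled:

```shell
# Enable the feature-gated Bedrock backend for a local build or test run.
cargo build --features bedrock
cargo test --features bedrock
```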
## Quick Start

A minimal sketch (the item path and `main` scaffolding are assumptions; `ProviderFactory::from_env` is the factory entry point described under Factory Usage):

```rust
use edgequake_llm::ProviderFactory;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Picks a provider based on which environment variables are set.
    let provider = ProviderFactory::from_env()?;
    // Chat, streaming, and embedding calls go through the returned provider.
    Ok(())
}
```

Environment: export the relevant provider key, for example `OPENAI_API_KEY` (see Common Setup below).
## Provider Matrix

| Provider | Prefix / Type | Chat | Stream | Tools | Embeddings | Notes |
|---|---|---|---|---|---|---|
| OpenAI | openai | Yes | Yes | Yes | Yes | GPT, o-series, vision |
| Azure OpenAI | azure | Yes | Yes | Yes | Yes | Deployment-based |
| Anthropic | anthropic | Yes | Yes | Yes | No | Claude thinking + caching |
| Gemini | gemini | Yes | Yes | Yes | Yes | Google AI Studio |
| Vertex AI | vertexai | Yes | Yes | Yes | Yes | Gemini on GCP auth |
| xAI | xai | Yes | Yes | Yes | No | Grok models |
| OpenRouter | openrouter | Yes | Yes | Yes | No | Multi-provider gateway |
| Mistral | mistral | Yes | Yes | Yes | Yes | La Plateforme |
| AWS Bedrock | bedrock | Yes | Yes | Yes | Yes | Feature-gated |
| HuggingFace | huggingface | Yes | Yes | Limited | No | Inference API |
| OpenAI Compatible | openai-compatible | Yes | Yes | Yes | Yes | Groq, Together, DeepSeek, custom |
| Ollama | ollama | Yes | Yes | Yes | Yes | Local runtime |
| LM Studio | lmstudio | Yes | Yes | Yes | Yes | Local OpenAI-compatible |
| VSCode Copilot | vscode-copilot | Yes | Yes | Yes | Yes | Requires proxy |
| Jina | embedding only | No | No | No | Yes | Dedicated embeddings |
| Mock | mock | Yes | No | Yes | Yes | Tests and offline dev |
## Image Generation Providers

Rust-only image generation support is exposed through `ImageGenProvider` and `ImageGenFactory`:
| Provider | Type | Auth / Environment | Notes |
|---|---|---|---|
| Gemini image generation | GeminiImageGenProvider | GEMINI_API_KEY or Vertex AI auth | Default model: gemini-2.5-flash-image |
| Vertex Imagen | VertexAIImageGen | GOOGLE_CLOUD_PROJECT and ADC / GOOGLE_ACCESS_TOKEN | Default model: imagen-4.0-generate-001 |
| FAL | FalImageGen | FAL_KEY | Default model: fal-ai/flux/dev |
| Mock | MockImageGenProvider | none | Tests and offline development |
## Common Setup
| Provider | Required environment |
|---|---|
| OpenAI | OPENAI_API_KEY |
| Azure OpenAI | AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, AZURE_OPENAI_DEPLOYMENT_NAME |
| Anthropic | ANTHROPIC_API_KEY |
| Gemini | GEMINI_API_KEY or GOOGLE_API_KEY |
| Vertex AI | GOOGLE_CLOUD_PROJECT and ADC / GOOGLE_ACCESS_TOKEN |
| xAI | XAI_API_KEY |
| OpenRouter | OPENROUTER_API_KEY |
| Mistral | MISTRAL_API_KEY |
| AWS Bedrock | standard AWS credential chain plus AWS_REGION |
| HuggingFace | HF_TOKEN or HUGGINGFACE_TOKEN |
| OpenAI Compatible | OPENAI_COMPATIBLE_BASE_URL, optional OPENAI_COMPATIBLE_API_KEY |
| Ollama | optional OLLAMA_HOST |
| LM Studio | optional LMSTUDIO_HOST |
| VSCode Copilot | optional VSCODE_COPILOT_PROXY_URL |
| Jina | JINA_API_KEY |
Image generation environment:
| Provider | Required environment |
|---|---|
| Gemini image generation | GEMINI_API_KEY or Vertex AI auth |
| Vertex Imagen | GOOGLE_CLOUD_PROJECT and ADC / GOOGLE_ACCESS_TOKEN |
| FAL | FAL_KEY |
## Factory Usage

`ProviderFactory` is the fastest way to wire up environment-based configuration or provider/model routing:
A sketch of the factory entry points (argument shapes and the printed details are illustrative; the function names come from the crate):

```rust
use edgequake_llm::ProviderFactory;

// From environment variables (provider chosen by which keys are set).
let provider = ProviderFactory::from_env()?;
println!("active provider: {provider:?}");

// From a "provider/model" routing string.
let routed = ProviderFactory::create_with_model("openai/gpt-4o")?;

// From an explicit provider configuration.
let custom = ProviderFactory::create_llm_provider(/* provider config */)?;
```
For generic OpenAI-compatible routing, set `OPENAI_COMPATIBLE_BASE_URL` (and, if the server requires auth, `OPENAI_COMPATIBLE_API_KEY`).
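For example, pointing the crate at Groq's OpenAI-compatible endpoint (the URL below is Groq's published base URL; any compatible server works):

```shell
export OPENAI_COMPATIBLE_BASE_URL="https://api.groq.com/openai/v1"
export OPENAI_COMPATIBLE_API_KEY="gsk_..."   # optional; placeholder value
```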
For Rust image generation, use the factory (a sketch; the constructor name is an assumption, while `ImageGenFactory` and the provider types come from the table above):

```rust
use edgequake_llm::ImageGenFactory;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Builds an image-generation provider from the environment
    // (e.g. GEMINI_API_KEY or FAL_KEY).
    let imagegen = ImageGenFactory::from_env()?;
    // Generation calls go through the ImageGenProvider trait.
    Ok(())
}
```
## Python Package

edgequake-litellm is the Python package in this repo. It is a drop-in LiteLLM replacement backed by the Rust runtime.

Install:

```shell
pip install edgequake-litellm
```
See edgequake-litellm/README.md for provider routing, migration notes, wheel coverage, and release instructions.
The Python package does not expose the Rust image-generation APIs yet.
## Development
Local validation:
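The repo's exact CI commands may differ; these are the standard cargo gates:

```shell
cargo fmt --all -- --check
cargo clippy --all-targets -- -D warnings
cargo test
# With the feature-gated Bedrock backend:
cargo test --features bedrock
```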
Python package validation:
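A typical sequence, assuming the Rust-backed wheel builds with maturin (see edgequake-litellm/README.md for the authoritative steps):

```shell
cd edgequake-litellm
pip install maturin pytest
maturin develop   # build the Rust extension into the active virtualenv
pytest
```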
## Release
Release guides:

- docs/providers.md: provider-by-provider setup
- docs/releasing.md: release checklist, tags, registry setup
- docs/release-cycle.md: end-to-end CI/CD flow
- CHANGELOG.md: release notes for the Rust crate
- edgequake-litellm/CHANGELOG.md: release notes for the Python package
Tag conventions:

- Rust crate: `vX.Y.Z`
- Python package: `py-vX.Y.Z`
Both publish workflows validate versions before publishing and can attach release artifacts to GitHub Releases.
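For example, cutting a Rust crate release (the tag must match the version the workflow validates):

```shell
git tag v0.5.1
git push origin v0.5.1
# Python package releases use the py- prefix:
git tag py-v0.5.1
git push origin py-v0.5.1
```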
## License
Apache-2.0. See LICENSE-APACHE.