LLM Link
🚀 A user-friendly LLM proxy service with built-in support for popular AI coding tools
LLM Link provides zero-configuration access to LLM providers through multiple API formats, with optimized built-in support for Codex CLI and Zed.
✨ Key Features
- 🎯 Application-Oriented: Built-in configurations for popular AI coding tools
- ⚡ Zero Configuration: One-command startup for common use cases
- 🔄 Multi-Protocol: Simultaneous OpenAI, Ollama, and Anthropic API support
- 🔀 9 LLM Providers: OpenAI, Anthropic, Zhipu, Aliyun, Volcengine, Tencent, Longcat, Moonshot, Ollama
- 📡 Dynamic Model Discovery: REST API to query all supported providers and models
- 📦 Rust Library: Use as a crate for direct access to provider and model information
- 🔥 Hot-Reload Configuration: Update API keys and switch providers without restart
- 🛠️ CLI-First: Simple command-line interface with helpful guidance
- 🔧 Smart Adaptation: Automatic client detection and optimization
- 🚀 Production Ready: Built with Rust for performance and reliability
🎯 Supported Applications
| Application | Protocol | Port | Authentication | Status |
|---|---|---|---|---|
| Codex CLI | OpenAI API | 8088 | Bearer Token | ✅ Ready |
| Zed | Ollama API | 11434 | None | ✅ Ready |
🚀 Quick Start
Installation
Option 0: Install via Homebrew (macOS)
# Run the tap command once per machine so Homebrew knows where the formula lives
📝 To undo the tap, run `brew untap lipish/llm-link`; tap again and reinstall to restore it.
Option 1: Install from crates.io (Recommended)
Option 1.5: Install via pip (macOS / Linux)
# First run downloads the matching prebuilt binary into ~/.cache/llm-link
🐍 The pip package is published as `pyllmlink`; after installation it still provides the `llm-link` CLI. Set `LLM_LINK_CACHE` to redirect the cache directory, and `LLM_LINK_DOWNLOAD_BASE` to point at a self-hosted release mirror.
Option 2: Build from source
🎯 Application Mode (Recommended)
Step 1: Set up environment variables
# Required for all applications (choose your provider)
# OR
# OR
# Required for Codex CLI (choose one method)
# OR use CLI parameter: --api-key "your-auth-token"
Step 2: Start for your application
# For Codex CLI
# For Zed
📋 Get Help and Information
# List all supported applications
# Get detailed setup guide for specific application
# Show all CLI options
🔧 Protocol Mode (Advanced)
For custom protocol combinations:
# Support multiple protocols simultaneously
🔄 Provider Override
Switch between different LLM providers without changing configuration:
# Use OpenAI GPT-4 instead of default
# Use Anthropic Claude
# Use Ollama local models
# Use Zhipu GLM models
# Use Aliyun Qwen models
Supported Providers:
- `openai` - OpenAI GPT models (default: `gpt-4`)
- `anthropic` - Anthropic Claude models (default: `claude-3-5-sonnet-20241022`)
- `zhipu` - Zhipu GLM models (default: `glm-4-flash`)
- `aliyun` - Aliyun Qwen models (default: `qwen-max`)
- `volcengine` - Volcengine Doubao models (default: `doubao-pro-32k`)
- `tencent` - Tencent Hunyuan models (default: `hunyuan-lite`)
- `longcat` - LongCat models (default: `LongCat-Flash-Chat`)
- `moonshot` - Moonshot Kimi models (default: `kimi-k2-turbo-preview`)
- `ollama` - Ollama local models (default: `llama2`)
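The per-provider defaults above can be expressed as a small lookup table. This is a sketch for illustration, not llm-link's actual configuration code:

```python
# Default model per provider, per the list above.
DEFAULT_MODELS = {
    "openai": "gpt-4",
    "anthropic": "claude-3-5-sonnet-20241022",
    "zhipu": "glm-4-flash",
    "aliyun": "qwen-max",
    "volcengine": "doubao-pro-32k",
    "tencent": "hunyuan-lite",
    "longcat": "LongCat-Flash-Chat",
    "moonshot": "kimi-k2-turbo-preview",
    "ollama": "llama2",
}

def default_model(provider: str) -> str:
    """Return the default model used when no --model flag is given."""
    try:
        return DEFAULT_MODELS[provider]
    except KeyError:
        raise ValueError(f"unsupported provider: {provider}") from None

print(default_model("zhipu"))  # -> glm-4-flash
```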
Volcengine Doubao: Logical Models vs Endpoint IDs
Volcengine Ark uses two levels of naming for Doubao models: logical model names and endpoint IDs (`ep-xxxx`):

- Names shown in the Ark console and docs, such as `doubao-seed-1.6` and `doubao-seed-code-preview-latest`, are logical model names.
- When calling through the OpenAI-compatible interface, you usually need to pass an endpoint ID created under your account (e.g. `ep-20251115213103-t9sf2`) as `model`.

llm-link handles this as follows:

- The external protocol layer (`/api/chat`, `/api/tags`, Zed, etc.) keeps using logical model names, which are easier to read and configure.
- Before a request is sent to Volcengine, the `ModelResolver` in the Normalizer layer maps the logical name to the real `ep-...` ID.
Mapping rules (highest priority first):

1. Local override file `model-overrides.yaml` (repository root; optional, local-only, already in `.gitignore`):

   ```yaml
   volcengine:
     doubao-seed-code-preview-latest: "ep-your-seedcode-endpoint-id"
     doubao-seed-1.6-thinking: "ep-your-thinking-endpoint-id"
   ```

   If this file exists and contains an entry for the provider + logical name, the configured `ep-...` is always used.

2. Volcengine default rules (when no override matches):
   - If the `model` in the request is already an `ep-...`, it is passed through unchanged.
   - Otherwise (a logical name), the default model from the configuration is used (usually set via the CLI `--model ep-...`).

3. Other providers:
   - The strategy stays simple for now: use the `model` from the request, falling back to the default model when it is empty.
- 目前保持简单策略:使用请求中的
Recommended practice:

- To use a single Doubao endpoint per process: start llm-link with that endpoint as the default model, then use the logical name (e.g. `doubao-seed-code-preview-latest`) in your client; llm-link maps it to the default `ep-...` automatically.
- To give multiple Doubao logical models different endpoints in the same process: create `model-overrides.yaml` in the repository root (copy from `examples/model-overrides.example.yaml`) and map each logical name to its `ep-...`; no code changes or committed configuration are required.
💡 Discover All Models:
# Query all supported providers and their models via API
See API Documentation for details.
⚙️ Environment Variables
Required Variables
# LLM Provider API Keys (choose based on your provider)
# For Zhipu GLM models
# For OpenAI GPT models
# For Anthropic Claude models
# For Aliyun Qwen models
# LLM Link Authentication (required for Codex CLI)
# Bearer token for API access
Optional Variables
# Ollama Configuration
# Ollama server URL
# Logging
# Log level: debug, info, warn, error
# Rust logging (for development)
Using .env File
Create a .env file in the project root:
# .env
ZHIPU_API_KEY=your-zhipu-api-key
LLM_LINK_API_KEY=your-auth-token
OPENAI_API_KEY=sk-xxx
ANTHROPIC_API_KEY=sk-ant-xxx
ALIYUN_API_KEY=your-aliyun-key
Note: The .env file is ignored by git for security. Never commit API keys to version control.
📦 As a Rust Library
Besides running as a standalone service, llm-link can also be used as a Rust library to access provider and model information directly in your applications.
Add Dependency
Add llm-link to your Cargo.toml:
```toml
[dependencies]
llm-link = "0.3.4"
```
Get Providers and Models
Use the library APIs to access supported providers and their models without starting a service:
```rust
// Import paths are assumptions; check the crate documentation for the exact re-exports.
use llm_link::{ModelsConfig, ProviderRegistry};
```
Library Features
- 🔍 Provider Discovery: List all available LLM providers
- 📋 Model Information: Get detailed model specifications for each provider
- ⚡ No Network Overhead: Direct access without HTTP requests
- 🛠️ Type Safe: Full Rust type safety and compile-time checks
- 🔄 Dynamic Loading: Automatically loads from embedded configuration
Use Cases
- Model Selection UI: Build dynamic interfaces for model selection
- Configuration Tools: Create setup utilities for different providers
- Monitoring Applications: Track available models and providers
- Integration Libraries: Build higher-level abstractions on top of llm-link
Example
Check out the library usage example for a complete demonstration of how to use llm-link as a library.
Run the example with:
📡 API Endpoints
LLM Link provides REST APIs for service management and model discovery:
Get Provider and Model Information
# Get all supported providers and their models
# Example response (shape is illustrative; see the API documentation for the exact schema):
{
  "providers": {
    "zhipu": {
      "models": ["glm-4-flash", "..."]
    }
  }
}
Query Specific Provider Models
# Get Zhipu models
# List all provider names
# Count models per provider
Hot-Reload Configuration
# Update API key without restart
# Switch provider
📚 Full API Documentation: See API_PROVIDERS_MODELS.md
🎯 Application Setup Guides
Codex CLI Integration
1. Start LLM Link:

   # Default: Zhipu GLM-4-Flash
   # Or use OpenAI GPT-4
   # Or use Anthropic Claude

2. Configure Codex CLI (`~/.codex/config.toml`); the table and key names below follow Codex CLI's provider configuration format and may differ across Codex versions:

   ```toml
   [model_providers.llm_link]
   name = "LLM Link"
   base_url = "http://localhost:8088/v1"
   env_key = "LLM_LINK_API_KEY"

   [profiles.llm_link]
   model = "glm-4-flash"  # Or gpt-4, claude-3-5-sonnet-20241022, etc.
   model_provider = "llm_link"
   ```

3. Use Codex CLI:
💡 Tip: You can switch providers without changing Codex configuration - just restart llm-link with different --provider and --model flags!
Zed Integration
1. Start LLM Link:

2. Configure Zed (`~/.config/zed/settings.json`):

3. Use in Zed: Open Zed and use the AI assistant features
Claude Code Integration
1. Start LLM Link:

2. Configure Claude Code:

   Create or edit the Claude Code settings file at `~/.claude/settings.json`.

   Configuration options:
   - `ANTHROPIC_AUTH_TOKEN`: your authentication token (can be any value when using LLM Link)
   - `ANTHROPIC_BASE_URL`: points to LLM Link's Claude Code endpoint (`http://localhost:8089`)
   - `API_TIMEOUT_MS`: request timeout in milliseconds (optional; default: 300000)

3. Use different LLM providers with Claude Code:

   You can use any supported LLM provider with Claude Code by configuring LLM Link:

   # Use OpenAI GPT-4 with Claude Code
   # Use Zhipu GLM models with Claude Code
   # Use Aliyun Qwen models with Claude Code
   # Use local Ollama models with Claude Code

   Note: The Claude Code settings file (`~/.claude/settings.json`) stays the same regardless of which LLM provider you use; LLM Link handles provider switching transparently.
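A minimal `~/.claude/settings.json` matching the options above might look like this; the `env` wrapper is an assumption based on Claude Code's settings format, so verify it against your Claude Code version:

```json
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "any-value",
    "ANTHROPIC_BASE_URL": "http://localhost:8089",
    "API_TIMEOUT_MS": "300000"
  }
}
```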
🔧 Advanced Usage
Runtime Configuration Updates
LLM Link provides APIs for runtime configuration management, enabling desktop applications and process managers to update provider settings without manual restarts.
Configuration Management APIs
# Get current configuration (port depends on how you started llm-link)
curl http://localhost:8088/api/config/current

# Get health status and instance ID (for restart verification)
curl http://localhost:8088/api/health

# Validate API key before applying (request body fields are illustrative)
curl -X POST http://localhost:8088/api/config/validate \
  -H "Content-Type: application/json" \
  -d '{"provider": "openai", "api_key": "sk-..."}'

# Prepare configuration for restart (request body fields are illustrative)
curl -X POST http://localhost:8088/api/config/update \
  -H "Content-Type: application/json" \
  -d '{"provider": "openai", "model": "gpt-4"}'
Integration Flow
When integrating LLM Link into desktop applications or process managers:
1. Validate Configuration: call `/api/config/validate` to verify the API key
2. Prepare Update: call `/api/config/update` to get restart parameters and the current `instance_id`
3. Restart Process: kill the current process and start it with the new environment variables
4. Verify Success: poll `/api/health` until `instance_id` changes and the configuration matches
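The verification step can be sketched as follows. The `fetch_health` callable and the polling parameters are illustrative; only the `instance_id` field comes from the flow above:

```python
import time

def wait_for_new_instance(fetch_health, old_instance_id: str,
                          timeout: float = 30.0, interval: float = 0.5) -> dict:
    """Poll the health endpoint (via fetch_health) until instance_id changes.

    fetch_health: callable returning the parsed /api/health JSON,
    or raising while the service is still restarting.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            health = fetch_health()
        except Exception:
            health = None  # service not up yet; keep polling
        if health and health.get("instance_id") != old_instance_id:
            return health  # the new instance is serving
        time.sleep(interval)
    raise TimeoutError("service did not restart with a new instance_id")

# Example with a stubbed health endpoint:
responses = iter([{"instance_id": "old"}, {"instance_id": "new"}])
print(wait_for_new_instance(lambda: next(responses), "old", interval=0))
# -> {'instance_id': 'new'}
```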
Restart Verification:

# After restart, verify the new instance (instance_id should have changed)
curl http://localhost:8088/api/health
Complete Documentation:
- 📖 Configuration Update API - Full API reference and examples
- 📖 Restart Verification Guide - TypeScript/Python integration examples
Multiple Applications Simultaneously
You can run multiple LLM Link instances for different applications:
# Terminal 1: Codex CLI (port 8088)
# Terminal 2: Zed (port 11434)
# Terminal 3: Claude Code (port 8089)
API Endpoints by Application
| Application | Base URL | Key Endpoints |
|---|---|---|
| Codex CLI | `http://localhost:8088` | `/v1/chat/completions`, `/v1/models` |
| Zed | `http://localhost:11434` | `/api/chat`, `/api/tags` |
| Claude Code | `http://localhost:8089` | `/anthropic/v1/messages`, `/anthropic/v1/models` |
🔥 Hot-Reload Configuration
New in v0.3.0: Update API keys and switch providers without restarting the service!
Perfect for desktop applications like z-agent where users need to change settings through a UI.
🚀 Quick Examples
# Check current configuration
# Update API key for OpenAI (no restart needed!)
# Switch to Anthropic instantly
# Validate API key before using
🔧 Hot-Reload API Endpoints
| Endpoint | Method | Description |
|---|---|---|
| `/api/config/current` | GET | Get current provider, model, and hot-reload status |
| `/api/config/update-key` | POST | Update API key for a specific provider |
| `/api/config/switch-provider` | POST | Switch to a different LLM provider |
| `/api/config/validate-key` | POST | Validate an API key and get the model list |
✨ Features
- 🔄 Zero Downtime: Configuration changes without service restart
- 🔒 Secure: API keys are safely masked in logs
- ✅ Validation: Test API keys before applying changes
- 🧵 Thread Safe: Concurrent requests handled safely
- 📋 Model Discovery: Get available models during validation
📚 Integration Examples
JavaScript/TypeScript (sketch using the REST endpoints directly; the base URL and response field names are illustrative):

```typescript
const baseUrl = "http://localhost:8088";

// Check if hot-reload is supported
const config = await fetch(`${baseUrl}/api/config/current`).then(r => r.json());
if (config.hot_reload_supported) {
  // Update the API key without restarting
  await fetch(`${baseUrl}/api/config/update-key`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ provider: "openai", api_key: "sk-..." }),
  });
}
```

Python (sketch; request and response field names are illustrative):

```python
import requests

base_url = "http://localhost:8088"

# Validate and update
resp = requests.post(f"{base_url}/api/config/validate-key",
                     json={"provider": "openai", "api_key": "sk-..."})
if resp.json().get("valid"):
    requests.post(f"{base_url}/api/config/update-key",
                  json={"provider": "openai", "api_key": "sk-..."})
```
📖 Complete Documentation: Hot-Reload API Guide
🛠️ CLI Reference
Application Commands
# List all supported applications
# Get application setup guide
# Start in application mode
CLI Options

Key options referenced in this README:

- `--provider` - select the LLM provider
- `--model` - select the model (or a Volcengine `ep-...` endpoint ID)
- `--api-key` - set the LLM Link authentication token
🧪 Testing Your Setup
Quick API Tests
# Test Codex CLI setup
# Test Zed setup
# Test Claude Code setup
# Test Claude Code API endpoint
Health Check
# Check service status
🔍 Troubleshooting
Common Issues
1. Missing Environment Variables

   # Check what's required for your app

2. Port Already in Use

   # Find what's using the port
   # Kill the process

3. Authentication Errors

   # Verify your API keys are set correctly

4. Claude Code Configuration Issues

   # Check Claude Code settings file
   # Verify the settings format is correct
   # Should contain: ANTHROPIC_AUTH_TOKEN, ANTHROPIC_BASE_URL
   # Test if LLM Link is accessible from Claude Code

5. Provider Switching Issues

   # When switching providers, make sure to:
   # 1. Stop the current LLM Link instance
   # 2. Set the correct API key for the new provider
   # 3. Start LLM Link with the new provider
   # Example: Switch from Anthropic to OpenAI
   # Stop current instance (Ctrl+C)
🏗️ Architecture
System Overview
External Clients (Codex CLI, Zed, Claude Code)
↓
API Layer (HTTP API endpoints)
• HTTP Request Parsing
• Format Conversion (OpenAI ↔ Ollama ↔ LLM)
• Authentication & Authorization
↓
Adapter Layer (Client-specific adaptations)
• Standard: No special handling
• Zed: Add images field
• OpenAI: finish_reason correction
↓
Service Layer (Business logic)
• Model Selection & Validation
• Default Model Fallback
↓
LLM Layer (LLM communication)
• LLM Connector Wrapper
• Stream Management
• Error Handling
↓
LLM Providers (OpenAI, Anthropic, Zhipu, Aliyun, Ollama)
Core Modules
1. API Layer (src/api/)
Handles different protocol HTTP requests and responses.
Modules:
- `openai.rs` - OpenAI API compatible interface
- `ollama.rs` - Ollama API compatible interface
- `anthropic.rs` - Anthropic API compatible interface (placeholder)
- `convert.rs` - Format conversion utilities
- `mod.rs` - Module exports and common handlers
Responsibilities:
- HTTP request parsing
- Format conversion (OpenAI ↔ Ollama ↔ LLM)
- Client type detection
- Authentication and authorization
- Response formatting
2. Adapter Layer (src/adapters.rs)
Handles client-specific response adaptations.
Adapter Types:
- `Standard` - Standard Ollama client
  - Preferred format: NDJSON
  - Special handling: none
- `Zed` - Zed editor
  - Preferred format: NDJSON
  - Special handling: add `images` field
- `OpenAI` - OpenAI API client (including Codex CLI)
  - Preferred format: SSE
  - Special handling: `finish_reason` correction
Responsibilities:
- Client type detection (via HTTP headers, User-Agent, configuration)
- Determine preferred streaming format (SSE/NDJSON/JSON)
- Apply client-specific response adaptations
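The difference between the two streaming formats can be shown with a small framing sketch. This is illustrative; the real adapters also apply the client-specific field changes listed above:

```python
import json

def frame_chunk(chunk: dict, fmt: str) -> str:
    """Frame one streaming chunk as SSE or NDJSON."""
    payload = json.dumps(chunk)
    if fmt == "sse":
        # Server-Sent Events: each event is "data: <json>" plus a blank line
        return f"data: {payload}\n\n"
    if fmt == "ndjson":
        # Newline-delimited JSON: one JSON object per line
        return payload + "\n"
    raise ValueError(f"unknown format: {fmt}")

chunk = {"message": {"content": "Hello"}}
print(frame_chunk(chunk, "sse"), end="")     # data: {...}\n\n
print(frame_chunk(chunk, "ndjson"), end="")  # {...}\n
```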
3. Service Layer (src/service.rs)
Business logic layer between API and LLM layers.
Responsibilities:
- Business logic processing
- Model selection and validation
- Default model fallback
- Delegating to LLM layer methods
4. Normalizer Layer (src/normalizer/)
Protocol normalization and LLM communication layer, encapsulates interaction with LLM providers.
Modules:
- `mod.rs` - Unified client struct and constructor
- `types.rs` - Type definitions (Model, Response, Usage)
- `chat.rs` - Non-streaming chat
- `stream.rs` - Streaming chat
- `models.rs` - Model management
Responsibilities:
- Encapsulate llm-connector library
- Normalize requests/responses across different provider protocols
- Stream response management
- Error handling
5. Configuration (src/settings.rs)
Application configuration management.
Configuration Structure:
Settings
6. Application Support (src/apps/)
Built-in application configuration generators.
Supported Applications:
- Codex CLI - OpenAI API mode
- Zed - Ollama API mode
- Claude Code - Anthropic API mode
Features:
- Zero-configuration startup
- Application-specific optimizations
- Automatic protocol selection
Request Flow
1. External Client Request
↓
2. API Layer (openai/ollama endpoints)
├─ HTTP Request Parsing
├─ Format Conversion (API → LLM)
└─ Client Detection
↓
3. Service Layer
├─ Business Logic
└─ Model Selection
↓
4. Normalizer Layer
├─ LLM Connector Wrapper
└─ Request Formatting
↓
5. LLM Provider
Response Flow
1. LLM Provider Response
↓
2. Normalizer Layer
├─ Stream Processing
└─ Error Handling
↓
3. Service Layer
└─ Business Logic
↓
4. Adapter Layer
└─ Client-specific Adaptations
• Zed: Add images field
• OpenAI: finish_reason correction
• Standard: No special handling
↓
5. API Layer
├─ Format Conversion (LLM → API)
└─ HTTP Response Formatting
↓
6. External Client
Design Principles
1. Client Auto-Detection
Detection Priority:

1. Force adapter setting (`force_adapter`)
2. Explicit client identifier (`x-client` header)
3. User-Agent auto-detection
4. Default adapter setting

Supported Client Types:

- `Standard` - Standard Ollama client
- `Zed` - Zed editor
- `OpenAI` - OpenAI API client (including Codex CLI)
Detection Example:
// 1. Configuration force
force_adapter: "zed"
// 2. Header specification
x-client: zed
// 3. User-Agent detection
User-Agent: Zed/1.0.0 → Zed
User-Agent: OpenAI/1.0 → OpenAI
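The detection priority can be sketched as follows. This is an illustrative helper, not the actual `src/adapters.rs` API:

```python
def detect_adapter(force_adapter=None, x_client=None, user_agent=None,
                   default="standard"):
    """Resolve the client adapter following the detection priority above."""
    known = {"standard", "zed", "openai"}
    # 1. Forced adapter from configuration
    if force_adapter in known:
        return force_adapter
    # 2. Explicit x-client header
    if x_client in known:
        return x_client
    # 3. User-Agent sniffing
    ua = (user_agent or "").lower()
    if "zed" in ua:
        return "zed"
    if "openai" in ua:
        return "openai"
    # 4. Configured default adapter
    return default

print(detect_adapter(user_agent="Zed/1.0.0"))                  # -> zed
print(detect_adapter(force_adapter="openai", x_client="zed"))  # -> openai
```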
2. Application-First Design
Built-in configurations for popular applications, zero manual configuration needed.
Benefits:
- One-command startup
- Automatic protocol selection
- Optimized for each application
- Helpful error messages
3. Asynchronous Processing
Uses Tokio async runtime for high concurrency support.
Performance Considerations
- Streaming Response: Real-time data transmission
- Zero-Copy: Minimize data copying
- Async Processing: High concurrency support
🚀 Development
Building from Source
# Clone the repository
# Build for development
# Build for production
# Run tests
Project Structure
llm-link/
├── src/
│ ├── main.rs # Application entry point
│ ├── settings.rs # Configuration definitions
│ ├── service.rs # Business logic layer
│ ├── adapters.rs # Client adapters
│ ├── api/ # HTTP API layer
│ │ ├── mod.rs # AppState, common endpoints
│ │ ├── convert.rs # Format conversion utilities
│ │ ├── ollama.rs # Ollama API endpoints
│ │ ├── openai.rs # OpenAI API endpoints
│ │ └── anthropic.rs # Anthropic API endpoints
│ ├── normalizer/ # Protocol normalization and LLM communication layer
│ │ ├── mod.rs # Unified client struct
│ │ ├── types.rs # Type definitions
│ │ ├── chat.rs # Non-streaming chat
│ │ ├── stream.rs # Streaming chat
│ │ └── models.rs # Model management
│ ├── apps/ # Application config generators
│ └── models/ # Model configurations
├── docs/ # Documentation
├── tests/ # Test scripts
├── Cargo.toml # Rust dependencies
├── README.md # This file
└── CHANGELOG.md # Version history
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
📚 Documentation
- 📖 Documentation Center - Complete documentation index
- 🚀 Quick Start - Quick getting started guide
- 🔌 Application Integration - Zed, Claude Code, Codex CLI integration
- ⚙️ Configuration Guide - Detailed configuration instructions
- 📡 API Documentation - API interface documentation
- 📋 Changelog - Version history and updates
📄 License
MIT License
🤝 Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
⭐ Support
If you find LLM Link helpful, please consider giving it a star on GitHub!
Made with ❤️ for the AI coding community