# LLM Link

A user-friendly LLM proxy service with built-in support for popular AI coding tools.
LLM Link provides zero-configuration access to LLM providers through multiple API formats, with optimized built-in support for Codex CLI, Zed, and Claude Code.
## Key Features

- **Application-Oriented**: Built-in configurations for popular AI coding tools
- **Zero Configuration**: One-command startup for common use cases
- **Multi-Protocol**: Simultaneous OpenAI, Ollama, and Anthropic API support
- **9 LLM Providers**: OpenAI, Anthropic, Zhipu, Aliyun, Volcengine, Tencent, Longcat, Moonshot, Ollama
- **Dynamic Model Discovery**: REST API to query all supported providers and models
- **Hot-Reload Configuration**: Update API keys and switch providers without restart
- **CLI-First**: Simple command-line interface with helpful guidance
- **Smart Adaptation**: Automatic client detection and optimization
- **Production Ready**: Built with Rust for performance and reliability
## Supported Applications
| Application | Protocol | Port | Authentication | Status |
|---|---|---|---|---|
| Codex CLI | OpenAI API | 8088 | Bearer Token | ✅ Ready |
| Zed | Ollama API | 11434 | None | ✅ Ready |
| Claude Code | Anthropic API | 8089 | API Key | ✅ Ready |
## Quick Start

### Installation

**Option 1: Install from crates.io (Recommended)**

**Option 2: Build from source**
### Application Mode (Recommended)

**Step 1: Set up environment variables**

```bash
# Required for all applications (choose your provider)
export ZHIPU_API_KEY="your-zhipu-api-key"
# OR
export OPENAI_API_KEY="sk-xxx"
# OR
export ANTHROPIC_API_KEY="sk-ant-xxx"

# Required for Codex CLI (choose one method)
export LLM_LINK_API_KEY="your-auth-token"
# OR use CLI parameter: --api-key "your-auth-token"
```
**Step 2: Start for your application**

```bash
# For Codex CLI
# For Zed
# For Claude Code
```
### Get Help and Information

```bash
# List all supported applications
# Get detailed setup guide for specific application
# Show all CLI options
```
### Protocol Mode (Advanced)
For custom protocol combinations:
```bash
# Support multiple protocols simultaneously
```
### Provider Override
Switch between different LLM providers without changing configuration:
```bash
# (assuming the binary is invoked as `llm-link`)
# Use OpenAI GPT-4 instead of default
llm-link --provider openai --model gpt-4

# Use Anthropic Claude
llm-link --provider anthropic --model claude-3-5-sonnet-20241022

# Use Ollama local models
llm-link --provider ollama --model llama2

# Use Zhipu GLM models
llm-link --provider zhipu --model glm-4-flash

# Use Aliyun Qwen models
llm-link --provider aliyun --model qwen-max
```
**Supported Providers:**

- `openai` - OpenAI GPT models (default: `gpt-4`)
- `anthropic` - Anthropic Claude models (default: `claude-3-5-sonnet-20241022`)
- `zhipu` - Zhipu GLM models (default: `glm-4-flash`)
- `aliyun` - Aliyun Qwen models (default: `qwen-max`)
- `volcengine` - Volcengine Doubao models (default: `doubao-pro-32k`)
- `tencent` - Tencent Hunyuan models (default: `hunyuan-lite`)
- `longcat` - LongCat models (default: `LongCat-Flash-Chat`)
- `moonshot` - Moonshot Kimi models (default: `kimi-k2-turbo-preview`)
- `ollama` - Ollama local models (default: `llama2`)
**Discover All Models:**

```bash
# Query all supported providers and their models via API
```
See API Documentation for details.
## Environment Variables

### Required Variables
```bash
# LLM Provider API Keys (choose based on your provider)
export ZHIPU_API_KEY="your-zhipu-api-key"      # For Zhipu GLM models
export OPENAI_API_KEY="sk-xxx"                 # For OpenAI GPT models
export ANTHROPIC_API_KEY="sk-ant-xxx"          # For Anthropic Claude models
export ALIYUN_API_KEY="your-aliyun-key"        # For Aliyun Qwen models

# LLM Link Authentication (required for Codex CLI)
export LLM_LINK_API_KEY="your-auth-token"      # Bearer token for API access
```
### Optional Variables

```bash
# Ollama Configuration
# Ollama server URL

# Logging
# Log level: debug, info, warn, error

# Rust logging (for development)
export RUST_LOG=debug
```
### Using a .env File

Create a `.env` file in the project root:
```bash
# .env
ZHIPU_API_KEY=your-zhipu-api-key
LLM_LINK_API_KEY=your-auth-token
OPENAI_API_KEY=sk-xxx
ANTHROPIC_API_KEY=sk-ant-xxx
ALIYUN_API_KEY=your-aliyun-key
```
**Note**: The `.env` file is ignored by git for security. Never commit API keys to version control.
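For scripts that sit alongside LLM Link, a minimal loader for such a file can look like the following. This is a sketch; real projects would typically use the `python-dotenv` package instead:

```python
import os

def load_dotenv(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping comments and blanks,
    and export them into the process environment."""
    loaded = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            loaded[key.strip()] = value.strip()
    os.environ.update(loaded)
    return loaded
```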
## API Endpoints
LLM Link provides REST APIs for service management and model discovery:
### Get Provider and Model Information

```bash
# Get all supported providers and their models
```
### Query Specific Provider Models

```bash
# Get Zhipu models
# List all provider names
# Count models per provider
```
### Hot-Reload Configuration

```bash
# Update API key without restart
curl -X POST http://localhost:8088/api/config/update-key

# Switch provider
curl -X POST http://localhost:8088/api/config/switch-provider
```
**Full API Documentation**: See API_PROVIDERS_MODELS.md
## Application Setup Guides

### Codex CLI Integration
1. **Start LLM Link:**

   ```bash
   # Default: Zhipu GLM-4-Flash
   # Or use OpenAI GPT-4
   # Or use Anthropic Claude
   ```

2. **Configure Codex CLI** (`~/.codex/config.toml`):

   ```toml
   # Table and key names below are reconstructed from Codex CLI's config format
   [model_providers.llm_link]
   name = "LLM Link"
   base_url = "http://localhost:8088/v1"
   env_key = "LLM_LINK_API_KEY"

   [profiles.llm_link]
   model = "glm-4-flash"  # Or gpt-4, claude-3-5-sonnet-20241022, etc.
   model_provider = "llm_link"
   ```

3. **Use Codex CLI.**
**Tip**: You can switch providers without changing the Codex configuration; just restart llm-link with different `--provider` and `--model` flags!
### Zed Integration

1. **Start LLM Link.**

2. **Configure Zed** (`~/.config/zed/settings.json`).

3. **Use in Zed**: Open Zed and use the AI assistant features.
### Claude Code Integration

1. **Start LLM Link.**

2. **Configure Claude Code**: Create or edit the Claude Code settings file at `~/.claude/settings.json`.

   **Configuration Options:**
   - `ANTHROPIC_AUTH_TOKEN`: Your authentication token (can be any value when using LLM Link)
   - `ANTHROPIC_BASE_URL`: Point to LLM Link's Claude Code endpoint (`http://localhost:8089`)
   - `API_TIMEOUT_MS`: Request timeout in milliseconds (optional, default: 300000)

3. **Using Different LLM Providers with Claude Code:**

   You can use any supported LLM provider with Claude Code by configuring LLM Link:

   ```bash
   # Use OpenAI GPT-4 with Claude Code
   # Use Zhipu GLM models with Claude Code
   # Use Aliyun Qwen models with Claude Code
   # Use local Ollama models with Claude Code
   ```

   **Note**: The Claude Code settings file (`~/.claude/settings.json`) remains the same regardless of which LLM provider you use. LLM Link handles the provider switching transparently.
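Putting the configuration options above together, the settings file might look like this. This is a sketch: the `env`-block layout is an assumption about Claude Code's settings format, and the token value is a placeholder:

```json
{
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "any-value",
    "ANTHROPIC_BASE_URL": "http://localhost:8089",
    "API_TIMEOUT_MS": "300000"
  }
}
```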
## Advanced Usage

### Runtime Configuration Updates
LLM Link provides APIs for runtime configuration management, enabling desktop applications and process managers to update provider settings without manual restarts.
#### Configuration Management APIs
```bash
# Get current configuration
curl http://localhost:8088/api/config/current

# Get health status and instance ID (for restart verification)
curl http://localhost:8088/api/health

# Validate API key before applying
curl -X POST http://localhost:8088/api/config/validate

# Prepare configuration for restart
curl -X POST http://localhost:8088/api/config/update
```
#### Integration Flow
When integrating LLM Link into desktop applications or process managers:
1. **Validate Configuration**: Call `/api/config/validate` to verify the API key
2. **Prepare Update**: Call `/api/config/update` to get restart parameters and the current `instance_id`
3. **Restart Process**: Kill the current process and start it with the new environment variables
4. **Verify Success**: Poll `/api/health` until `instance_id` changes and the configuration matches
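The verification step can be sketched as a small polling loop. Here `get_health` stands in for an HTTP GET against `/api/health`, and the `instance_id` field name comes from the flow above; the rest is an illustrative sketch, not LLM Link's code:

```python
import time

def wait_for_new_instance(get_health, old_instance_id, timeout=30.0, interval=0.5):
    """Poll the health endpoint until a new instance_id appears.

    get_health: callable returning the parsed /api/health JSON as a dict.
    Raises TimeoutError if the instance never changes within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            health = get_health()
            if health.get("instance_id") not in (None, old_instance_id):
                return health  # new instance is up
        except Exception:
            pass  # service may be mid-restart; keep polling
        time.sleep(interval)
    raise TimeoutError("service did not come back with a new instance_id")
```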
**Restart Verification:**

```bash
# After restart, verify the new instance
curl http://localhost:8088/api/health
```
**Complete Documentation:**

- Configuration Update API - Full API reference and examples
- Restart Verification Guide - TypeScript/Python integration examples
### Multiple Applications Simultaneously
You can run multiple LLM Link instances for different applications:
```bash
# Terminal 1: Codex CLI (port 8088)
# Terminal 2: Zed (port 11434)
# Terminal 3: Claude Code (port 8089)
```
### API Endpoints by Application

| Application | Base URL | Key Endpoints |
|---|---|---|
| Codex CLI | `http://localhost:8088` | `/v1/chat/completions`, `/v1/models` |
| Zed | `http://localhost:11434` | `/api/chat`, `/api/tags` |
| Claude Code | `http://localhost:8089` | `/anthropic/v1/messages`, `/anthropic/v1/models` |
## Hot-Reload Configuration

**New in v0.3.0**: Update API keys and switch providers without restarting the service!
Perfect for desktop applications like z-agent where users need to change settings through a UI.
### Quick Examples

```bash
# Check current configuration
curl http://localhost:8088/api/config/current

# Update API key for OpenAI (no restart needed!)
curl -X POST http://localhost:8088/api/config/update-key

# Switch to Anthropic instantly
curl -X POST http://localhost:8088/api/config/switch-provider

# Validate API key before using
curl -X POST http://localhost:8088/api/config/validate-key
```
### Hot-Reload API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/api/config/current` | GET | Get current provider, model, and hot-reload status |
| `/api/config/update-key` | POST | Update API key for specific provider |
| `/api/config/switch-provider` | POST | Switch to different LLM provider |
| `/api/config/validate-key` | POST | Validate API key and get model list |
### Features

- **Zero Downtime**: Configuration changes without service restart
- **Secure**: API keys are safely masked in logs
- **Validation**: Test API keys before applying changes
- **Thread Safe**: Concurrent requests handled safely
- **Model Discovery**: Get available models during validation
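The log-masking behavior can be illustrated with a helper like this. It is a sketch of the idea, not LLM Link's actual implementation:

```python
def mask_api_key(key: str, visible: int = 4) -> str:
    """Keep only the first and last few characters of a key for logging."""
    if len(key) <= visible * 2:
        return "*" * len(key)
    return f"{key[:visible]}{'*' * (len(key) - visible * 2)}{key[-visible:]}"
```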
### Integration Examples

**JavaScript/TypeScript:**

```typescript
// Sketch: check whether the running instance supports hot-reload.
// The response field names are assumptions.
const res = await fetch("http://localhost:8088/api/config/current");
const config = await res.json();
if (config.hot_reload_supported) {
  console.log(`Provider: ${config.provider}, model: ${config.model}`);
}
```

**Python:**

```python
import requests  # request body field names below are assumptions

# Validate and update
r = requests.post("http://localhost:8088/api/config/validate-key",
                  json={"provider": "openai", "api_key": "sk-xxx"})
if r.ok:
    requests.post("http://localhost:8088/api/config/update-key",
                  json={"provider": "openai", "api_key": "sk-xxx"})
```
**Complete Documentation**: Hot-Reload API Guide
## CLI Reference

### Application Commands

```bash
# List all supported applications
# Get application setup guide
# Start in application mode
```
### CLI Options

Run `llm-link --help` for the full list of options; the flags used in this README are `--provider`, `--model`, and `--api-key`.
## Testing Your Setup

### Quick API Tests

```bash
# Test Codex CLI setup
curl http://localhost:8088/v1/models

# Test Zed setup
curl http://localhost:11434/api/tags

# Test Claude Code setup
curl http://localhost:8089/anthropic/v1/models

# Test Claude Code API endpoint
curl -X POST http://localhost:8089/anthropic/v1/messages
```
### Health Check

```bash
# Check service status
curl http://localhost:8088/api/health
```
## Troubleshooting

### Common Issues
1. **Missing Environment Variables**

   ```bash
   # Check what's required for your app
   ```

2. **Port Already in Use**

   ```bash
   # Find what's using the port
   lsof -i :8088
   # Kill the process
   kill <PID>
   ```

3. **Authentication Errors**

   ```bash
   # Verify your API keys are set correctly
   echo $ZHIPU_API_KEY
   ```

4. **Claude Code Configuration Issues**

   ```bash
   # Check Claude Code settings file
   cat ~/.claude/settings.json
   # Verify the settings format is correct
   # Should contain: ANTHROPIC_AUTH_TOKEN, ANTHROPIC_BASE_URL
   # Test if LLM Link is accessible from Claude Code
   curl http://localhost:8089/anthropic/v1/models
   ```

5. **Provider Switching Issues**

   ```bash
   # When switching providers, make sure to:
   # 1. Stop the current LLM Link instance
   # 2. Set the correct API key for the new provider
   # 3. Start LLM Link with the new provider

   # Example: Switch from Anthropic to OpenAI
   # Stop current instance (Ctrl+C)
   export OPENAI_API_KEY="sk-xxx"
   ```
## Architecture

### System Overview
```
External Clients (Codex CLI, Zed, Claude Code)
        ↓
API Layer (HTTP API endpoints)
  • HTTP Request Parsing
  • Format Conversion (OpenAI / Ollama ↔ LLM)
  • Authentication & Authorization
        ↓
Adapter Layer (Client-specific adaptations)
  • Standard: No special handling
  • Zed: Add images field
  • OpenAI: finish_reason correction
        ↓
Service Layer (Business logic)
  • Model Selection & Validation
  • Default Model Fallback
        ↓
LLM Layer (LLM communication)
  • LLM Connector Wrapper
  • Stream Management
  • Error Handling
        ↓
LLM Providers (OpenAI, Anthropic, Zhipu, Aliyun, Ollama)
```
### Core Modules

#### 1. API Layer (`src/api/`)
Handles different protocol HTTP requests and responses.
**Modules:**

- `openai.rs` - OpenAI API compatible interface
- `ollama.rs` - Ollama API compatible interface
- `anthropic.rs` - Anthropic API compatible interface (placeholder)
- `convert.rs` - Format conversion utilities
- `mod.rs` - Module exports and common handlers
Responsibilities:
- HTTP request parsing
- Format conversion (OpenAI / Ollama ↔ LLM)
- Client type detection
- Authentication and authorization
- Response formatting
#### 2. Adapter Layer (`src/adapters.rs`)
Handles client-specific response adaptations.
**Adapter Types:**

- `Standard` - Standard Ollama client
  - Preferred format: NDJSON
  - Special handling: None
- `Zed` - Zed editor
  - Preferred format: NDJSON
  - Special handling: Add `images` field
- `OpenAI` - OpenAI API client (including Codex CLI)
  - Preferred format: SSE
  - Special handling: `finish_reason` correction
Responsibilities:
- Client type detection (via HTTP headers, User-Agent, configuration)
- Determine preferred streaming format (SSE/NDJSON/JSON)
- Apply client-specific response adaptations
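The client-specific adaptations amount to a single post-processing step over each response chunk. This Python sketch mirrors the adapter descriptions above (the real implementation is Rust, and the chunk shape here is simplified):

```python
def adapt_chunk(chunk: dict, client: str) -> dict:
    """Apply client-specific response adaptations (sketch)."""
    out = dict(chunk)
    if client == "zed":
        # Zed expects an images field on each message chunk.
        out.setdefault("images", None)
    elif client == "openai":
        # Normalize a missing finish_reason on the final chunk.
        if out.get("done") and "finish_reason" not in out:
            out["finish_reason"] = "stop"
    # "standard": no special handling
    return out
```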
#### 3. Service Layer (`src/service.rs`)
Business logic layer between API and LLM layers.
Responsibilities:
- Business logic processing
- Model selection and validation
- Default model fallback
- Delegating to LLM layer methods
#### 4. LLM Layer (`src/llm/`)
LLM communication layer, encapsulates interaction with LLM providers.
**Modules:**

- `mod.rs` - Client struct and constructor
- `types.rs` - Type definitions (Model, Response, Usage)
- `chat.rs` - Non-streaming chat
- `stream.rs` - Streaming chat
- `models.rs` - Model management
Responsibilities:
- Encapsulate llm-connector library
- Unified request/response interface
- Stream response management
- Error handling
#### 5. Configuration (`src/settings.rs`)
Application configuration management.
**Configuration Structure:** the top-level `Settings` struct.
#### 6. Application Support (`src/apps/`)
Built-in application configuration generators.
Supported Applications:
- Codex CLI - OpenAI API mode
- Zed - Ollama API mode
- Claude Code - Anthropic API mode
Features:
- Zero-configuration startup
- Application-specific optimizations
- Automatic protocol selection
### Request Flow
```
1. External Client Request
        ↓
2. API Layer (openai/ollama endpoints)
   ├─ HTTP Request Parsing
   ├─ Format Conversion (API → LLM)
   └─ Client Detection
        ↓
3. Service Layer
   ├─ Business Logic
   └─ Model Selection
        ↓
4. LLM Layer
   ├─ LLM Connector Wrapper
   └─ Request Formatting
        ↓
5. LLM Provider
```
### Response Flow
```
1. LLM Provider Response
        ↓
2. LLM Layer
   ├─ Stream Processing
   └─ Error Handling
        ↓
3. Service Layer
   └─ Business Logic
        ↓
4. Adapter Layer
   └─ Client-specific Adaptations
      • Zed: Add images field
      • OpenAI: finish_reason correction
      • Standard: No special handling
        ↓
5. API Layer
   ├─ Format Conversion (LLM → API)
   └─ HTTP Response Formatting
        ↓
6. External Client
```
### Design Principles

#### 1. Client Auto-Detection
**Detection Priority:**

1. Force adapter setting (`force_adapter`)
2. Explicit client identifier (`x-client` header)
3. User-Agent auto-detection
4. Default adapter setting

**Supported Client Types:**

- `Standard` - Standard Ollama client
- `Zed` - Zed editor
- `OpenAI` - OpenAI API client (including Codex CLI)
**Detection Example:**

```
// 1. Configuration force
force_adapter: "zed"

// 2. Header specification
x-client: zed

// 3. User-Agent detection
User-Agent: Zed/1.0.0 → Zed
User-Agent: OpenAI/1.0 → OpenAI
```
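The priority order above can be rendered as a simple cascade. This Python sketch follows the documented rules; the actual implementation lives in the Rust adapter layer:

```python
def detect_client(force_adapter=None, headers=None, default="standard"):
    """Resolve the client adapter using the documented priority order."""
    headers = {k.lower(): v for k, v in (headers or {}).items()}
    if force_adapter:                   # 1. configuration force
        return force_adapter
    if "x-client" in headers:           # 2. explicit x-client header
        return headers["x-client"]
    ua = headers.get("user-agent", "")  # 3. User-Agent auto-detection
    if ua.startswith("Zed/"):
        return "zed"
    if ua.startswith("OpenAI/"):
        return "openai"
    return default                      # 4. default adapter setting
```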
#### 2. Application-First Design
Built-in configurations for popular applications, zero manual configuration needed.
Benefits:
- One-command startup
- Automatic protocol selection
- Optimized for each application
- Helpful error messages
#### 3. Asynchronous Processing
Uses Tokio async runtime for high concurrency support.
### Performance Considerations
- Streaming Response: Real-time data transmission
- Zero-Copy: Minimize data copying
- Async Processing: High concurrency support
## Development

### Building from Source
```bash
# Clone the repository
git clone <repository-url>
cd llm-link

# Build for development
cargo build

# Build for production
cargo build --release

# Run tests
cargo test
```
### Project Structure

```
llm-link/
├── src/
│   ├── main.rs          # Application entry point
│   ├── settings.rs      # Configuration definitions
│   ├── service.rs       # Business logic layer
│   ├── adapters.rs      # Client adapters
│   ├── api/             # HTTP API layer
│   │   ├── mod.rs       # AppState, common endpoints
│   │   ├── convert.rs   # Format conversion utilities
│   │   ├── ollama.rs    # Ollama API endpoints
│   │   ├── openai.rs    # OpenAI API endpoints
│   │   └── anthropic.rs # Anthropic API endpoints
│   ├── llm/             # LLM communication layer
│   │   ├── mod.rs       # Client struct
│   │   ├── types.rs     # Type definitions
│   │   ├── chat.rs      # Non-streaming chat
│   │   ├── stream.rs    # Streaming chat
│   │   └── models.rs    # Model management
│   ├── apps/            # Application config generators
│   └── models/          # Model configurations
├── docs/                # Documentation
├── tests/               # Test scripts
├── Cargo.toml           # Rust dependencies
├── README.md            # This file
└── CHANGELOG.md         # Version history
```
### Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
## Documentation

- Quick Start Guide - Fast reference for common use cases (in Chinese)
- Changelog - Version history and updates
## License
MIT License
## Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
## Support
If you find LLM Link helpful, please consider giving it a star on GitHub!
Made with ❤️ for the AI coding community.