<div align="center">
<img src="assets/img/banner.png" alt="Kodegen AI Banner" width="100%" />
</div>
# KODEGEN.ᴀɪ Candle Agent
**Memory-efficient, blazing-fast MCP tools for code generation agents.**
A high-performance [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) server that provides cognitive memory capabilities for AI agents. Built with Rust and the Candle ML framework, it delivers semantic memory storage, retrieval, and quantum-inspired routing for intelligent code generation workflows.
## Features
- **🧠 Cognitive Memory System** - Store and retrieve code context with semantic understanding
- **⚡ High-Performance** - Rust + SIMD optimizations for blazing-fast embeddings and retrieval
- **🔄 Async Operations** - Non-blocking memory ingestion with progress tracking
- **🎯 Multiple Retrieval Strategies** - Semantic, temporal, and hybrid search modes
- **🌊 Quantum-Inspired Routing** - Advanced memory importance scoring with entanglement
- **📊 Vector Storage** - Support for FAISS, HNSW, and instant-distance backends
- **💾 Persistent Storage** - SurrealDB with embedded SurrealKV for ACID transactions
- **🚀 Hardware Acceleration** - CUDA, Metal, MKL, and Accelerate support
- **🔧 MCP Compatible** - Works with Claude Desktop, Cline, and other MCP clients
## Quick Start
### Prerequisites
- Rust nightly toolchain (automatically configured via `rust-toolchain.toml`)
- For hardware acceleration (optional):
  - CUDA 12+ (NVIDIA GPUs)
  - Metal (Apple Silicon)
  - MKL (Intel CPUs)
### Installation
```bash
# Clone the repository
git clone https://github.com/cyrup-ai/kodegen-candle-agent.git
cd kodegen-candle-agent
# Build with default features
cargo build --release
# Or build with hardware acceleration
cargo build --release --features metal # macOS
cargo build --release --features cuda # NVIDIA GPU
```
### Running the Server
```bash
# Start the MCP server (HTTP transport)
cargo run --release
# The server will start on http://localhost:3000 by default
```
### Configuration for MCP Clients
Add to your MCP client configuration (e.g., Claude Desktop's `claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "kodegen-candle-agent": {
      "command": "cargo",
      "args": ["run", "--release"],
      "cwd": "/path/to/kodegen-candle-agent"
    }
  }
}
```
## Usage
The server provides four MCP tools for memory operations:
### 1. Memorize Content
Ingest files or directories into a named memory library:
```json
{
  "tool": "memory_memorize",
  "arguments": {
    "input": "/path/to/your/codebase",
    "library": "my-project"
  }
}
```
Returns a `session_id` for tracking the async operation.
### 2. Check Memorization Status
Poll the progress of a memorization task:
```json
{
  "tool": "memory_check_memorize_status",
  "arguments": {
    "session_id": "your-session-id"
  }
}
```
Returns status (`IN_PROGRESS`, `COMPLETED`, `FAILED`) with progress details.
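A client typically calls this tool in a loop until the task reaches a terminal state. The following is a minimal sketch of such a loop, not the crate's own client code: `MemorizeStatus`, `poll_until_done`, and the `check` closure (a stand-in for whatever issues the `memory_check_memorize_status` call and extracts the status string) are hypothetical names introduced for illustration.

```rust
use std::thread::sleep;
use std::time::Duration;

/// The three status values documented above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MemorizeStatus {
    InProgress,
    Completed,
    Failed,
}

impl MemorizeStatus {
    /// Parse the status string returned by the tool.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "IN_PROGRESS" => Some(Self::InProgress),
            "COMPLETED" => Some(Self::Completed),
            "FAILED" => Some(Self::Failed),
            _ => None,
        }
    }
}

/// Poll until the task completes or fails, or until `max_attempts` is reached.
/// `check` is a hypothetical stand-in for the actual MCP tool call.
pub fn poll_until_done<F>(
    mut check: F,
    interval: Duration,
    max_attempts: u32,
) -> Result<MemorizeStatus, String>
where
    F: FnMut() -> Result<String, String>,
{
    for _ in 0..max_attempts {
        let raw = check()?;
        match MemorizeStatus::parse(&raw) {
            // Not finished yet: wait before asking again.
            Some(MemorizeStatus::InProgress) => sleep(interval),
            // COMPLETED or FAILED: stop polling.
            Some(terminal) => return Ok(terminal),
            None => return Err(format!("unexpected status: {}", raw)),
        }
    }
    Err(format!("still in progress after {} attempts", max_attempts))
}
```

Capping the number of attempts keeps a stalled ingestion from blocking the caller indefinitely.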
### 3. Recall Memories
Search for relevant memories using semantic similarity:
```json
{
  "tool": "memory_recall",
  "arguments": {
    "query": "authentication logic",
    "library": "my-project",
    "top_k": 5
  }
}
```
Returns ranked memories with similarity scores and importance metrics.
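To illustrate what ranking by semantic similarity means, here is a brute-force sketch that scores stored embeddings against a query embedding with cosine similarity and keeps the best `k`. It is an assumption-laden illustration only: the server's real retrieval goes through its vector backends and also factors in importance metrics, and `cosine_similarity` and `recall_top_k` are hypothetical helper names.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Return `(index, score)` pairs for the `k` memories most similar to `query`,
/// ordered from highest to lowest score.
pub fn recall_top_k(query: &[f32], memories: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = memories
        .iter()
        .enumerate()
        .map(|(i, m)| (i, cosine_similarity(query, m)))
        .collect();
    // Sort descending by score; `total_cmp` gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A linear scan like this is only practical for small collections; approximate indexes such as HNSW exist to avoid comparing the query against every stored vector.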
### 4. List Memory Libraries
Enumerate all available memory libraries:
```json
{
  "tool": "memory_list_libraries",
  "arguments": {}
}
```
## Architecture
```
┌─────────────────────────────────────────────────────────┐
│                     MCP Tools Layer                     │
│    (memorize, recall, check_status, list_libraries)     │
└─────────────────────┬───────────────────────────────────┘
                      │
┌─────────────────────▼───────────────────────────────────┐
│                 Memory Coordinator Pool                 │
│          (Per-library coordinator management)           │
└─────────────────────┬───────────────────────────────────┘
                      │
        ┌─────────────┼─────────────┐
        │             │             │
┌───────▼─────┐  ┌────▼────┐ ┌──────▼──────┐
│  Graph DB   │  │ Vector  │ │  Cognitive  │
│ (SurrealDB) │  │ Storage │ │   Workers   │
└─────────────┘  └─────────┘ └─────────────┘
```
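The pool layer in the diagram manages one coordinator per library. One way to picture this is a map from library name to a shared coordinator that is created on first access. The sketch below is a simplified, synchronous illustration under that assumption; `Coordinator` and `CoordinatorPool` here are hypothetical types, not the crate's actual ones.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a per-library memory coordinator.
#[derive(Debug)]
pub struct Coordinator {
    pub library: String,
}

/// Hypothetical pool that hands out one shared coordinator per library name.
#[derive(Default)]
pub struct CoordinatorPool {
    coordinators: Mutex<HashMap<String, Arc<Coordinator>>>,
}

impl CoordinatorPool {
    /// Return the coordinator for `library`, creating it on first access.
    pub fn get_or_create(&self, library: &str) -> Arc<Coordinator> {
        let mut map = self.coordinators.lock().expect("pool mutex poisoned");
        map.entry(library.to_string())
            .or_insert_with(|| {
                Arc::new(Coordinator {
                    library: library.to_string(),
                })
            })
            .clone()
    }

    /// Names of all libraries that currently have a coordinator, sorted.
    pub fn libraries(&self) -> Vec<String> {
        let map = self.coordinators.lock().expect("pool mutex poisoned");
        let mut names: Vec<String> = map.keys().cloned().collect();
        names.sort();
        names
    }
}
```

Repeated requests for the same library share one `Arc`, so the expensive per-library state is only built once.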
### Key Components
- **Memory Coordinator** - Orchestrates operations across graph DB, vector store, and cognitive workers
- **Quantum Routing** - Uses quantum-inspired algorithms for intelligent memory importance scoring
- **Committee Evaluation** - Multiple evaluators vote on memory relevance and importance
- **Background Workers** - Async processing for embeddings, indexing, and memory decay
- **Transaction Manager** - ACID guarantees for memory operations
## Development
### Building with Features
```bash
# Full cognitive capabilities
cargo build --features full-cognitive
# Specific vector backends
cargo build --features faiss-vector
cargo build --features hnsw-vector
# API server (HTTP endpoint for memory operations)
cargo build --features api
# Development mode (debug + desktop features)
cargo build --features dev
```
### Running Tests
```bash
# Run all tests
cargo test
# Run a specific integration test target
cargo test --test memory
# Run with output
cargo test -- --nocapture
# Run a single test
cargo test test_quantum_mcts
```
### Running the Example
```bash
# Run the demo that exercises all tools
cargo run --example candle_agent_demo --release
```
## Performance
The system is optimized for production use:
- **Zero-allocation patterns** with `arrayvec` and `smallvec` for hot paths
- **SIMD optimizations** via `kodegen_simd` for vector operations
- **Lazy loading** of embedding models and coordinators
- **Connection pooling** for efficient database access
- **Async architecture** throughout for maximum concurrency
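As an illustration of the lazy-loading pattern mentioned above, the sketch below defers an expensive initialization until first use and then reuses the result, using `std::sync::OnceLock` from the standard library. `EmbeddingModel`, `model`, and `LOAD_COUNT` are hypothetical names, and the crate may implement lazy loading differently.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for an embedding model that is expensive to load.
pub struct EmbeddingModel {
    pub dimension: usize,
}

/// Counts how often the loader ran; only here to make the laziness observable.
pub static LOAD_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Holds the model once it has been loaded.
static MODEL: OnceLock<EmbeddingModel> = OnceLock::new();

/// Return the shared model, loading it on the first call only.
pub fn model() -> &'static EmbeddingModel {
    MODEL.get_or_init(|| {
        LOAD_COUNT.fetch_add(1, Ordering::SeqCst);
        // A real loader would read model weights here.
        EmbeddingModel { dimension: 1024 }
    })
}
```

Because `get_or_init` runs the closure at most once even under concurrent access, callers that never need embeddings never pay the load cost.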
Typical performance on Apple M1 Pro:
- Embedding generation: ~500 tokens/sec (Stella 400M)
- Memory ingestion: ~1000 files/min (with chunking and indexing)
- Semantic search: <10ms for top-5 retrieval (1M+ memories)
## Configuration
Memory system behavior can be configured via environment variables or the `MemoryConfig` struct:
```rust
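// Assumption: the nested config types are exported from the same module as
// `MemoryConfig`; adjust the import paths if they live elsewhere.
use kodegen_candle_agent::memory::utils::config::{
    CognitiveConfig, DatabaseConfig, MemoryConfig, VectorBackend, VectorConfig,
};

let config = MemoryConfig {
    // Embedded SurrealKV store, no credentials required
    database: DatabaseConfig {
        connection_string: "surrealkv://memory.db".to_string(),
        namespace: "kodegen".to_string(),
        database: "memories".to_string(),
        username: None,
        password: None,
    },
    // Vector index backend and embedding dimension
    vector: VectorConfig {
        backend: VectorBackend::InstantDistance,
        dimension: 1024,
    },
    cognitive: CognitiveConfig::default(),
};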
```
## Embedding Models
The system uses the Stella embedding model family by default:
- **stella_en_400M_v5** - 400M-parameter English model (default)
  - High-quality semantic representations optimized for code and text
Models are automatically downloaded from HuggingFace Hub on first use.
## Contributing
Contributions are welcome! Please see our contributing guidelines.
## License
This project is dual-licensed under:
- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT License ([LICENSE-MIT](LICENSE-MIT))
You may choose either license for your purposes.
## Links
- **Homepage**: [https://kodegen.ai](https://kodegen.ai)
- **Repository**: [https://github.com/cyrup-ai/kodegen-candle-agent](https://github.com/cyrup-ai/kodegen-candle-agent)
- **MCP Protocol**: [https://modelcontextprotocol.io](https://modelcontextprotocol.io)
---
Built with ❤️ by the KODEGEN.ᴀɪ team