
# Cerebro 🧠

[![Crates.io](https://img.shields.io/crates/v/cerebro.svg)](https://crates.io/crates/cerebro)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

**Cerebro** is a blazing-fast, universal, and storage-agnostic **Memory Layer + Multi-Agent Swarm Engine** for AI Agents and LLM Applications, written in pure Rust.

## Why Cerebro?

While typical vector database wrappers just push raw vectors into a database, `Cerebro` functions as the **Hippocampus** for autonomous AI. It natively understands Agentic Memory structures:

- **Short-Term Episodic Memory** (Conversations)
- **Working Memory** (KV State)
- **Long-Term Semantic Memory** (Vector Search with Temporal Decay)
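
The list above mentions temporal decay but does not say which decay function is used. One common choice is exponential decay with a half-life, so the sketch below assumes that form. `decayed_score` and `half_life_secs` are hypothetical names chosen for illustration and are not part of Cerebro's API.

```rust
/// Scale a similarity score by exponential decay with the given half-life.
/// Assumption: Cerebro's actual decay function may differ from this one.
fn decayed_score(similarity: f64, age_secs: f64, half_life_secs: f64) -> f64 {
    similarity * 0.5_f64.powf(age_secs / half_life_secs)
}

fn main() {
    let one_day = 86_400.0;
    // A brand-new memory keeps its full similarity score.
    println!("fresh:   {:.3}", decayed_score(0.9, 0.0, one_day));
    // After exactly one half-life the score is halved.
    println!("1 day:   {:.3}", decayed_score(0.9, one_day, one_day));
    // Older memories keep fading but never drop below zero.
    println!("1 week:  {:.3}", decayed_score(0.9, 7.0 * one_day, one_day));
}
```

With a scheme like this, an older memory can still outrank a newer one if its raw similarity is high enough, which is usually the intent of decay-weighted recall.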

And now with **SwarmForge**, Cerebro provides a built-in multi-agent orchestration engine — enabling teams of specialized AI agents to collaborate through its three-tier memory system.

## Key Features

- 🚀 **Minimal Overhead**: Powered by a lean async pipeline in Rust, designed for high-scale agentic workloads.
- 🔌 **Universal Storage**: Trait-based backends — swap between `MemoryVectorStore`, `PgVectorStore`, or `Qdrant`.
- 🧠 **Pluggable Compute**: Route embeddings through local models (`Candle`) or remote APIs (`OpenAI`, `Anthropic`).
- 🔄 **Active Consolidation**: Background "Sleep Cycle" worker for autonomous memory pruning and semantic organization.
- 🔍 **Hybrid Search**: Native RRF (Reciprocal Rank Fusion) that merges keyword and vector retrieval rankings into a single result list.
- 🐝 **SwarmForge**: Multi-agent swarming engine with sequential, parallel, and hierarchical orchestration patterns.
- 🤖 **Universal LLM**: Native support for Ollama, OpenAI, Gemini, Anthropic, and any OpenAI-compatible API.
- 🌐 **MCP Ready**: Native Model Context Protocol server (`cerebro-mcp`) for AI desktop apps.
- 🦀 **Multi-Language**: Native Python (`PyO3`) and WASM bindings.
- 📄 **Complex Ingestion**: PDF extraction and HTML-aware semantic chunking.
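
To make the Hybrid Search bullet concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Rust. Each document scores the sum of `1 / (k + rank)` over every ranking it appears in, with 1-based ranks. `rrf_fuse` is a hypothetical helper, and `k = 60` is only a commonly used default; neither is taken from Cerebro's source.

```rust
use std::collections::HashMap;

/// Fuse several best-first rankings of document IDs with Reciprocal Rank Fusion.
/// A larger `k` flattens the advantage of top-ranked hits.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<&str, f64> = HashMap::new();
    for ranking in rankings {
        for (i, id) in ranking.iter().enumerate() {
            // Ranks are 1-based: the top hit contributes 1 / (k + 1).
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (i as f64 + 1.0));
        }
    }
    let mut fused: Vec<(String, f64)> = scores
        .into_iter()
        .map(|(id, score)| (id.to_string(), score))
        .collect();
    // Highest fused score first; ties are broken by ID for a stable order.
    fused.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap().then_with(|| a.0.cmp(&b.0)));
    fused
}

fn main() {
    // Two hypothetical result lists: one from keyword search, one from vector search.
    let keyword = vec!["doc_a", "doc_b", "doc_c"];
    let vector = vec!["doc_b", "doc_d", "doc_a"];
    for (id, score) in rrf_fuse(&[keyword, vector], 60.0) {
        println!("{id}: {score:.5}");
    }
}
```

Documents that appear in both lists (`doc_a`, `doc_b`) rise above those found by only one retriever, which is the property that makes RRF useful for hybrid search.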

## SwarmForge — Multi-Agent Swarming

Model swarming is an AI design pattern where **multiple specialized agents collaborate** — each with its own expertise, system prompt, and LLM — to solve tasks that a single model can't handle well alone. Think of it as a team of AI specialists instead of one generalist.

Cerebro's SwarmForge is unique because agents don't just pass messages — they share a **three-tier memory system**:

| Memory Tier | How Agents Use It |
|---|---|
| **Working Memory** (KVStore) | Fast state — current task, step count, handoff targets |
| **Episodic Memory** (Conversations) | Full message history per agent within a run |
| **Semantic Memory** (VectorStore) | Agents commit outputs as Documents → other agents recall via vector search |

This means the Security Agent's findings are **semantically searchable** by the Performance Agent that runs after it. Knowledge compounds across the swarm.
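
The idea of one agent committing a finding and another recalling it by similarity can be sketched with a toy in-memory structure. Everything here is hypothetical: `SharedMemory`, `commit`, and `recall` are illustrative names, and hand-written 2-dimensional vectors stand in for real embeddings. Cerebro's own types (`CerebroMemoryBus`, `MemoryEngine`) are not used.

```rust
use std::collections::HashMap;

/// Toy stand-in for a shared three-tier memory (not Cerebro's API).
#[derive(Default)]
struct SharedMemory {
    working: HashMap<String, String>,       // fast key-value state
    episodic: HashMap<String, Vec<String>>, // per-agent message history
    semantic: Vec<(String, Vec<f32>)>,      // committed text plus its embedding
}

/// Cosine similarity; returns 0.0 when either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

impl SharedMemory {
    /// Record an agent's output in its history and commit it for later recall.
    fn commit(&mut self, agent: &str, text: &str, embedding: Vec<f32>) {
        self.episodic.entry(agent.to_string()).or_default().push(text.to_string());
        self.semantic.push((text.to_string(), embedding));
    }

    /// Return the committed text most similar to the query vector, if any.
    fn recall(&self, query: &[f32]) -> Option<&str> {
        self.semantic
            .iter()
            .max_by(|a, b| cosine(&a.1, query).partial_cmp(&cosine(&b.1, query)).unwrap())
            .map(|(text, _)| text.as_str())
    }
}

fn main() {
    let mut mem = SharedMemory::default();
    mem.working.insert("current_task".into(), "code review".into());
    mem.commit("security", "unsafe block lacks bounds check", vec![1.0, 0.0]);
    mem.commit("style", "function name is unclear", vec![0.0, 1.0]);
    // A later agent queries with a vector close to the security finding.
    println!("{:?}", mem.recall(&[0.9, 0.1]));
}
```

The query vector points mostly along the first axis, so the security finding is the one recalled even though a different agent committed it.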

### Orchestration Patterns

**Sequential Pipeline** — Each agent's output feeds the next:
```
[Security Agent] → [Performance Agent] → [Style Agent] → Final Report
```
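
Stripped of LLM calls, the sequential pattern is a fold over the agent list. The sketch below uses plain functions as stand-in agents; `Agent` and `run_sequential` are hypothetical names for illustration, not the `SwarmOrchestrator` API.

```rust
/// Hypothetical agent: a name plus a function from context text to a finding.
struct Agent {
    name: &'static str,
    run: fn(&str) -> String,
}

/// Run agents in order; each one sees the input plus all earlier findings.
fn run_sequential(agents: &[Agent], input: &str) -> String {
    agents.iter().fold(input.to_string(), |context, agent| {
        let finding = (agent.run)(&context);
        format!("{context}\n[{}] {finding}", agent.name)
    })
}

// Stub agents standing in for real LLM calls.
fn security(context: &str) -> String {
    if context.contains("unsafe") {
        "found an unsafe block".to_string()
    } else {
        "no issues".to_string()
    }
}

fn perf(context: &str) -> String {
    format!("saw {} lines of context", context.lines().count())
}

fn main() {
    let agents = [
        Agent { name: "security", run: security },
        Agent { name: "perf", run: perf },
    ];
    println!("{}", run_sequential(&agents, "fn process() { unsafe {} }"));
}
```

Because the accumulated context is threaded through the fold, the second agent receives the first agent's finding as part of its input.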

**Parallel Fan-Out / Fan-In** — Multiple agents analyze simultaneously, a merger synthesizes:
```
             ┌→ [Security Agent]  ──┐
  Input ─────┼→ [Performance Agent] ┼→ [Synthesizer] → Output
             └→ [Style Agent]     ──┘
```
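
A rough sketch of fan-out / fan-in using only the standard library: each stand-in agent runs on its own scoped thread, and a merge function combines the outputs. `AgentFn`, `run_parallel`, and `synthesize` are hypothetical names. Cerebro's examples are async on Tokio, so its implementation presumably uses tasks rather than OS threads.

```rust
use std::thread;

/// Hypothetical agent signature: input text in, finding out.
type AgentFn = fn(&str) -> String;

/// Run every agent on the same input concurrently, then merge the results.
fn run_parallel(agents: &[AgentFn], input: &str, merge: fn(&[String]) -> String) -> String {
    let outputs: Vec<String> = thread::scope(|s| {
        // Fan-out: spawn one scoped thread per agent.
        let handles: Vec<_> = agents
            .iter()
            .map(|&agent| s.spawn(move || agent(input)))
            .collect();
        // Fan-in: join in registration order so the merged output is deterministic.
        handles
            .into_iter()
            .map(|h| h.join().expect("agent thread panicked"))
            .collect()
    });
    merge(&outputs)
}

// Stub agents and synthesizer standing in for real LLM calls.
fn security(code: &str) -> String {
    format!("security: {} unsafe block(s)", code.matches("unsafe").count())
}

fn perf(code: &str) -> String {
    format!("perf: {} chars scanned", code.len())
}

fn synthesize(parts: &[String]) -> String {
    parts.join("; ")
}

fn main() {
    let agents: [AgentFn; 2] = [security, perf];
    println!("{}", run_parallel(&agents, "unsafe { }", synthesize));
}
```

Collecting the join handles before joining any of them matters: a lazy iterator would spawn and join one thread at a time, which would quietly turn the fan-out back into a sequential pipeline.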

**Hierarchical Supervisor** — A supervisor decomposes, delegates, and synthesizes:
```
          [Supervisor Agent]
         /        |         \
  [Backend]   [Frontend]   [Testing]
```
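
The supervisor pattern has three steps: decompose, delegate, synthesize. The sketch below stubs each step with ordinary functions. `decompose`, `supervise`, and `Worker` are hypothetical names, and a real supervisor would presumably ask an LLM to plan the subtasks instead of using a fixed split.

```rust
/// Hypothetical worker signature: subtask in, result out.
type Worker = fn(&str) -> String;

/// Stub decomposition: always splits a task into three fixed subtasks.
fn decompose(task: &str) -> Vec<(&'static str, String)> {
    vec![
        ("backend", format!("API for {task}")),
        ("frontend", format!("UI for {task}")),
        ("testing", format!("tests for {task}")),
    ]
}

/// Decompose the task, delegate each subtask to the matching worker, combine answers.
fn supervise(task: &str, workers: &[(&str, Worker)]) -> String {
    let mut answers = Vec::new();
    for (name, subtask) in decompose(task) {
        // Delegate: subtasks without a registered worker are skipped.
        if let Some((_, worker)) = workers.iter().find(|(n, _)| *n == name) {
            answers.push(format!("{name}: {}", worker(&subtask)));
        }
    }
    // Synthesize: here, simply one line per completed subtask.
    answers.join("\n")
}

// Stub worker standing in for a specialized agent.
fn implement(subtask: &str) -> String {
    format!("done [{subtask}]")
}

fn main() {
    let workers = [
        ("backend", implement as Worker),
        ("frontend", implement as Worker),
        ("testing", implement as Worker),
    ];
    println!("{}", supervise("login page", &workers));
}
```

Unlike the sequential and parallel patterns, the workers here never see the original task, only the subtask the supervisor assigned to them.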

### Supported LLM Providers

Each agent in the swarm can use a **different** LLM provider:

| Provider | Config | Covers |
|---|---|---|
| Ollama | `LlmProvider::Ollama` | Any local model — Llama 3, Mistral, Phi, Gemma |
| OpenAI | `LlmProvider::OpenAI` | GPT-4o, GPT-4, o3, o4-mini |
| Gemini | `LlmProvider::Gemini` | Gemini Pro, Flash, Ultra |
| Anthropic | `LlmProvider::Anthropic` | Claude 4, Sonnet, Haiku, Opus |
| Any OpenAI-compatible | `LlmProvider::OpenAICompatible` | Groq, Together, Mistral, DeepSeek, LM Studio, vLLM |

## Getting Started

```toml
[dependencies]
cerebro = "1.1.1"
```

### Basic Example

```rust
use cerebro::prelude::*;
use std::sync::Arc;

#[tokio::main]
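async fn main() {
    // Pipeline components. The chunker arguments are presumably chunk size and overlap.
    let chunker = Arc::new(RecursiveCharacterChunker::new(512, 50));
    // Mock embedder for local testing; 1536 is presumably the vector dimension.
    let embedder = Arc::new(MockEmbedder::new(1536));
    // In-memory vector store, so nothing is persisted between runs.
    let store = Arc::new(MemoryVectorStore::new());

    let engine = MemoryEngine::new(chunker, embedder, store);

    // Ingest a document into the engine.
    let doc = Document::new("The Rust programming language ensures memory safety.");
    engine.ingest_document(doc).await.unwrap();

    // Query the stored memories; the second argument is presumably the result limit.
    let memories = engine.query("What language is safe?", 5).await.unwrap();

    for (node, score) in memories {
        println!("Match: {} (Score: {})", node.chunk.text, score);
    }
}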
```


### Swarm Example — Multi-Agent Code Review

```rust
use cerebro::prelude::*;
use cerebro::swarm::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() {
    // Shared memory for the swarm: a MemoryEngine plus a key-value store.
    let engine = Arc::new(MemoryEngine::new(
        Arc::new(RecursiveCharacterChunker::new(512, 50)),
        Arc::new(MockEmbedder::new(8)),
        Arc::new(MemoryVectorStore::new()),
    ));
    let memory = Arc::new(CerebroMemoryBus::new(engine, Arc::new(MemoryKVStore::new())));

    let mut orch = SwarmOrchestrator::new(memory);

    // First agent: security review on a local Ollama model.
    orch.register_agent(AgentConfig {
        id: "security".into(),
        name: "Security Reviewer".into(),
        system_prompt: "Analyze code for security vulnerabilities.".into(),
        model: LlmProvider::Ollama {
            model: "llama3".into(),
            base_url: "http://localhost:11434".into(),
        },
        tools: vec![],
        handoff_targets: vec![],
        max_steps: 10,
    });

    // Second agent: performance review on Anthropic.
    orch.register_agent(AgentConfig {
        id: "perf".into(),
        name: "Performance Reviewer".into(),
        system_prompt: "Analyze code for performance issues.".into(),
        model: LlmProvider::Anthropic {
            model: "claude-sonnet-4-20250514".into(),
            api_key: "sk-...".into(), // placeholder; supply your own key
            max_tokens: 4096,
        },
        tools: vec![],
        handoff_targets: vec![],
        max_steps: 10,
    });

    // Run the agents in order: security first, then performance.
    let result = orch
        .execute(
            SwarmPattern::Sequential {
                agent_order: vec!["security".into(), "perf".into()],
            },
            "Review this function: fn process(input: &str) { unsafe { ... } }",
        )
        .await
        .unwrap();

    println!("{}", result.final_output);
}
```

## Documentation

For in-depth guides and technical details:
- **[USER_GUIDE.md](docs/USER_GUIDE.md)**: Implementation examples and usage guides.
- **[ARCHITECTURE.md](docs/ARCHITECTURE.md)**: Structural layout and data pipelines.
- **[CHANGELOG.md](docs/CHANGELOG.md)**: Release history and version updates.
- **[CEREBRO.md](docs/CEREBRO.md)**: Core registry documentation.

---
*Author: Suraj Kumar Nanda* | [www.surajkumarnanda.com](https://www.surajkumarnanda.com)