# Cerebro User Guide

Welcome to the Cerebro User Guide! This document covers how to implement the memory protocol in your application.

## 1. Initializing the Engine

Cerebro is designed to be modular. You can mix and match compute backends and storage layers to suit your privacy and scale requirements.

```rust
use std::sync::Arc;
use cerebro::prelude::*;
use cerebro::compute::local::LocalEmbedder;
use cerebro::storage::qdrant::QdrantVectorStore;

#[tokio::main]
async fn main() {
    // 1. Semantic Chunking (HTML aware)
    let chunker = Arc::new(HtmlSemanticChunker::new(1024));
    
    // 2. Local Privacy-First Embeddings (Runs 100% on CPU/Metal)
    // Requires feature = ["local_models"]
    let embedder = Arc::new(LocalEmbedder::new().await.unwrap());
    
    // 3. Scalable Persistent Storage
    // Requires feature = ["qdrant"]
    let store = Arc::new(QdrantVectorStore::new("http://localhost:6334", "memories").await.unwrap());

    // Connect the neurons!
    let engine = MemoryEngine::new(chunker, embedder, store);
    
    println!("Brain is online.");
}
```
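
To make the "mix and match" idea concrete, here is a minimal sketch of how components behind trait objects can be swapped without touching the engine. The `Chunker` and `Embedder` traits, the `Engine` struct, and both implementations are hypothetical stand-ins written for this illustration; Cerebro's real traits and signatures may differ.

```rust
use std::sync::Arc;

/// Hypothetical role traits (assumptions, not Cerebro's actual definitions).
trait Chunker {
    fn chunk(&self, text: &str) -> Vec<String>;
}

trait Embedder {
    fn embed(&self, chunk: &str) -> Vec<f32>;
}

/// A toy engine that only depends on the traits, never on concrete types.
struct Engine {
    chunker: Arc<dyn Chunker>,
    embedder: Arc<dyn Embedder>,
}

impl Engine {
    fn new(chunker: Arc<dyn Chunker>, embedder: Arc<dyn Embedder>) -> Self {
        Engine { chunker, embedder }
    }

    /// Chunk the text, then embed every chunk in order.
    fn embed_all(&self, text: &str) -> Vec<Vec<f32>> {
        self.chunker
            .chunk(text)
            .iter()
            .map(|c| self.embedder.embed(c))
            .collect()
    }
}

/// Splits on '.' and drops empty pieces.
struct SentenceChunker;

impl Chunker for SentenceChunker {
    fn chunk(&self, text: &str) -> Vec<String> {
        text.split('.')
            .map(str::trim)
            .filter(|s| !s.is_empty())
            .map(String::from)
            .collect()
    }
}

/// A fake embedder: a one-dimensional "vector" holding the chunk length in bytes.
struct LengthEmbedder;

impl Embedder for LengthEmbedder {
    fn embed(&self, chunk: &str) -> Vec<f32> {
        vec![chunk.len() as f32]
    }
}
```

Calling `Engine::new(Arc::new(SentenceChunker), Arc::new(LengthEmbedder))` wires the two pieces together; replacing either argument with another implementation of the same trait requires no change to `Engine`.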

## 2. Ingesting Complex Data

Cerebro supports basic text, HTML, and PDF ingestion.

```rust
use cerebro::ingest::pdf::PdfIngestor;

// Ingest a PDF file directly (requires the `pdf` feature).
// Runs inside an async context and reuses `engine` from Section 1.
let pdf_ingestor = PdfIngestor::new();
let docs = pdf_ingestor.ingest("whitepaper.pdf").await.unwrap();

for doc in docs {
    engine.ingest_document(doc).await.unwrap();
}
```

## 3. Hybrid Search & RRF

Cerebro automatically performs Hybrid Search (Full-Text + Vector) when using the `PgVectorStore`. It merges the two result lists via **Reciprocal Rank Fusion (RRF)**, so documents ranked highly by either method surface near the top.

```rust
// The query automatically triggers the hybrid merge logic
let results = engine.query("neural architecture", 5).await.unwrap();
```
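
For readers unfamiliar with RRF, the sketch below shows the standard formula: each document scores the sum of `1 / (k + rank)` over every list it appears in, with 1-based ranks. The function name `rrf_fuse` is hypothetical, and `k = 60` in the usage note is the constant commonly used in the literature; this is not Cerebro's internal implementation, whose constant and tie-breaking are not documented here.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document ids with Reciprocal Rank Fusion.
/// `rankings[i][0]` is the best hit of list `i`. Returns (id, score) pairs,
/// highest score first; ties are broken by id so the output is deterministic.
fn rrf_fuse(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();
    for ranking in rankings {
        for (idx, id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64; // ranks are 1-based
            *scores.entry(id.to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }
    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

With a full-text ranking `["a", "b", "c"]`, a vector ranking `["b", "d", "a"]`, and `k = 60.0`, the fused order is `b, a, d, c`: `b` and `a` appear in both lists and therefore outrank the documents found by only one method.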

## 4. Advanced Features

### Consolidation (The "Sleep Cycle" & Holographic Compression)
Cerebro runs a background consolidation worker that autonomously prunes dead memory and refines the vector index during idle periods. With the Advanced Cognitive Architecture features, it also performs **Holographic Memory Compression**, distilling thousands of old episodic memories into dense "Axioms".

### Optional Feature Flags
To keep your binary lean, most integrations are opt-in:
- `local_models`: Enable `candle`-based local inference.
- `qdrant`: Enable distributed Qdrant storage.
- `pdf`: Enable PDF extraction via `pdf-extract`.
- `graph`: Enable Neo4j Knowledge Graph persistence and Graphify abstractions.
- `spatial`: Enable 3D spatial semantic navigation.
- `python` / `wasm`: Enable FFI bindings.

## 5. FFI & Language Support

### Python
If compiled with the `python` feature, you can use Cerebro directly in your Python apps:
```python
import cerebro
engine = cerebro.Cerebro()
engine.ingest("Some data...")
```

### WASM
Targeting the browser or Edge workers? Compile with `--features wasm` to get native JS bindings.
```javascript
import init, { CerebroWasm } from './pkg/cerebro.js';
await init(); // load and instantiate the .wasm module before using its exports
const engine = new CerebroWasm();
```

## 6. SwarmForge — Multi-Agent Swarming

SwarmForge turns Cerebro into a multi-agent orchestration engine. Agents collaborate through Cerebro's three memory tiers:

- **Working Memory** (`KVStore`) — fast per-agent state
- **Semantic Memory** (`MemoryEngine`) — agents commit outputs as searchable documents
- **Episodic Memory** — per-agent conversation histories

### Setting Up a Swarm

```rust
use cerebro::prelude::*;
use cerebro::swarm::prelude::*;
use std::sync::Arc;

// 1. Build the Cerebro memory bus
let engine = Arc::new(MemoryEngine::new(
    Arc::new(RecursiveCharacterChunker::new(512, 50)),
    Arc::new(MockEmbedder::new(8)),
    Arc::new(MemoryVectorStore::new()),
));
let memory = Arc::new(CerebroMemoryBus::new(engine, Arc::new(MemoryKVStore::new())));

// 2. Create orchestrator
let mut orch = SwarmOrchestrator::new(memory);
```

### Registering Agents with Any LLM

Each agent can use a different LLM provider:

```rust
// Agent using local Ollama
orch.register_agent(AgentConfig {
    id: "researcher".into(),
    name: "Research Agent".into(),
    system_prompt: "You research topics thoroughly.".into(),
    model: LlmProvider::Ollama { model: "llama3".into(), base_url: "http://localhost:11434".into() },
    tools: vec![], handoff_targets: vec![], max_steps: 10,
});

// Agent using Anthropic Claude
orch.register_agent(AgentConfig {
    id: "writer".into(),
    name: "Writer Agent".into(),
    system_prompt: "You write clear, compelling content.".into(),
    model: LlmProvider::Anthropic {
        model: "claude-sonnet-4-20250514".into(),
        api_key: std::env::var("ANTHROPIC_API_KEY").expect("ANTHROPIC_API_KEY must be set"),
        max_tokens: 4096,
    },
    tools: vec![], handoff_targets: vec![], max_steps: 10,
});

// Agent using Groq (via OpenAI-compatible)
orch.register_agent(AgentConfig {
    id: "reviewer".into(),
    name: "Review Agent".into(),
    system_prompt: "You review content for quality.".into(),
    model: LlmProvider::OpenAICompatible {
        model: "llama-3.3-70b-versatile".into(),
        api_key: std::env::var("GROQ_API_KEY").expect("GROQ_API_KEY must be set"),
        base_url: "https://api.groq.com/openai/v1".into(),
        provider_name: Some("groq".into()),
    },
    tools: vec![], handoff_targets: vec![], max_steps: 10,
});
```

### Executing Swarm Patterns

#### Sequential Pipeline
```rust
let result = orch.execute(
    SwarmPattern::Sequential {
        agent_order: vec!["researcher".into(), "writer".into(), "reviewer".into()],
    },
    "Write an article about Rust's memory safety model",
).await.unwrap();
```

#### Parallel Fan-Out / Fan-In
```rust
let result = orch.execute(
    SwarmPattern::Parallel {
        agents: vec!["security".into(), "perf".into(), "style".into()],
        merger: "synthesizer".into(),
    },
    "Review this code for all issues",
).await.unwrap();
```
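
The sequential and parallel patterns differ mainly in how data flows between agents. The sketch below models that flow with plain functions standing in for agents; `Agent`, `run_sequential`, and `run_parallel` are hypothetical names for illustration and say nothing about how `SwarmOrchestrator` schedules work internally.

```rust
use std::thread;

/// A stand-in agent: a plain function from input text to output text.
type Agent = fn(&str) -> String;

/// Sequential pipeline: each agent receives the previous agent's output.
fn run_sequential(agents: &[Agent], task: &str) -> String {
    agents
        .iter()
        .fold(task.to_string(), |input, agent| agent(&input))
}

/// Fan-out / fan-in: every agent sees the same task on its own thread,
/// then the merger combines the outputs in the agents' original order.
fn run_parallel(agents: &[Agent], merger: fn(&[String]) -> String, task: &str) -> String {
    let outputs: Vec<String> = thread::scope(|s| {
        // Spawn all workers first, then join them in order.
        let handles: Vec<_> = agents
            .iter()
            .map(|agent| s.spawn(move || agent(task)))
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("agent thread panicked"))
            .collect()
    });
    merger(&outputs)
}
```

In the sequential case the task is transformed step by step, so agent order matters. In the parallel case the agents are independent and only the merger sees all results.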

#### Hierarchical Supervisor
```rust
let result = orch.execute(
    SwarmPattern::Hierarchical {
        supervisor: "lead".into(),
        workers: vec!["backend".into(), "frontend".into(), "testing".into()],
    },
    "Plan the implementation of user authentication",
).await.unwrap();
```

### Tool Calling & ReAct
Agents can interact with external systems using tools dynamically executed by the orchestrator:

```rust
use serde_json::json; // provides the `json!` macro used below (assumes serde_json is a dependency)

let my_tool = ... // Implements AgentTool
orch.register_tool(my_tool);

orch.register_agent(AgentConfig {
    id: "researcher".into(),
    name: "Research Agent".into(),
    system_prompt: "Research topics thoroughly.".into(),
    model: LlmProvider::Ollama { model: "llama3".into(), base_url: "http://localhost:11434".into() },
    tools: vec![ToolDefinition { name: "search".into(), description: "Search the web".into(), parameters_schema: json!({}) }],
    handoff_targets: vec![], max_steps: 10,
});
```
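
The general shape of a ReAct-style loop (reason, act, observe, repeat) can be sketched as follows. Everything here is a hypothetical stand-in: `Step`, `Tool`, `Echo`, and `react_loop` are not Cerebro types, the "model" is just a closure, and the assumption that `max_steps` caps the number of iterations is inferred from the field name rather than confirmed by the guide.

```rust
use std::collections::HashMap;

/// What the model decides on each turn (hypothetical).
enum Step {
    CallTool { name: String, input: String },
    Finish(String),
}

/// Minimal stand-in for a tool; Cerebro's real trait is `AgentTool`.
trait Tool {
    fn call(&self, input: &str) -> String;
}

/// A toy tool that pretends to search.
struct Echo;

impl Tool for Echo {
    fn call(&self, input: &str) -> String {
        format!("results for '{input}'")
    }
}

/// Ask the model for a step, run the requested tool, append the observation
/// to the transcript, and repeat. Returns None if `max_steps` is exhausted
/// before the model finishes.
fn react_loop(
    mut model: impl FnMut(&[String]) -> Step,
    tools: &HashMap<String, Box<dyn Tool>>,
    task: &str,
    max_steps: usize,
) -> Option<String> {
    let mut transcript = vec![format!("task: {task}")];
    for _ in 0..max_steps {
        match model(&transcript) {
            Step::Finish(answer) => return Some(answer),
            Step::CallTool { name, input } => {
                transcript.push(format!("action: {name}({input})"));
                // Unknown tools become an error observation instead of a panic.
                let observation = match tools.get(&name) {
                    Some(tool) => tool.call(&input),
                    None => format!("error: unknown tool '{name}'"),
                };
                transcript.push(format!("observation: {observation}"));
            }
        }
    }
    None
}
```

Feeding tool errors back as observations gives the model a chance to recover, while the step limit guarantees the loop terminates even if it never does.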

### CLI & HTTP Gateway
If you compile with the `cli` and `api` features, you gain access to the standalone gateway binary `cerebro-swarm`:

```bash
cargo run --bin cerebro-swarm --features api,cli -- serve --port 3000
```
This boots an Axum gateway that exposes `POST /api/swarm/execute`, which accepts JSON payloads to trigger swarm patterns remotely.

### Supported LLM Providers

| Provider | Variant | Covers |
|---|---|---|
| Ollama | `LlmProvider::Ollama` | Any local model (Llama 3, Mistral, Phi, Gemma, etc.) |
| OpenAI | `LlmProvider::OpenAI` | GPT-4o, GPT-4, o3, etc. |
| Gemini | `LlmProvider::Gemini` | Gemini Pro, Flash, Ultra |
| Anthropic | `LlmProvider::Anthropic` | Claude 4, Sonnet, Haiku |
| Any OpenAI-compatible | `LlmProvider::OpenAICompatible` | Groq, Together, Mistral, DeepSeek, Fireworks, LM Studio, vLLM, Anyscale, etc. |

---
*Author: Suraj Kumar Nanda* | [Surajkumarnanda.com](https://Surajkumarnanda.com)