Cerebro 🧠

A fast AI memory layer, written in Rust, that lets teams of specialized agents collaborate through a shared cognitive architecture.

Features

🚀 Minimal Overhead: Powered by a lean async pipeline in Rust, designed for high-scale agentic workloads.
🔌 Universal Storage: Trait-based backends — swap between MemoryVectorStore, PgVectorStore, or Qdrant.
🧠 Pluggable Compute: Route embeddings through local models (Candle) or remote APIs (OpenAI, Anthropic).
🔄 Active Consolidation: Background "Sleep Cycle" worker for autonomous memory pruning and semantic organization.
🔍 Hybrid Search: Native RRF (Reciprocal Rank Fusion) that merges keyword and vector rankings to improve retrieval precision.
🕸️ Graphify: LLM-powered dynamic entity extraction into Neo4j or in-memory Knowledge Graphs.
🧬 Advanced Cognitive Architecture: Built-in time-traveling event-sourced memory, swarm immunology, and 3D spatial semantic navigation.
🐝 SwarmForge: Multi-agent swarming engine with sequential, parallel, and hierarchical orchestration patterns.
🤖 Universal LLM: Supports Ollama, OpenAI, Gemini, Anthropic, and any OpenAI-compatible API.
🌐 MCP Ready: Native Model Context Protocol server (cerebro-mcp) for AI desktop apps.
🦀 Multi-Language: Native Python (PyO3) and WASM bindings.
📄 Complex Ingestion: PDF extraction and HTML-aware semantic chunking.

Getting Started

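Add Cerebro to your Cargo.toml:

[dependencies]
cerebro = "1.1.8"
# The examples below use #[tokio::main], which needs the tokio crate with these features.
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }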

1. Working with the Memory Engine

Store and retrieve semantic memory using Reciprocal Rank Fusion (hybrid search).

use cerebro::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() {
    let engine = MemoryEngine::new(
        Arc::new(RecursiveCharacterChunker::new(512, 50)),
        Arc::new(MockEmbedder::new(1536)),
        Arc::new(MemoryVectorStore::new()),
    );

    engine.ingest_document(Document::new("Rust ensures memory safety.")).await.unwrap();

    let results = engine.query("memory safety", 1).await.unwrap();
    // Guard against an empty result set instead of indexing blindly.
    if let Some(hit) = results.first() {
        println!("Match: {}", hit.0.chunk.text);
    }
}
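
As a rough illustration of the Reciprocal Rank Fusion step named above, the sketch below fuses a keyword ranking and a vector ranking in plain Rust. It is not Cerebro's internal implementation: the function name `reciprocal_rank_fusion`, the string document IDs, and the choice of k = 60 (a commonly used default) are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.
///
/// Each document scores the sum of 1 / (k + rank) over every list it appears
/// in, where rank is 1-based. Hypothetical helper, not part of Cerebro's API.
pub fn reciprocal_rank_fusion(rankings: &[Vec<&str>], k: f64) -> Vec<(String, f64)> {
    let mut scores: HashMap<String, f64> = HashMap::new();

    for ranking in rankings {
        for (idx, doc_id) in ranking.iter().enumerate() {
            let rank = (idx + 1) as f64;
            *scores.entry((*doc_id).to_string()).or_insert(0.0) += 1.0 / (k + rank);
        }
    }

    let mut fused: Vec<(String, f64)> = scores.into_iter().collect();
    // Highest score first; ties fall back to the ID so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    fused
}
```

Fusing the keyword ranking `["a", "b", "c"]` with the vector ranking `["b", "c", "d"]` at `k = 60.0` puts `b` first, followed by `c`, `a`, and `d`: documents that appear in both lists outrank documents that top only one of them.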

2. Building a Multi-Agent Swarm

Orchestrate a team of agents that share Cerebro's memory layer.

use cerebro::prelude::*;
use cerebro::swarm::prelude::*;
use std::sync::Arc;

#[tokio::main]
async fn main() {
    let engine = Arc::new(MemoryEngine::new(
        Arc::new(RecursiveCharacterChunker::new(512, 50)),
        Arc::new(MockEmbedder::new(8)),
        Arc::new(MemoryVectorStore::new()),
    ));
    let memory = Arc::new(CerebroMemoryBus::new(engine, Arc::new(MemoryKVStore::new())));

    let mut swarm = SwarmOrchestrator::new(memory);

    swarm.register_agent(AgentConfig {
        id: "analyst".into(),
        name: "Security Analyst".into(),
        system_prompt: "Analyze code for vulnerabilities.".into(),
        model: LlmProvider::Ollama {
            model: "llama3".into(),
            base_url: "http://localhost:11434".into(),
        },
        tools: vec![],
        handoff_targets: vec![],
        max_steps: 10,
    });

    let result = swarm.execute(
        SwarmPattern::Sequential { agent_order: vec!["analyst".into()] },
        "Review this code snippet",
    ).await.unwrap();
}
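
To make the difference between the sequential and parallel patterns concrete, here is a conceptual sketch using only the standard library. It does not reflect how SwarmForge is implemented: `ToyAgent`, `run_sequential`, `run_parallel`, and the two toy agent functions are hypothetical names, and each "agent" is just a plain function standing in for an LLM call.

```rust
use std::thread;

/// Hypothetical stand-in for an agent: a name plus a function from input text to output text.
pub struct ToyAgent {
    pub name: &'static str,
    pub run: fn(&str) -> String,
}

/// Sequential pattern: each agent receives the previous agent's output.
pub fn run_sequential(agents: &[ToyAgent], task: &str) -> String {
    agents
        .iter()
        .fold(task.to_string(), |input, agent| (agent.run)(&input))
}

/// Parallel pattern: every agent receives the same task on its own thread.
/// Outputs are returned in registration order as (agent name, output) pairs.
pub fn run_parallel(agents: &[ToyAgent], task: &str) -> Vec<(String, String)> {
    thread::scope(|scope| {
        let handles: Vec<_> = agents
            .iter()
            .map(|agent| scope.spawn(move || (agent.name.to_string(), (agent.run)(task))))
            .collect();
        handles
            .into_iter()
            .map(|handle| handle.join().expect("agent thread panicked"))
            .collect()
    })
}

/// Toy agent bodies used to trace how data flows through each pattern.
pub fn summarize(input: &str) -> String {
    format!("summary({input})")
}

pub fn review(input: &str) -> String {
    format!("review({input})")
}
```

With an `analyst` agent running `summarize` and a `reviewer` agent running `review`, `run_sequential` on the task `"code"` yields `review(summary(code))`, while `run_parallel` yields `summary(code)` and `review(code)` side by side. A hierarchical pattern would add a coordinator that decides which of these to invoke for each subtask.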

3. Model Context Protocol (MCP)

Cerebro ships a native Model Context Protocol server, cerebro-mcp, for MCP clients such as Cursor and Claude Desktop.

Compile the server:

cargo build --release --bin cerebro-mcp

Point your client's MCP config at the generated binary, replacing <ABSOLUTE_PATH> with the absolute path to your Cerebro checkout:

{
  "mcpServers": {
    "cerebro": {
      "command": "<ABSOLUTE_PATH>/target/release/cerebro-mcp",
      "args": []
    }
  }
}

Supported LLM Providers

Provider	Config Variant	Target Use Cases
Ollama	`LlmProvider::Ollama`	Privacy-first local models (Llama 3, Mistral, Phi)
Anthropic	`LlmProvider::Anthropic`	Deep reasoning and coding (Claude 3.5 Sonnet)
OpenAI	`LlmProvider::OpenAI`	General agentic workflows (GPT-4o, o3)
Google Gemini	`LlmProvider::Gemini`	Multimodal data ingestion (Gemini 1.5 Pro)
Universal API	`LlmProvider::OpenAICompatible`	Fast inference engines (Groq, Together, vLLM)