cerebro 1.1.4

A high-performance semantic memory engine for AI Agents, now featuring SwarmForge for built-in multi-agent orchestration.
# Cerebro Architecture

## The Overall Data Pipeline

```mermaid
graph TD
    A[Raw Document / URL / Event] --> B(Chunker)
    C[OpenAI / LLM App] -->|MCP / Rust API| API(MemoryEngine)
    
    B -->|Chunks| API
    API --> D{Router & Compute Engine}
    
    D <-->|Traits: Embedder| E1[Local Rust ML: Candle/ORT]
    D <-->|Traits: Embedder| E2[Remote APIs: OpenAI/Gemini]
    
    D -->|Semantic Vectors| F1[(PgVectorStore)]
    D -->|Working State| F2[(MemoryVectorStore)]

    subgraph SwarmForge["SwarmForge Orchestration"]
        ORCH[SwarmOrchestrator] -->|Pattern| SEQ[Sequential Pipeline]
        ORCH -->|Pattern| PAR[Parallel Fan-Out]
        ORCH -->|Pattern| HIE[Hierarchical Supervisor]
        
        BUS[CerebroMemoryBus] --> API
        BUS --> KV[(KVStore - Working Memory)]
        BUS --> EP[(Episodic Memory)]
        
        SEQ & PAR & HIE --> BUS
        SEQ & PAR & HIE --> LLM[LLM Client]
        LLM --> L1[Ollama]
        LLM --> L2[OpenAI]
        LLM --> L3[Gemini]
        LLM --> L4[Anthropic]
        LLM --> L5[OpenAI-Compatible]
    end
```

## Module Structure

Cerebro is organized as a single crate with clean module boundaries:

### `models` 
Core data structures: `Document`, `Chunk`, `Node`, `Metadata`. All fully serializable via Serde.

### `traits`
The universal trait system that all backends implement:
* `Chunker` — splits Documents into Chunks
* `Embedder` — converts text into vector embeddings
* `VectorStore` — persists and searches Nodes
* `KVStore` — fast key-value state for Working Memory
* `CerebroError` — unified error hierarchy
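
To make the trait list more concrete, the following is a minimal sketch of how two of these abstractions could compose. Only the names `Chunker`, `Embedder`, `Document`, `Chunk`, and `CerebroError` come from this document. The method signatures, the `EmptyInput` variant, and the helpers `embed_document`, `ParagraphChunker`, and `LengthEmbedder` are assumptions made for illustration. The real traits are probably async and richer.

```rust
// Hypothetical, simplified stand-ins for the types in `models`.
#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub id: String,
    pub text: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub doc_id: String,
    pub text: String,
}

// Assumed error shape; the real `CerebroError` hierarchy is not shown here.
#[derive(Debug, PartialEq)]
pub enum CerebroError {
    EmptyInput,
}

// Assumed signatures: synchronous and minimal, for illustration only.
pub trait Chunker {
    fn chunk(&self, doc: &Document) -> Result<Vec<Chunk>, CerebroError>;
}

pub trait Embedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, CerebroError>;
}

/// Backend-agnostic helper (hypothetical): works with any `Chunker` and
/// `Embedder` implementation, which is the point of the trait layer.
pub fn embed_document(
    chunker: &dyn Chunker,
    embedder: &dyn Embedder,
    doc: &Document,
) -> Result<Vec<(Chunk, Vec<f32>)>, CerebroError> {
    let mut out = Vec::new();
    for chunk in chunker.chunk(doc)? {
        let vector = embedder.embed(&chunk.text)?;
        out.push((chunk, vector));
    }
    Ok(out)
}

/// Toy chunker: one chunk per blank-line-separated paragraph.
pub struct ParagraphChunker;

impl Chunker for ParagraphChunker {
    fn chunk(&self, doc: &Document) -> Result<Vec<Chunk>, CerebroError> {
        if doc.text.trim().is_empty() {
            return Err(CerebroError::EmptyInput);
        }
        Ok(doc
            .text
            .split("\n\n")
            .map(str::trim)
            .filter(|p| !p.is_empty())
            .map(|p| Chunk { doc_id: doc.id.clone(), text: p.to_string() })
            .collect())
    }
}

/// Toy embedder: a one-dimensional "embedding" holding the character count.
pub struct LengthEmbedder;

impl Embedder for LengthEmbedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, CerebroError> {
        Ok(vec![text.chars().count() as f32])
    }
}
```

Because `embed_document` takes trait objects, swapping a local embedder for a remote one would not change the calling code under this design.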

### `chunker`
* `RecursiveCharacterChunker` — character-boundary-safe text splitter.
* `HtmlSemanticChunker` — layout-aware semantic splitter.
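
As an illustration of what "character-boundary-safe" recursive splitting can mean, here is a sketch that tries coarse separators first, recurses to finer ones, and falls back to a hard split that counts `char`s rather than bytes. The function names `recursive_split` and `hard_split`, the separator order, and the choice to drop separators between chunks are assumptions. This is not the actual `RecursiveCharacterChunker` implementation.

```rust
/// Split `text` into pieces of at most `max_chars` characters (not bytes).
/// Separators are tried in order, coarsest first, and are assumed non-empty.
/// Separators that fall between two chunks are dropped in this sketch.
pub fn recursive_split(text: &str, max_chars: usize, separators: &[&str]) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    if text.chars().count() <= max_chars {
        return if text.is_empty() { Vec::new() } else { vec![text.to_string()] };
    }
    let (sep, finer) = match separators.split_first() {
        Some((&sep, finer)) => (sep, finer),
        // No separators left: cut on character boundaries.
        None => return hard_split(text, max_chars),
    };
    let sep_len = sep.chars().count();
    let mut out = Vec::new();
    let mut current = String::new();
    for part in text.split(sep).filter(|p| !p.is_empty()) {
        let part_len = part.chars().count();
        let fits = !current.is_empty()
            && current.chars().count() + sep_len + part_len <= max_chars;
        if fits {
            // Greedily merge small neighbours back together.
            current.push_str(sep);
            current.push_str(part);
            continue;
        }
        if !current.is_empty() {
            out.push(std::mem::take(&mut current));
        }
        if part_len <= max_chars {
            current.push_str(part);
        } else {
            // Still too large: retry with the finer separators.
            out.extend(recursive_split(part, max_chars, finer));
        }
    }
    if !current.is_empty() {
        out.push(current);
    }
    out
}

/// Fallback: fixed-size windows over `char`s, so a multi-byte UTF-8
/// code point is never cut in half.
fn hard_split(text: &str, max_chars: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    chars.chunks(max_chars).map(|c| c.iter().collect()).collect()
}
```

Slicing a `&str` at an arbitrary byte offset panics when the offset lands inside a multi-byte character, which is why the fallback works on `char`s. A production splitter might work on grapheme clusters instead.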

### `compute`
Embedding providers:
- `MockEmbedder` — deterministic dummy embeddings for offline testing.
- `OpenAIEmbedder` — remote OpenAI API integration.
- `LocalEmbedder` — native CPU inference using HuggingFace's `candle`.
- `AnthropicVoyageEmbedder` — Claude-aligned vectors through Voyage AI.
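
A deterministic dummy embedder can be sketched by hashing the input and expanding the hash into a unit-length vector. The `dims` field, the `embed` signature, and the FNV-1a plus xorshift scheme are assumptions chosen for illustration. The real `MockEmbedder` may work differently.

```rust
/// Hypothetical mock embedder: the same text always yields the same vector,
/// with no model or network access required.
pub struct MockEmbedder {
    pub dims: usize,
}

impl MockEmbedder {
    pub fn embed(&self, text: &str) -> Vec<f32> {
        // FNV-1a (64-bit) hash of the input bytes seeds the generator.
        let mut state: u64 = 0xcbf29ce484222325;
        for byte in text.bytes() {
            state ^= byte as u64;
            state = state.wrapping_mul(0x100000001b3);
        }
        if state == 0 {
            state = 1; // xorshift must not start from zero
        }

        let mut vector = Vec::with_capacity(self.dims);
        for _ in 0..self.dims {
            // One xorshift64 step per dimension.
            state ^= state << 13;
            state ^= state >> 7;
            state ^= state << 17;
            // Top 24 bits mapped to [0, 1), then shifted to [-1, 1).
            let unit = (state >> 40) as f32 / (1u64 << 24) as f32;
            vector.push(unit * 2.0 - 1.0);
        }

        // L2-normalize so cosine similarity reduces to a dot product.
        let norm = vector.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in &mut vector {
                *x /= norm;
            }
        }
        vector
    }
}
```

The vectors carry no semantic meaning. They are only useful for exercising ingest and query plumbing offline with reproducible results.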

### `storage`
Vector store backends:
- `MemoryVectorStore` — fast in-memory concurrent storage.
- `PgVectorStore` — persistent storage using PostgreSQL and pgvector (Hybrid search support).
- `QdrantVectorStore` — high-volume distributed Qdrant driver.
- `MemoryKVStore` — fast key-value state for working memory.
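
For intuition about the in-memory backend, the sketch below stores nodes behind a `RwLock` and ranks them by cosine similarity with a brute-force scan. The simplified `Node` fields and the methods `upsert`, `search`, and `len` are assumptions. The real `MemoryVectorStore` API and its concurrency strategy are not specified in this document.

```rust
use std::sync::RwLock;

// Hypothetical, simplified node: the real `Node` also carries metadata.
#[derive(Debug, Clone)]
pub struct Node {
    pub id: String,
    pub text: String,
    pub embedding: Vec<f32>,
}

/// Sketch of a concurrent in-memory store: many readers or one writer.
#[derive(Default)]
pub struct MemoryVectorStore {
    nodes: RwLock<Vec<Node>>,
}

impl MemoryVectorStore {
    pub fn new() -> Self {
        Self::default()
    }

    /// Insert a node, or replace the existing node with the same id.
    pub fn upsert(&self, node: Node) {
        let mut nodes = self.nodes.write().unwrap();
        if let Some(pos) = nodes.iter().position(|n| n.id == node.id) {
            nodes[pos] = node;
        } else {
            nodes.push(node);
        }
    }

    pub fn len(&self) -> usize {
        self.nodes.read().unwrap().len()
    }

    /// Return the ids and scores of the `top_k` most similar nodes.
    pub fn search(&self, query: &[f32], top_k: usize) -> Vec<(String, f32)> {
        let nodes = self.nodes.read().unwrap();
        let mut scored: Vec<(String, f32)> = nodes
            .iter()
            .map(|n| (n.id.clone(), cosine(query, &n.embedding)))
            .collect();
        // Highest similarity first.
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(top_k);
        scored
    }
}

/// Cosine similarity; returns 0.0 for mismatched lengths or zero vectors.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

A linear scan is adequate for working-state sizes. Persistent backends such as pgvector or Qdrant typically rely on approximate nearest-neighbour indexes instead.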

### `engine`
The core orchestration layer:
* `MemoryEngine` — coordinates the primary ingest/query flows.
* `ConsolidationWorker` — background Tokio task that prunes memory and optimizes semantic density.
* `GraphMemoryLayer` — bridge for Neo4j/Cypher entity persistence.

### `swarm` — Multi-Agent Swarming Engine (SwarmForge)
The agent orchestration layer, powered by Cerebro's three-tier memory:

* **`agent`** — Agent definitions: `AgentConfig`, `LlmProvider`, `ChatMessage`, `AgentRuntime`. Supports per-agent system prompts, tool definitions, handoff rules, and circuit breakers.
* **`memory_bus`** (`CerebroMemoryBus`): The bridge that maps Cerebro's memory primitives to swarm-agent needs:
  - *Working Memory* (`KVStore`) — namespaced per-agent state (`agent:{id}:{key}`)
  - *Semantic Memory* (`MemoryEngine`) — agents commit outputs as Documents, other agents recall via vector search
  - *Episodic Memory* — per-agent conversation histories within a run
* **`llm`** (`LlmClient`): Unified multi-provider LLM client supporting Ollama, OpenAI, Gemini, Anthropic (Claude), and any OpenAI-compatible API (Groq, Together, Mistral, DeepSeek, LM Studio, vLLM, etc.).
* **`orchestrator`** (`SwarmOrchestrator`): Manages agent registration, pattern selection, and execution coordination.
* **`patterns`** — Orchestration topologies:
  - `SequentialPattern` — Pipeline: Agent A → B → C, each output feeding the next
  - `ParallelPattern` — Fan-out/fan-in: N agents work simultaneously, merger synthesizes
  - `HierarchicalPattern` — Supervisor decomposes tasks, delegates to workers, synthesizes results
* **`trace`** (`ExecutionTracer`): Full audit trail of every LLM call, handoff, memory query, and tool invocation with timestamps, token counts, and latency metrics.
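
The interaction between the memory bus and the sequential pattern can be sketched as follows. Only the `agent:{id}:{key}` key layout and the "each output feeds the next" behaviour come from the descriptions above. The `MemoryBus` and `Agent` types, their methods, the `last_output` key, and `run_sequential` are hypothetical. Each agent is modelled as a plain closure standing in for an LLM call.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `CerebroMemoryBus`, covering only working
/// memory and episodic memory (semantic memory is omitted for brevity).
#[derive(Default)]
pub struct MemoryBus {
    working: HashMap<String, String>,       // stand-in for a `KVStore`
    episodic: HashMap<String, Vec<String>>, // per-agent history within a run
}

impl MemoryBus {
    /// Namespaces a key per agent, following the `agent:{id}:{key}` layout.
    fn namespaced(agent_id: &str, key: &str) -> String {
        format!("agent:{}:{}", agent_id, key)
    }

    pub fn set_state(&mut self, agent_id: &str, key: &str, value: &str) {
        self.working
            .insert(Self::namespaced(agent_id, key), value.to_string());
    }

    pub fn get_state(&self, agent_id: &str, key: &str) -> Option<&str> {
        self.working
            .get(&Self::namespaced(agent_id, key))
            .map(String::as_str)
    }

    pub fn record(&mut self, agent_id: &str, message: &str) {
        self.episodic
            .entry(agent_id.to_string())
            .or_default()
            .push(message.to_string());
    }

    pub fn history(&self, agent_id: &str) -> &[String] {
        match self.episodic.get(agent_id) {
            Some(messages) => messages.as_slice(),
            None => &[],
        }
    }
}

/// Hypothetical agent: `respond` stands in for a real LLM call.
pub struct Agent {
    pub id: String,
    pub respond: Box<dyn Fn(&str) -> String>,
}

/// Sequential pipeline: each agent's output becomes the next agent's input,
/// and every step is written to the bus.
pub fn run_sequential(agents: &[Agent], input: &str, bus: &mut MemoryBus) -> String {
    let mut current = input.to_string();
    for agent in agents {
        let output = (agent.respond)(&current);
        bus.record(&agent.id, &output);
        bus.set_state(&agent.id, "last_output", &output);
        current = output;
    }
    current
}
```

Namespacing keys by agent id keeps one agent's working state from colliding with another's while still sharing a single store. A parallel or hierarchical pattern could reuse the same bus and change only the scheduling of `respond` calls.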

### `ingest`
Specialized document extractors:
- `PdfIngestor` — extracts text from raw PDF bytes.

### `ffi`
Cross-language interface bridge:
- `python` — PyO3 bindings.
- `wasm` — Wasm-bindgen targets.

---
*Author: Suraj Kumar Nanda* | [Surajkumarnanda.com](https://Surajkumarnanda.com)