# Cerebro Architecture
## The Overall Data Pipeline
```mermaid
graph TD
A[Raw Document / URL / Event] --> B(Chunker)
C[OpenAI / LLM App] -->|MCP / Rust API| API(MemoryEngine)
B -->|Chunks| API
API --> D{Router & Compute Engine}
D <-->|Traits: Embedder| E1[Local Rust ML: Candle/ORT]
D <-->|Traits: Embedder| E2[Remote APIs: OpenAI/Gemini]
D -->|Semantic Vectors| F1[(PgVectorStore)]
D -->|Working State| F2[(MemoryVectorStore)]
subgraph SwarmForge["SwarmForge Orchestration"]
ORCH[SwarmOrchestrator] -->|Pattern| SEQ[Sequential Pipeline]
ORCH -->|Pattern| PAR[Parallel Fan-Out]
ORCH -->|Pattern| HIE[Hierarchical Supervisor]
BUS[CerebroMemoryBus] --> API
BUS --> KV[(KVStore - Working Memory)]
BUS --> EP[(Episodic Memory)]
SEQ & PAR & HIE --> BUS
SEQ & PAR & HIE --> LLM[LLM Client]
LLM --> L1[Ollama]
LLM --> L2[OpenAI]
LLM --> L3[Gemini]
LLM --> L4[Anthropic]
LLM --> L5[OpenAI-Compatible]
end
```
## Module Structure
Cerebro is organized as a single crate, divided into the following modules:
### `models`
Core data structures: `Document`, `Chunk`, `Node`, `Metadata`. All are serializable via Serde.
### `traits`
The shared traits that all backends implement, plus the crate-wide error type:
* `Chunker` — splits Documents into Chunks
* `Embedder` — converts text into vector embeddings
* `VectorStore` — persists and searches Nodes
* `KVStore` — fast key-value state for Working Memory
* `CerebroError` — unified error hierarchy
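
As a rough illustration of how these abstractions might fit together, the sketch below declares simplified, synchronous versions of the traits. Every field, method name, and signature here is an assumption made for illustration (the real crate may use async methods and richer types); only the trait and type names come from the list above. `ParagraphChunker` is a hypothetical toy backend.

```rust
use std::fmt;

// Hypothetical, simplified shapes; the real fields and signatures may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Document {
    pub id: String,
    pub text: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub document_id: String,
    pub text: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub id: String,
    pub text: String,
    pub embedding: Vec<f32>,
}

/// One error type shared by every backend (variants are assumptions).
#[derive(Debug)]
pub enum CerebroError {
    Embedding(String),
    Storage(String),
}

impl fmt::Display for CerebroError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CerebroError::Embedding(msg) => write!(f, "embedding error: {msg}"),
            CerebroError::Storage(msg) => write!(f, "storage error: {msg}"),
        }
    }
}

impl std::error::Error for CerebroError {}

pub trait Chunker {
    fn chunk(&self, doc: &Document) -> Result<Vec<Chunk>, CerebroError>;
}

pub trait Embedder {
    fn embed(&self, text: &str) -> Result<Vec<f32>, CerebroError>;
}

pub trait VectorStore {
    fn upsert(&mut self, node: Node) -> Result<(), CerebroError>;
    fn search(&self, query: &[f32], top_k: usize) -> Result<Vec<Node>, CerebroError>;
}

pub trait KVStore {
    fn set(&mut self, key: &str, value: String) -> Result<(), CerebroError>;
    fn get(&self, key: &str) -> Result<Option<String>, CerebroError>;
}

/// Toy backend: splits a document on blank lines.
pub struct ParagraphChunker;

impl Chunker for ParagraphChunker {
    fn chunk(&self, doc: &Document) -> Result<Vec<Chunk>, CerebroError> {
        Ok(doc
            .text
            .split("\n\n")
            .map(str::trim)
            .filter(|paragraph| !paragraph.is_empty())
            .map(|paragraph| Chunk {
                document_id: doc.id.clone(),
                text: paragraph.to_string(),
            })
            .collect())
    }
}
```

Because callers depend only on the traits, a backend such as `ParagraphChunker` can be swapped for another `Chunker` without touching the code that consumes chunks.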
### `chunker`
* `RecursiveCharacterChunker` — character-boundary-safe text splitter.
* `HtmlSemanticChunker` — layout-aware semantic splitter.
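
To make "character-boundary-safe" concrete, here is a minimal sketch of recursive character splitting. It is not the crate's implementation: the function names, the separator order, and the decision not to merge small pieces back together are all assumptions. The key point is that lengths are counted in `char`s, so a multi-byte UTF-8 character is never cut in half.

```rust
/// Splits `text` into pieces of at most `max_chars` characters.
/// Separators are tried in order; if none remain, the text is cut on
/// character boundaries (never in the middle of a UTF-8 sequence).
/// Hypothetical helper, not the crate's API.
pub fn split_recursive(text: &str, max_chars: usize, separators: &[&str]) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    if text.is_empty() {
        return Vec::new();
    }
    if text.chars().count() <= max_chars {
        return vec![text.to_string()];
    }
    match separators.split_first() {
        // Split on the coarsest separator, then recurse with the finer ones.
        Some((sep, rest)) => text
            .split(*sep)
            .filter(|piece| !piece.is_empty())
            .flat_map(|piece| split_recursive(piece, max_chars, rest))
            .collect(),
        // No separators left: fall back to fixed-size character windows.
        None => split_on_char_boundaries(text, max_chars),
    }
}

fn split_on_char_boundaries(text: &str, max_chars: usize) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    chars
        .chunks(max_chars)
        .map(|window| window.iter().collect())
        .collect()
}
```

Slicing the same input by byte offsets (for example `&text[..3]` on a string of two-byte characters) would panic, which is the failure mode a boundary-safe splitter avoids.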
### `compute`
Embedding providers:
- `MockEmbedder` — deterministic dummy embeddings for offline testing.
- `OpenAIEmbedder` — remote OpenAI API integration.
- `LocalEmbedder` — native CPU inference using HuggingFace's `candle`.
- `AnthropicVoyageEmbedder` — Claude-aligned vectors through Voyage AI.
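
One way a deterministic mock embedder could work is to hash the input once per output dimension and normalize the result. The sketch below assumes that approach; the `dims` field, the FNV-1a hash, and the method signature are illustrative choices, not the crate's actual `MockEmbedder`.

```rust
/// Hypothetical mock embedder: same text in, same unit-length vector out.
pub struct MockEmbedder {
    pub dims: usize,
}

impl MockEmbedder {
    pub fn embed(&self, text: &str) -> Vec<f32> {
        let mut vector: Vec<f32> = (0..self.dims)
            .map(|i| {
                // Use the dimension index as a seed so each slot differs.
                let hash = fnv1a(i as u64, text.as_bytes());
                // Map the top 24 bits of the hash into [-1, 1).
                ((hash >> 40) as f32 / (1u64 << 23) as f32) - 1.0
            })
            .collect();
        // L2-normalize so cosine similarity behaves as it would on real embeddings.
        let norm = vector.iter().map(|x| x * x).sum::<f32>().sqrt();
        if norm > 0.0 {
            for x in &mut vector {
                *x /= norm;
            }
        }
        vector
    }
}

/// 64-bit FNV-1a over the seed bytes followed by the input bytes.
fn fnv1a(seed: u64, bytes: &[u8]) -> u64 {
    let mut hash = 0xcbf29ce484222325u64;
    for b in seed.to_le_bytes().iter().chain(bytes) {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}
```

Vectors produced this way carry no semantic meaning; they are only useful for exercising storage and retrieval code paths without network access or model weights.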
### `storage`
Vector and key-value store backends:
- `MemoryVectorStore` — fast in-memory concurrent storage.
- `PgVectorStore` — persistent storage using PostgreSQL and pgvector, with hybrid search support.
- `QdrantVectorStore` — high-volume distributed Qdrant driver.
- `MemoryKVStore` — fast key-value state for working memory.
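
The following sketch shows what an in-memory concurrent vector store can look like: a `RwLock` around a vector of nodes, with brute-force cosine-similarity search. It reuses the name `MemoryVectorStore` for readability, but the fields, method names, locking strategy, and similarity metric are assumptions rather than the crate's actual design.

```rust
use std::sync::RwLock;

// Hypothetical node shape for this sketch.
#[derive(Debug, Clone)]
pub struct Node {
    pub id: String,
    pub text: String,
    pub embedding: Vec<f32>,
}

/// Readers search in parallel; writers take an exclusive lock.
#[derive(Default)]
pub struct MemoryVectorStore {
    nodes: RwLock<Vec<Node>>,
}

impl MemoryVectorStore {
    /// Inserts the node, or replaces an existing node with the same id.
    pub fn upsert(&self, node: Node) {
        let mut nodes = self.nodes.write().expect("lock poisoned");
        match nodes.iter_mut().find(|n| n.id == node.id) {
            Some(existing) => *existing = node,
            None => nodes.push(node),
        }
    }

    /// Returns up to `top_k` (id, score) pairs, best match first.
    pub fn search(&self, query: &[f32], top_k: usize) -> Vec<(String, f32)> {
        let nodes = self.nodes.read().expect("lock poisoned");
        let mut scored: Vec<(String, f32)> = nodes
            .iter()
            .map(|n| (n.id.clone(), cosine(query, &n.embedding)))
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1));
        scored.truncate(top_k);
        scored
    }
}

/// Cosine similarity; returns 0.0 if either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}
```

A linear scan is adequate for small working sets; persistent backends such as pgvector or Qdrant typically rely on approximate nearest-neighbor indexes instead.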
### `engine`
The core orchestration layer:
* `MemoryEngine` — coordinates the primary ingest/query flows.
* `ConsolidationWorker` — background Tokio task that prunes memory and optimizes semantic density.
* `GraphMemoryLayer` — bridge for Neo4j/Cypher entity persistence.
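
The ingest and query flows that `MemoryEngine` coordinates can be pictured as "embed, then store" and "embed, then search". The sketch below illustrates that shape with simplified traits; the method names, the generic parameters, and both toy backends (`VowelCountEmbedder`, `DotProductStore`) are hypothetical and exist only to keep the example self-contained.

```rust
// Simplified stand-ins for the crate's traits (signatures are assumptions).
pub trait Embedder {
    fn embed(&self, text: &str) -> Vec<f32>;
}

pub trait VectorStore {
    fn insert(&mut self, text: String, embedding: Vec<f32>);
    fn nearest(&self, query: &[f32], top_k: usize) -> Vec<String>;
}

pub struct MemoryEngine<E: Embedder, S: VectorStore> {
    embedder: E,
    store: S,
}

impl<E: Embedder, S: VectorStore> MemoryEngine<E, S> {
    pub fn new(embedder: E, store: S) -> Self {
        Self { embedder, store }
    }

    /// Ingest flow: embed each chunk and persist it.
    pub fn ingest(&mut self, chunks: &[&str]) {
        for chunk in chunks {
            let embedding = self.embedder.embed(chunk);
            self.store.insert(chunk.to_string(), embedding);
        }
    }

    /// Query flow: embed the question and return the closest stored texts.
    pub fn query(&self, question: &str, top_k: usize) -> Vec<String> {
        let embedding = self.embedder.embed(question);
        self.store.nearest(&embedding, top_k)
    }
}

/// Toy embedder: one dimension per vowel, valued by occurrence count.
pub struct VowelCountEmbedder;

impl Embedder for VowelCountEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        ['a', 'e', 'i', 'o', 'u']
            .iter()
            .map(|vowel| text.chars().filter(|c| c == vowel).count() as f32)
            .collect()
    }
}

/// Toy store: brute-force ranking by dot product.
#[derive(Default)]
pub struct DotProductStore {
    rows: Vec<(String, Vec<f32>)>,
}

impl VectorStore for DotProductStore {
    fn insert(&mut self, text: String, embedding: Vec<f32>) {
        self.rows.push((text, embedding));
    }

    fn nearest(&self, query: &[f32], top_k: usize) -> Vec<String> {
        let mut scored: Vec<(f32, &String)> = self
            .rows
            .iter()
            .map(|(text, emb)| (query.iter().zip(emb).map(|(q, e)| q * e).sum(), text))
            .collect();
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        scored.into_iter().take(top_k).map(|(_, text)| text.clone()).collect()
    }
}
```

Keeping the engine generic over its embedder and store is what would let the same flow run against a mock embedder in tests and a remote API in production.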
### `swarm` — Multi-Agent Swarming Engine (SwarmForge)
The agent orchestration layer, powered by Cerebro's three-tier memory:
* **`agent`** — Agent definitions: `AgentConfig`, `LlmProvider`, `ChatMessage`, `AgentRuntime`. Supports per-agent system prompts, tool definitions, handoff rules, and circuit breakers.
* **`memory_bus`** — `CerebroMemoryBus`: The bridge that maps Cerebro's memory primitives to swarm-agent needs:
    - *Working Memory* (`KVStore`) — namespaced per-agent state (`agent:{id}:{key}`)
    - *Semantic Memory* (`MemoryEngine`) — agents commit outputs as Documents, other agents recall via vector search
    - *Episodic Memory* — per-agent conversation histories within a run
* **`llm`** — `LlmClient`: Unified multi-provider LLM client supporting Ollama, OpenAI, Gemini, Anthropic (Claude), and any OpenAI-compatible API (Groq, Together, Mistral, DeepSeek, LM Studio, vLLM, etc.)
* **`tools`** — `AgentTool`: Trait for implementing custom ReAct functions that agents can invoke mid-task.
* **`orchestrator`** — `SwarmOrchestrator`: Manages agent registration, tool registries, pattern selection, and execution coordination.
* **`patterns`** — Orchestration topologies:
    - `SequentialPattern` — Pipeline: Agent A → B → C, each output feeding the next
    - `ParallelPattern` — Fan-out/fan-in: N agents work simultaneously, merger synthesizes
    - `HierarchicalPattern` — Supervisor decomposes tasks, delegates to workers, synthesizes results
    - `templates` — `ReviewSwarmTemplate` and other preconfigured topologies.
* **`trace`** — `ExecutionTracer`: Full audit trail of every LLM call, handoff, memory query, and tool invocation with timestamps, token counts, and latency metrics.
* **`gateway`** — Axum-based HTTP (`POST /execute`) and WebSocket (`GET /stream`) gateway for executing and monitoring swarms.
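
To illustrate how the sequential pattern and namespaced working memory could interact, the sketch below chains agents so that each output feeds the next, recording every output under an `agent:{id}:{key}` key. The `Agent` struct, the closure standing in for an LLM call, the `HashMap` standing in for a `KVStore`, and the `last_output` key are all assumptions made for this example.

```rust
use std::collections::HashMap;

/// Builds a working-memory key in the `agent:{id}:{key}` namespace.
pub fn working_memory_key(agent_id: &str, key: &str) -> String {
    format!("agent:{agent_id}:{key}")
}

/// Hypothetical agent: `run` stands in for a real LLM call.
pub struct Agent {
    pub id: String,
    pub run: Box<dyn Fn(&str) -> String>,
}

/// Sequential pipeline: agent A's output becomes agent B's input, and so on.
/// Each agent's output is also written to its own working-memory namespace.
pub fn run_sequential(
    agents: &[Agent],
    task: &str,
    working_memory: &mut HashMap<String, String>,
) -> String {
    let mut input = task.to_string();
    for agent in agents {
        let output = (agent.run)(&input);
        working_memory.insert(working_memory_key(&agent.id, "last_output"), output.clone());
        input = output;
    }
    input
}
```

Prefixing keys with the agent id keeps one agent's state from colliding with another's while still letting a supervisor read any agent's entries by constructing the same key.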
### `ingest`
Specialized document extractors:
- `PdfIngestor` — extracts text from raw PDF bytes.
### `ffi`
Cross-language interface bridge:
- `python` — PyO3 bindings.
- `wasm` — Wasm-bindgen targets.
### `bin`
Standalone applications:
- `cerebro-mcp` — Native Model Context Protocol server exposing Cerebro tools.
- `cerebro-swarm` — Command-line interface and gateway server for SwarmForge.
---