Nexus Memory System

Nexus Memory System gives AI coding agents a shared memory, a bounded subconscious, and a practical way to stay context-aware across sessions without turning your machine into a distributed-systems hobby project.

It is a Rust-first, SQLite-backed memory runtime for tools like Claude Code, Codex, Gemini CLI, Qwen, Amp, OpenCode, Droid, and Hermes. Nexus captures useful activity, distills noise into signal, builds semantic recall, and runs bounded background dreaming so your agents can remember what matters and stop forgetting what they just learned.

If you want the short version: Nexus makes your agents feel sharper, steadier, and dramatically less forgetful without asking you to ship your workflow into an overbuilt memory platform.

Why People Notice Nexus

Always-on memory without requiring a heavyweight external stack
Automatic lifecycle capture through hooks, wrappers, and monitor-aware integrations
Representation-first recall built from explicit observations, session digests, semantic matches, derived insights, and contradictions
Bounded dreaming that reinforces patterns, detects conflicts, and compresses noisy activity into usable memory
Provider-flexible generation and embeddings with remote and local runtime options
One shared memory layer across CLI, web, MCP, and agent integrations

This is the part that matters: Nexus does not just log events. It turns activity into something an agent can actually use.

What Nexus Feels Like

Imagine a coding agent that can:

remember that you switched providers, fixed the installer, and validated the release
recall the right session digest instead of replaying a hundred raw tool events
notice that a newer memory contradicts an older assumption
answer with the memories it used and explain where they came from
keep doing all of that locally, with bounded runtime costs, while you keep working

That is the system this repo ships.

The Big Ideas

1. Capture what matters, not just the noise

Nexus hooks and wrappers capture lifecycle events, tool activity, and session context from supported agent environments. Low-signal operational events are not simply dumped into the main memory table forever. They can be buffered, distilled, summarized, and folded into higher-value session memory.
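
As a rough illustration of the buffering-and-distilling idea, here is a minimal Rust sketch. The Event and Signal types and the distill function are hypothetical names invented for this example; they are not taken from the Nexus codebase, and the real pipeline is likely more involved.

```rust
// Hypothetical types for illustration only; not the Nexus schema.
#[derive(Debug, Clone)]
enum Signal {
    Low,
    High,
}

#[derive(Debug, Clone)]
struct Event {
    summary: String,
    signal: Signal,
}

/// Keeps high-signal events as-is and folds all low-signal events
/// into a single distilled line (None when there were none).
fn distill(events: &[Event]) -> (Vec<Event>, Option<String>) {
    let mut kept = Vec::new();
    let mut buffered = Vec::new();
    for event in events {
        match event.signal {
            Signal::High => kept.push(event.clone()),
            Signal::Low => buffered.push(event.summary.as_str()),
        }
    }
    let digest = if buffered.is_empty() {
        None
    } else {
        Some(format!(
            "{} low-signal events: {}",
            buffered.len(),
            buffered.join("; ")
        ))
    };
    (kept, digest)
}
```

The point of the sketch is only the split: high-value events stay addressable on their own, while operational noise is reduced to one summary line per batch.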

2. Turn raw activity into explicit memory

The cognition runtime derives explicit observations from raw activity, attaches evidence lineage, and stores cognitive metadata that higher-level recall can reason over.
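
One way to picture evidence lineage is an observation that carries the IDs of the raw events it was derived from. The sketch below assumes a simple keyword match as the derivation step; RawEvent, Observation, and derive_observation are hypothetical names, and the actual cognition runtime may derive observations quite differently.

```rust
// Hypothetical shapes for illustration; the real Nexus schema may differ.
#[derive(Debug)]
struct RawEvent {
    id: u64,
    text: String,
}

#[derive(Debug)]
struct Observation {
    statement: String,
    /// IDs of the raw events that support this observation.
    evidence: Vec<u64>,
}

/// Builds one observation from every raw event that mentions `topic`
/// (case-insensitive). Returns None when no event matches.
fn derive_observation(topic: &str, events: &[RawEvent]) -> Option<Observation> {
    let needle = topic.to_lowercase();
    let evidence: Vec<u64> = events
        .iter()
        .filter(|e| e.text.to_lowercase().contains(&needle))
        .map(|e| e.id)
        .collect();
    if evidence.is_empty() {
        return None;
    }
    Some(Observation {
        statement: format!(
            "Activity related to '{}' seen in {} event(s)",
            topic,
            evidence.len()
        ),
        evidence,
    })
}
```

Because the evidence IDs travel with the observation, a later lineage query can walk back from the derived statement to the events behind it.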

3. Dream in bounded cycles

Nexus enables consolidation and reflection through dreaming. Dream cycles are bounded, replay-safe, and practical. They reinforce recurring truths, detect contradictions, refresh digests, and produce more retrieval-friendly memory than a plain event log ever could.
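
To make "bounded" concrete, the following sketch processes at most a fixed number of memories per cycle, counts repeated claims as reinforcement, and flags subjects whose claim changed as contradictions. Memory, DreamReport, dream_cycle, and the budget parameter are all assumptions made for this example, not the Nexus implementation. Because the function is pure, running it twice on the same input gives the same report, which is one simple reading of replay-safe.

```rust
use std::collections::HashMap;

// Hypothetical types; field names are illustrative, not the Nexus schema.
#[derive(Debug, Clone)]
struct Memory {
    subject: String,
    claim: String,
}

#[derive(Debug, Default)]
struct DreamReport {
    processed: usize,
    /// How often each (subject, claim) pair was seen in this cycle.
    reinforced: HashMap<(String, String), u32>,
    /// Subjects whose claim changed between consecutive memories.
    contradictions: Vec<String>,
}

/// Runs one cycle over at most `budget` memories, in the order given.
fn dream_cycle(memories: &[Memory], budget: usize) -> DreamReport {
    let mut report = DreamReport::default();
    let mut latest: HashMap<&str, &str> = HashMap::new();
    for m in memories.iter().take(budget) {
        report.processed += 1;
        *report
            .reinforced
            .entry((m.subject.clone(), m.claim.clone()))
            .or_insert(0) += 1;
        // `insert` returns the previous claim for this subject, if any.
        if let Some(prev) = latest.insert(m.subject.as_str(), m.claim.as_str()) {
            if prev != m.claim.as_str() && !report.contradictions.contains(&m.subject) {
                report.contradictions.push(m.subject.clone());
            }
        }
    }
    report
}
```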

4. Recall with a working representation

Instead of searching raw text and hoping for the best, Nexus builds a working representation from:

recent explicit memories
vector-ranked semantic matches
session digests
derived insights
contradictions and conflict markers

That gives your agent a usable context window instead of a bag of unranked lines.
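
A working representation of this kind could be modeled as one struct with a slot per source. The sketch below is a guess at the general shape, not the Nexus data model: WorkingRepresentation, its field names, the render method, and the ordering (contradictions first, semantic matches last) are all assumptions for illustration.

```rust
// Hypothetical structure mirroring the five sources listed above.
#[derive(Debug, Default)]
struct WorkingRepresentation {
    explicit: Vec<String>,
    semantic: Vec<(f32, String)>, // (similarity score, text)
    digests: Vec<String>,
    insights: Vec<String>,
    contradictions: Vec<String>,
}

impl WorkingRepresentation {
    /// Renders labelled lines, capped at `max_lines`.
    /// Semantic matches are sorted by descending score before rendering.
    fn render(&self, max_lines: usize) -> Vec<String> {
        let mut semantic = self.semantic.clone();
        semantic.sort_by(|a, b| b.0.total_cmp(&a.0));

        let mut lines = Vec::new();
        lines.extend(self.contradictions.iter().map(|s| format!("[conflict] {s}")));
        lines.extend(self.explicit.iter().map(|s| format!("[explicit] {s}")));
        lines.extend(self.digests.iter().map(|s| format!("[digest] {s}")));
        lines.extend(self.insights.iter().map(|s| format!("[insight] {s}")));
        lines.extend(
            semantic
                .iter()
                .map(|(score, s)| format!("[semantic {score:.2}] {s}")),
        );
        lines.truncate(max_lines);
        lines
    }
}
```

Labelling each line with its source is what lets an agent, or an operator using introspection, see which kind of memory contributed to an answer.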

5. Stay flexible about models

Nexus supports remote and local generation and embeddings. You can:

use the same provider and same model for both
use the same provider with a different embedding model
use different providers for generation and embeddings
run local OpenAI-compatible backends through vLLM, LM Studio, or llama.cpp
keep local ONNX embeddings if you prefer
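
The first three options above amount to an inheritance rule: embeddings reuse the generation provider and model unless overridden. A minimal sketch of that rule follows. ModelRef, EmbeddingOverride, and resolve_embeddings are hypothetical names, and the real Nexus configuration keys may look different.

```rust
// Hypothetical config shape; the real Nexus config keys may differ.
#[derive(Debug, Clone, PartialEq)]
struct ModelRef {
    provider: String,
    model: String,
}

#[derive(Debug, Default)]
struct EmbeddingOverride {
    provider: Option<String>,
    model: Option<String>,
}

/// Any field left unset on the embedding side is inherited from generation.
fn resolve_embeddings(generation: &ModelRef, overrides: &EmbeddingOverride) -> ModelRef {
    ModelRef {
        provider: overrides
            .provider
            .clone()
            .unwrap_or_else(|| generation.provider.clone()),
        model: overrides
            .model
            .clone()
            .unwrap_or_else(|| generation.model.clone()),
    }
}
```

With no overrides the result equals the generation settings; setting only the model keeps the provider; setting both gives fully independent embeddings.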

Quick Start

1. Clone, build, and install

git clone https://github.com/scooter-lacroix/Nexus-Memory-System.git
cd Nexus-Memory-System
cargo build --release -p nexus-memory
./scripts/install.sh --binary ./target/release/nexus

2. Initialize storage

nexus init

3. Store and recall a first memory

nexus store \
  --content "Release validation passed after fixing provider-backed embeddings" \
  --agent codex \
  --category session \
  --labels release,validation,embeddings

nexus recall --agent codex --query "What changed in the release validation work?"

4. Install hooks and wrappers

nexus hooks install --agent all
nexus hooks status --verbose

5. Inspect the subconscious

nexus represent --agent claude-code --query "provider rollout timeline" --introspect
nexus digest latest --agent claude-code --session-key <session-key>
nexus dream run --agent claude-code

6. Configure providers and embeddings

nexus config
nexus config show

7. Start the API and dashboard

NEXUS_AGENT_ENABLED=true nexus serve --transport web --port 8768 --agent

Fast Examples

Ask Nexus what actually happened

nexus recall --agent claude-code --query "What changed in the installer and why?"

See the exact memory mix used for recall

nexus represent --agent claude-code --query "What changed in the installer and why?" --introspect

Pull the latest digest for a session

nexus digest latest --agent claude-code --session-key <session-key>

Run a manual dream cycle

nexus dream run --agent claude-code

Explain where a memory came from

nexus lineage show --memory-id <id>

What Ships

Cognition runtime

explicit derivation from raw activity
short and long session digests
bounded dreaming with contradiction handling and reinforcement
identity-aware, representation-first recall
introspection and lineage explanation

Multi-agent integration

native lifecycle integrations for Claude Code and the pi-family tools
wrapper-based lifecycle support for Codex, Amp, OpenCode, Droid, and Hermes
monitor-aware support for Gemini and Qwen
honest support-tier reporting so the system tells you what is truly installed and active

Retrieval and embeddings

vector-first semantic retrieval with bounded text fallback
provider-backed embeddings
local ONNX embeddings
local OpenAI-compatible runtimes
configurable provider/model inheritance between generation and embeddings
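
"Vector-first semantic retrieval with bounded text fallback" can be read as: rank by embedding similarity, and only if nothing scores well enough, fall back to a capped text scan. The sketch below implements that reading with cosine similarity over an in-memory slice. Doc, cosine, retrieve, min_score, and fallback_limit are assumptions made for this example and do not describe the nexus-vectors API.

```rust
// Hypothetical in-memory stand-in for a vector index plus text store.
struct Doc {
    text: String,
    embedding: Option<Vec<f32>>,
}

/// Cosine similarity; returns 0.0 when either vector has zero length.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Vector ranking first; if no document reaches `min_score`, fall back to a
/// substring scan that returns at most `fallback_limit` hits.
fn retrieve<'a>(
    docs: &'a [Doc],
    query_vec: &[f32],
    query_text: &str,
    min_score: f32,
    fallback_limit: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &str)> = docs
        .iter()
        .filter_map(|d| {
            d.embedding
                .as_ref()
                .map(|e| (cosine(e, query_vec), d.text.as_str()))
        })
        .filter(|(score, _)| *score >= min_score)
        .collect();
    if !scored.is_empty() {
        scored.sort_by(|a, b| b.0.total_cmp(&a.0));
        return scored.into_iter().map(|(_, text)| text).collect();
    }
    docs.iter()
        .map(|d| d.text.as_str())
        .filter(|text| text.contains(query_text))
        .take(fallback_limit)
        .collect()
}
```

The cap on the fallback path is what keeps a miss in the vector index from turning into an unbounded scan of every stored memory.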

Operator tooling

CLI commands for list, recall, represent, digest, dream, lineage, session, and migration flows
MCP server access to the same cognition layer
web agent and observability routes
runtime health, digests, recall composition, and job visibility

Architecture At A Glance

Agent tools and clients
    |
    +-- nexus-cli
    +-- nexus-hooks
    +-- nexus-mcp
    +-- nexus-web
             |
             v
         nexus-core
             |
             +-- nexus-storage
             +-- nexus-vectors
             +-- nexus-embeddings
             +-- nexus-llm
             +-- nexus-orchestrator
             +-- nexus-agent

The shared domain model lives in nexus-core. The canonical store lives in nexus-storage. The cognition engine lives primarily in nexus-agent, powered by nexus-llm, nexus-embeddings, and nexus-vectors. CLI, hooks, MCP, and web all sit on top of the same memory runtime rather than building parallel silos.

For the full architectural walkthrough, see ARCHITECTURE.md.

Why Nexus Works

Many agent memory tools fall into one of these traps:

store too little to be useful
store too much noise to be usable
depend on a heavy remote stack
pretend retrieval is solved once a vector search returns ten strings

Nexus takes a different route:

keep the source of truth local and understandable
enrich memory gradually instead of pretending raw logs are knowledge
dream in bounded cycles instead of open-ended background churn
build a working representation for recall instead of a loose search result list
give operators visibility into what the system is doing

That combination is why the system feels more like a subconscious and less like a database wrapper.

Provider and Embedding Flexibility

Nexus supports generation through providers such as OpenAI, Anthropic, Gemini, OpenRouter, Groq, Z.ai, MiniMax, and Mistral. Embeddings are configured independently of generation.

You can run:

remote generation + remote embeddings
remote generation + local embeddings
local generation + local embeddings
local generation + remote embeddings

That includes OpenAI-compatible local runtimes such as:

vLLM
LM Studio
llama.cpp

Documentation

Start here

Guides

Reference

Repository Layout

.
├── crates/
│   ├── nexus-agent/
│   ├── nexus-cli/
│   ├── nexus-core/
│   ├── nexus-embeddings/
│   ├── nexus-hooks/
│   ├── nexus-lephase/
│   ├── nexus-llm/
│   ├── nexus-mcp/
│   ├── nexus-orchestrator/
│   ├── nexus-storage/
│   ├── nexus-vectors/
│   └── nexus-web/
├── docs/
├── scripts/
├── Cargo.toml
└── Cargo.lock

LLM Provider Evaluation

The nexus eval command tests an LLM provider against the memory system's core operations: structured extraction, consolidation, and query synthesis. Each aspect is scored out of 200 for a total of 600.

Provider / Model                                  Extraction  Consolidation  Query  Total      Rating
OpenRouter / arcee-ai/trinity-large-preview:free  185         185            185    555 / 600  GOOD
Z.ai / glm-4.5                                    180         170            170    520 / 600  GOOD
Groq / llama-3.3-70b-versatile                    160         155            150    465 / 600  ACCEPTABLE
Gemini / gemini-3.1-flash-lite-preview            100         80             55     235 / 600  EXPERIMENTAL
Groq / moonshotai/kimi-k2-instruct-0905           100         80             55     235 / 600  EXPERIMENTAL
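
The totals in the table are the sum of the three aspect scores, each capped at 200. The sketch below shows that arithmetic only. EvalScores and its methods are hypothetical and are not the nexus eval implementation; the thresholds behind the Rating column are not documented here, so the sketch does not attempt to reproduce them.

```rust
// Hypothetical helper mirroring the scoring described above.
#[derive(Debug, Clone, Copy)]
struct EvalScores {
    extraction: u32,
    consolidation: u32,
    query: u32,
}

impl EvalScores {
    const MAX_PER_ASPECT: u32 = 200;
    const MAX_TOTAL: u32 = 3 * Self::MAX_PER_ASPECT;

    /// Sums the three aspects; returns None if any aspect exceeds 200.
    fn total(&self) -> Option<u32> {
        let parts = [self.extraction, self.consolidation, self.query];
        if parts.iter().any(|&p| p > Self::MAX_PER_ASPECT) {
            return None;
        }
        Some(parts.iter().sum())
    }
}
```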

Practical picks

Best overall value: OpenRouter / arcee-ai/trinity-large-preview:free
Strong structured extraction: Z.ai / glm-4.5
Best used selectively or after environment-specific validation: the Gemini and Groq models listed above, particularly the two rated EXPERIMENTAL

Validation

Recommended validation before opening a pull request:

cargo fmt --all --check
cargo clippy --workspace --all-targets -- -D warnings
cargo test --workspace
cargo bench -p nexus-memory-agent --bench cognition

License

This project is licensed under the MIT License.
