mcp-memory

A Model Context Protocol (MCP) server providing LLM agents with a persistent knowledge graph memory — entities, relations, and observations stored in an embedded SQLite database with FTS5 full-text search.

Speaks MCP over stdio, TCP, and HTTP transports.

                    ┌──────────────────────────────────────────────┐
                    │              mcp-memory server               │
                    │                                              │
     ┌───────┐      │  ┌──────────┐   ┌───────────────────────┐   │
     │Claude │──────│─>│  stdio /  │──>│ GraphHandle           │   │
     │Desktop│      │  │  TCP /   │   │  ├ LRU entity cache    │   │
     └───────┘      │  │  HTTP    │   │  ├ FxHashMap name→ID   │   │
                    │  └──────────┘   │  ├ FTS5 full-text idx  │   │
                    │         │       │  └──→ SQLite ──→       │   │
                    │         v       └──────────┬──────────────┘   │
                    │  ┌─────────────────────────┴─────────────┐   │
                    │  │  SQLite (WAL mode, 16 KB pages)        │   │
                    │  │  entity, observation, relation,        │   │
                    │  │  name_fts, obs_fts, type_dict          │   │
                    │  └───────────────────────────────────────┘   │
                    └──────────────────────────────────────────────┘

Installation

cargo install mcp-memory

Quick start

mcp-memory --transport stdio

The database path is resolved in order:

--memory-file / -f flag
MEMORY_FILE_PATH environment variable
Default: memory.mcpmem in the working directory
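
The resolution order above can be sketched as a small pure function. This is an illustration only: `resolve_db_path` and its two parameters are hypothetical names, and treating an empty string like any other value is an assumption, not documented behaviour.

```rust
use std::path::PathBuf;

/// Default database file name, as listed above.
const DEFAULT_DB_FILE: &str = "memory.mcpmem";

/// Resolve the database path: CLI flag first, then the environment
/// variable, then the default file in the working directory.
/// `flag` stands for the parsed --memory-file / -f value and `env` for the
/// value of MEMORY_FILE_PATH (both are hypothetical parameters).
fn resolve_db_path(flag: Option<&str>, env: Option<&str>) -> PathBuf {
    flag.or(env)
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from(DEFAULT_DB_FILE))
}
```

A caller would typically pass `std::env::var("MEMORY_FILE_PATH").ok().as_deref()` as the second argument.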

Transports

Transport	Flag	Description
stdio	`--transport stdio`	Newline-delimited JSON over stdin/stdout (default, for Claude Desktop / Claude Code)
tcp	`--transport tcp --bind 0.0.0.0:8080`	Newline-delimited JSON over TCP, concurrent connections
http	`--transport http --bind 0.0.0.0:8080`	MCP Streamable HTTP (POST/GET `/mcp`)
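
Newline-delimited framing, as used by the stdio and tcp transports, means each line carries exactly one JSON-RPC message. A minimal sketch of the reading side follows; `read_messages` is a hypothetical helper, and skipping blank lines is an assumption rather than documented behaviour.

```rust
use std::io::BufRead;

/// Split a byte stream into one message per line.
/// Blank lines are skipped (an assumption for this sketch).
fn read_messages<R: BufRead>(reader: R) -> std::io::Result<Vec<String>> {
    let mut messages = Vec::new();
    for line in reader.lines() {
        let line = line?;
        let trimmed = line.trim();
        if !trimmed.is_empty() {
            // Each non-empty line is one complete JSON-RPC message.
            messages.push(trimmed.to_string());
        }
    }
    Ok(messages)
}
```

The same function works for a `BufReader` over stdin or over a TCP stream, which is why the two transports can share their framing.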

Claude Desktop config

{
  "mcpServers": {
    "memory": {
      "command": "mcp-memory"
    }
  }
}

Claude Code config

{
  "mcpServers": {
    "memory": {
      "command": "mcp-memory"
    }
  }
}

Authentication

The tcp and http transports accept an optional bearer token (stdio is never authenticated). Set it with --auth-token, --auth-token-file (trimmed; an empty file is rejected), or MCP_MEMORY_AUTH_TOKEN:

mcp-memory --transport tcp --bind 0.0.0.0:8080 --auth-token "s3cr3t"
mcp-memory --transport http --bind 0.0.0.0:8080 --auth-token "s3cr3t"

Binding a non-loopback address without a token exposes the entire graph to the network. Token comparison is constant-time, so response timing does not reveal how much of a guessed token was correct.
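
To illustrate what constant-time comparison means, here is a minimal sketch. It is not the server's actual code: `constant_time_eq` and `is_authorized` are hypothetical names, the `Authorization: Bearer` header shape applies to HTTP only (how the tcp transport carries the token is not shown here), and production code often relies on a vetted crate instead of a hand-written loop.

```rust
/// Compare two byte strings without returning early on the first mismatch.
/// Note: the length check still reveals whether the lengths differ.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences instead of branching on each byte.
        diff |= x ^ y;
    }
    diff == 0
}

/// Check an HTTP-style `Authorization` header value against the token.
/// The header format is an assumption for this sketch.
fn is_authorized(header: Option<&str>, expected_token: &str) -> bool {
    match header.and_then(|h| h.strip_prefix("Bearer ")) {
        Some(presented) => constant_time_eq(presented.as_bytes(), expected_token.as_bytes()),
        None => false,
    }
}
```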

MCP Compliance

Implements the Model Context Protocol revision 2025-11-25 over JSON-RPC 2.0, via stdio, TCP, or HTTP.

Area	Support
Transports	stdio, TCP, Streamable HTTP (POST/GET `/mcp`, SSE)
Protocol version	`2025-11-25`, negotiates down to `2025-06-18` / `2025-03-26` / `2024-11-05`
`initialize`	version negotiation + `instructions`
`tools/list`, `tools/call`	26 tools
`CallToolResult`	`content[]` + `isError`
Auth	optional bearer token on TCP/HTTP (constant-time)
Capabilities advertised	`tools` only

Tool failures are returned as CallToolResults with isError: true (not as JSON-RPC protocol errors) so the model can self-correct.
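
As an illustration of that shape, the sketch below builds a JSON-RPC success response whose result is a CallToolResult with isError set. The helper names are hypothetical, and the string formatting is done by hand only to keep the example dependency-free; a real server would more likely use a JSON library.

```rust
/// Escape a string for embedding in a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Wrap a tool failure as a normal JSON-RPC result with `isError: true`,
/// rather than as a top-level JSON-RPC `error` object.
fn tool_error_response(id: u64, message: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"result":{{"content":[{{"type":"text","text":"{}"}}],"isError":true}}}}"#,
        id,
        json_escape(message)
    )
}
```

Because the failure text arrives in `content[]`, the model sees it like any other tool output and can retry with corrected arguments.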

Data model

Entity(name, entityType, observations[])
  |                          |
  |  —— relationType ——→   |
  v                          v
Entity(name, entityType, observations[])

Entity — a named node with a type (e.g. person, company, project) and free-form observation strings.
Relation — a directed edge (from, to, relationType) between two entities. Edges are stored with a direction, but traversal ignores it: BFS follows each edge both ways.
Observation — an unstructured fact attached to an entity.
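
A minimal sketch of these three concepts as plain Rust types, with a neighbor lookup that follows edges in both directions. The struct layout and the `neighbors` helper are illustrative assumptions; the server's internal representation is the SQLite schema described later.

```rust
/// A named node with a type and free-form observations.
#[derive(Debug, Clone, PartialEq)]
struct Entity {
    name: String,
    entity_type: String,
    observations: Vec<String>,
}

/// A directed edge between two entities, identified by name.
#[derive(Debug, Clone, PartialEq)]
struct Relation {
    from: String,
    to: String,
    relation_type: String,
}

/// Names adjacent to `name`, ignoring edge direction.
fn neighbors<'a>(relations: &'a [Relation], name: &str) -> Vec<&'a str> {
    let mut out = Vec::new();
    for r in relations {
        if r.from == name {
            out.push(r.to.as_str());
        } else if r.to == name {
            out.push(r.from.as_str());
        }
    }
    out
}
```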

Search uses FTS5 full-text indexing with unicode61 remove_diacritics 2 tokenization. Entity names and observation bodies are indexed in separate FTS5 virtual tables (name_fts, obs_fts), both configured as external-content tables that reference the core tables instead of storing a second copy of the text.

Data structures & performance

Storage engine: SQLite (WAL mode)

A single SQLite database in WAL mode with the following schema:

Table	Key	Purpose
`entity`	`INTEGER PRIMARY KEY` (rowid)	Primary entity storage; materialized `obs_count`, `out_deg`, `in_deg`; `name_hash` for O(1) routing
`observation`	`entity_id` (FK) + rowid	1:N observations per entity
`relation`	composite indexes	Directed edges; covering indexes `rel_out(from_id,type_id,to_id)` and `rel_in(to_id,type_id,from_id)` for index-only scans
`name_fts`	`content_rowid`	External-content FTS5 over `entity.name`
`obs_fts`	`content_rowid`	External-content FTS5 over `observation.body`
`type_dict`	name	Interned entity/relation types with live counts (loaded into RAM)
`graph_stat`	key (singleton)	`WITHOUT ROWID` counters: entities, relations, observations, entity_seq, obs_seq
`hub_degree`	entity_id	Degree spill for high-degree hubs
`partition_map`	entity_id	Reserved for future entity-type partitioning

Key SQLite pragmas: page_size=16384, journal_mode=WAL, synchronous=NORMAL, cache_size=-50000 (~50 MB), mmap_size=256 MB, temp_store=MEMORY, busy_timeout=5000.

In-memory caches (GraphHandle)

Cache	Size	Purpose
Entity LRU	10,000 entries	Avoids deserializing hot entities; stores `EntityMeta{id, type_id, obs_count, out_deg, in_deg}`
Name hash FxHashMap	all loaded	O(1) name-to-ID resolution via 64-bit FNV-1a hash
Prepared statement cache	SQLite internal	Reuses compiled queries
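
The name-to-ID lookup can be sketched with the standard 64-bit FNV-1a hash. In this illustration `std::collections::HashMap` stands in for FxHashMap, `NameIndex` is a hypothetical type, and hash collisions are ignored; how the server handles collisions is not covered here.

```rust
use std::collections::HashMap;

const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Standard 64-bit FNV-1a: XOR each byte in, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

/// Hypothetical in-memory index from name hash to entity row ID.
struct NameIndex {
    by_hash: HashMap<u64, i64>,
}

impl NameIndex {
    fn new() -> Self {
        NameIndex { by_hash: HashMap::new() }
    }

    fn insert(&mut self, name: &str, id: i64) {
        self.by_hash.insert(fnv1a_64(name.as_bytes()), id);
    }

    /// Average O(1): one hash computation plus one map probe.
    fn resolve(&self, name: &str) -> Option<i64> {
        self.by_hash.get(&fnv1a_64(name.as_bytes())).copied()
    }
}
```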

Write batching

Every mutation goes through a layered write path:

Existence checks — batch-read entity existence in one read transaction
Batch commit — all new entities/relations written in one write transaction
Batch index — all FTS entries updated in one write transaction
Cache invalidation — LRU entries for affected names are evicted

This reduces transaction count from O(N) to O(1) per create_entities/create_relations call.
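
The effect on transaction count can be shown with a toy model. `Store` below is a hypothetical in-memory stand-in that only counts simulated write transactions; it contains no SQLite code, and the per-entity path exists purely for contrast.

```rust
use std::collections::HashSet;

#[derive(Default)]
struct Store {
    names: HashSet<String>,
    write_txns: usize, // simulated write-transaction counter
}

impl Store {
    /// Naive path: one write transaction per new entity, O(N) commits.
    fn create_one_by_one(&mut self, batch: &[&str]) -> usize {
        let mut created = 0;
        for name in batch {
            if !self.names.contains(*name) {
                self.names.insert((*name).to_string());
                self.write_txns += 1;
                created += 1;
            }
        }
        created
    }

    /// Batched path: filter existing and duplicate names first, then
    /// commit all new rows together and index them together.
    fn create_batched(&mut self, batch: &[&str]) -> usize {
        let mut seen = HashSet::new();
        let fresh: Vec<&str> = batch
            .iter()
            .copied()
            .filter(|n| !self.names.contains(*n) && seen.insert(*n))
            .collect();
        if fresh.is_empty() {
            return 0;
        }
        self.names.extend(fresh.iter().map(|n| n.to_string()));
        self.write_txns += 2; // one for the rows, one for the FTS entries
        fresh.len()
    }
}
```

With three new entities the naive path commits three times, while the batched path commits twice no matter how large the batch is.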

Durability

Mode	Behavior	Data loss window
`async` (default)	Flush to kernel page cache, background sync	Up to ~1 second on power failure
`sync`	fsync on every commit, before the write is acknowledged	Zero

Set via MCP_MEMORY_DURABILITY=sync.

Background maintenance

A background tokio task runs every 5 minutes and performs WAL checkpointing (PRAGMA wal_checkpoint(TRUNCATE)), query planner analysis (PRAGMA optimize), and FTS optimization.

Benchmarks

Measured end-to-end with the bench binary on a MacBook Pro (M4 Pro, 24 GB), against a graph pre-populated with 1,000 entities and 200 relations.

These figures are hardware-specific. To reproduce them on your own target, run cargo run --release --bin bench.

Operation	Avg latency	Notes
`get_entity` (cache hit)	~20 µs	LRU hit; no SQLite I/O
`search_nodes` (name match)	~25 µs	FTS5 query + entity lookup
`open_nodes` (single)	~30 µs	LRU + SQLite
`open_nodes` (5 names)	~60 µs	Batch fetch
`neighbors` depth=1	~30 µs	Index-only scan via covering index
`neighbors` depth=2	~55 µs	Two-hop traversal
`find_path` (BFS)	~650 µs	Worst case: target not found, full BFS
`describe_entity`	~30 µs	Entity + incident relations
`graph_stats`	~15 µs	RAM counters (graph_stat table)
`read_graph` (all)	~1500 µs	Full dump: all entities + relations
`create_entities` (1000)	~2000 µs	Batch write + FTS index
`create_relations` (999)	~1200 µs	Batch write + degree updates
`find_all_paths` (A→C, depth 5)	~100 µs	Bounded DFS
`export_graph` (JSON)	~600 µs	Serialize all entities + relations
`entity_type_counts`	~10 µs	RAM-cached type dictionary
`degree` (cache hit)	~2 µs	Materialized column
`entities_exist` (10 names)	~15 µs	Hash lookup via FxHashMap

Tools

Write tools

create_entities — batch create, skips existing names
create_relations — batch create, skips missing entities and duplicates
add_observations — append to entity, deduplicates
delete_entities — cascade deletes incident relations
delete_observations — remove specific observations
delete_relations — remove exact (from, to, type) tuples
upsert_entities — create or merge (type preserved, observations unioned)
merge_entities — source → target redirect with full dedup
compact — trigger incremental vacuum + FTS optimize
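
The upsert_entities semantics (type preserved, observations unioned) can be sketched over an in-memory map. This is an assumption-laden illustration: the `upsert` function, the `BTreeMap` store, and the choice to keep first-seen order when deduplicating are not taken from the server's code.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct Entity {
    entity_type: String,
    observations: Vec<String>,
}

/// Create the entity if it is missing; otherwise keep its existing type
/// and append only observations it does not already have.
fn upsert(
    graph: &mut BTreeMap<String, Entity>,
    name: &str,
    entity_type: &str,
    observations: &[&str],
) {
    let entry = graph.entry(name.to_string()).or_insert_with(|| Entity {
        entity_type: entity_type.to_string(),
        observations: Vec::new(),
    });
    for obs in observations {
        if !entry.observations.iter().any(|o| o.as_str() == *obs) {
            entry.observations.push((*obs).to_string());
        }
    }
}
```

The deduplicating loop is also what add_observations needs, so in a real codebase the two tools could plausibly share it.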

Read tools

read_graph — dump all entities + relations (with optional type filter, offset, limit)
search_nodes — FTS5-ranked search over names, types, observations (with optional type filter)
open_nodes — fetch specific entities by name (with their relations)
batch_get_entities — bulk entity fetch (order preserved, null for missing)
get_entity — single entity by name
entity_exists — cheap existence check (hash lookup, no observation bodies fetched)
graph_stats — entity count, relation count, total observations
search_relations — filter by from/to/type
describe_entity — entity + incident relations + neighbors + degree
degree — number of incident relations by direction (outgoing / incoming / both)
find_path — BFS shortest path (undirected)
find_all_paths — DFS all simple paths (bounded by maxDepth, maxPaths)
extract_subgraph — BFS around seed entities to given depth
get_neighbors — entity neighbors with direction + type + depth filters
list_entity_types — type → count, ranked
list_relation_types — type → count, ranked
export_graph — JSON, Mermaid, or Graphviz DOT
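
For readers unfamiliar with the traversal behind find_path, here is a minimal sketch of an undirected BFS shortest path over entity names. The `adjacency` and `find_path` functions are hypothetical; the server runs its traversal against the SQLite indexes rather than an in-memory map.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Build an undirected adjacency map from directed (from, to) pairs.
fn adjacency<'a>(edges: &[(&'a str, &'a str)]) -> HashMap<&'a str, Vec<&'a str>> {
    let mut adj: HashMap<&str, Vec<&str>> = HashMap::new();
    for &(from, to) in edges {
        adj.entry(from).or_default().push(to);
        adj.entry(to).or_default().push(from);
    }
    adj
}

/// BFS shortest path by hop count; returns the names from `start` to `goal`.
fn find_path<'a>(
    adj: &HashMap<&'a str, Vec<&'a str>>,
    start: &'a str,
    goal: &'a str,
) -> Option<Vec<&'a str>> {
    let mut parent: HashMap<&str, &str> = HashMap::new();
    let mut visited: HashSet<&str> = HashSet::from([start]);
    let mut queue = VecDeque::from([start]);
    while let Some(node) = queue.pop_front() {
        if node == goal {
            // Walk the parent links back to the start, then reverse.
            let mut path = vec![node];
            let mut cur = node;
            while let Some(&p) = parent.get(cur) {
                path.push(p);
                cur = p;
            }
            path.reverse();
            return Some(path);
        }
        for &next in adj.get(node).into_iter().flatten() {
            if visited.insert(next) {
                parent.insert(next, node);
                queue.push_back(next);
            }
        }
    }
    None // the whole reachable component was explored without a hit
}
```

The `None` branch is the expensive case: every reachable node is visited before the search gives up.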

Architecture

main.rs
  │
  ├── MCPServer::run_stdio()   — stdio transport (newline-delimited JSON-RPC)
  ├── MCPServer::run_tcp()     — TCP transport (same framing, concurrent conns)
  └── MCPServer::run_http()    — MCP Streamable HTTP (axum, POST/GET /mcp)
        │
        └── process_request()
              │
              ├── "initialize"     → protocol version + capabilities
              ├── "tools/list"     → cached from tools.rs
              ├── "tools/call"     → dispatches to handler by name
              ├── "ping"           → null
              └── "notifications/" → no reply

All three transports share process_value() / dispatch_line() / dispatch_http_body() — the dispatch core is transport-agnostic.
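
The method routing in the tree above can be sketched as a single match. `Route`, `route`, and `expects_reply` are hypothetical names for illustration; the handlers themselves are omitted.

```rust
#[derive(Debug, PartialEq)]
enum Route {
    Initialize,
    ToolsList,
    ToolsCall,
    Ping,
    Notification,
    Unknown,
}

/// Map a JSON-RPC method name to a route, independent of transport.
fn route(method: &str) -> Route {
    match method {
        "initialize" => Route::Initialize,
        "tools/list" => Route::ToolsList,
        "tools/call" => Route::ToolsCall,
        "ping" => Route::Ping,
        m if m.starts_with("notifications/") => Route::Notification,
        // JSON-RPC 2.0 defines -32601 "Method not found" for this case.
        _ => Route::Unknown,
    }
}

/// Notifications never produce a response; everything else does.
fn expects_reply(route: &Route) -> bool {
    !matches!(route, Route::Notification)
}
```

Keeping this function free of I/O is what lets stdio, TCP, and HTTP call into the same core.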

Locking

GraphHandle uses parking_lot::Mutex for the SQLite connection and LRU caches
All GraphHandle methods take &self — internal Mutex handles mutation
Tokio multi-thread runtime handles concurrent requests
SQLite WAL mode allows concurrent readers + one writer
Heavy dispatch (graph lock + optional fsync) is offloaded to tokio::task::spawn_blocking
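
The `&self` plus internal Mutex pattern can be sketched as follows. Here `std::sync::Mutex` stands in for parking_lot::Mutex (whose `lock()` returns the guard directly, without a `Result`), a `HashMap` stands in for the SQLite connection and caches, and the method names are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

struct GraphHandle {
    // Stand-in for the connection and caches guarded by the lock.
    inner: Mutex<HashMap<String, u64>>,
}

impl GraphHandle {
    fn new() -> Self {
        GraphHandle { inner: Mutex::new(HashMap::new()) }
    }

    /// Takes `&self`: mutation happens behind the Mutex.
    fn add_observation(&self, name: &str) {
        let mut guard = self.inner.lock().unwrap();
        *guard.entry(name.to_string()).or_insert(0) += 1;
    }

    fn obs_count(&self, name: &str) -> u64 {
        self.inner.lock().unwrap().get(name).copied().unwrap_or(0)
    }
}

/// Run `threads` workers that each add `per_thread` observations.
fn hammer(handle: Arc<GraphHandle>, threads: usize, per_thread: usize) {
    let workers: Vec<_> = (0..threads)
        .map(|_| {
            let h = Arc::clone(&handle);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    h.add_observation("alice");
                }
            })
        })
        .collect();
    for w in workers {
        w.join().unwrap();
    }
}
```

Because every method takes `&self`, one `Arc<GraphHandle>` can be cloned into each connection task without any outer lock.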

Write path

create_entities([e1, e2, ...])
  1. Batch-check existence (FxHashMap hash lookup)
  2. Batch-insert entities (one write txn)
  3. Batch-index FTS (one write txn for name_fts)
  4. Invalidate LRU caches
  5. Update type_dict counts

The same batching pattern applies to create_relations (with degree updates).

Storage (SQLite)

SQLite provides the storage layer with:

WAL mode — concurrent readers + one writer without blocking readers
16 KB pages — shallower B-trees for faster lookups
FTS5 — full-text search with unicode61 remove_diacritics 2 tokenization
mmap — up to 256 MB of the database mapped for faster reads
Covering indexes — rel_out and rel_in enable index-only neighbor scans
Materialized counters — obs_count, out_deg, in_deg, type_dict.count, graph_stat are writer-maintained for O(1) reads
External-content FTS5 — avoids duplicating text; stable INTEGER PRIMARY KEY ensures content_rowid correctness across VACUUM

Concurrency model

The TCP transport accepts at most 128 concurrent connections
Mutating operations acquire GraphHandle lock and serialize through SQLite
Read operations can proceed concurrently under WAL mode
Background maintenance runs every 5 minutes as a tokio task

Request size limits

Parameter	Limit
Max request body	16 MB
Name max bytes	1024
Observation max bytes	65,536
Max entities per request	1,000
Max relations per request	1,000
Max observations per entity	1,000
Max names per request	1,000
Max search limit	1,000
Max neighbor depth	16
Max relation search results	1,000
Max find_all_paths depth	10
Max find_all_paths results	100
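
A sketch of how a few of these limits could be enforced before a create_entities call touches the database. The constants mirror the table; the function name, the error strings, and the choice to apply the observation cap to each entity in the request are assumptions.

```rust
const NAME_MAX_BYTES: usize = 1024;
const OBSERVATION_MAX_BYTES: usize = 65_536;
const MAX_ENTITIES_PER_REQUEST: usize = 1_000;
const MAX_OBSERVATIONS_PER_ENTITY: usize = 1_000;

/// Validate a batch of (name, observations) pairs against the limits.
/// Lengths are measured in bytes, not characters.
fn validate_create_entities(batch: &[(&str, &[&str])]) -> Result<(), String> {
    if batch.len() > MAX_ENTITIES_PER_REQUEST {
        return Err(format!("too many entities: {}", batch.len()));
    }
    for (name, observations) in batch {
        if name.len() > NAME_MAX_BYTES {
            return Err(format!("name too long: {} bytes", name.len()));
        }
        if observations.len() > MAX_OBSERVATIONS_PER_ENTITY {
            return Err(format!("too many observations: {}", observations.len()));
        }
        if let Some(o) = observations.iter().find(|o| o.len() > OBSERVATION_MAX_BYTES) {
            return Err(format!("observation too long: {} bytes", o.len()));
        }
    }
    Ok(())
}
```

Since the limits are byte counts, a name made of multi-byte UTF-8 characters reaches the cap with fewer than 1024 characters.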

Development

cargo test             # unit + integration + fuzz (300+ tests)
cargo clippy --all-targets
cargo build --release  # fat LTO, panic=abort, strip
cargo run --release --bin bench  # standalone benchmark

The test suite includes:

Unit tests — protocol, tools, config, error codes
Integration tests — CRUD persistence, search, paths, export, concurrency, all 26 tool handlers, invariants
Fuzz tests — randomized CRUD sequences asserting graph invariants

Versioning & Compatibility

Follows Semantic Versioning. The current line is 2.x, targeting MCP revision 2025-11-25.

mcp-memory	MCP revision (default)	Negotiates
2.x	`2025-11-25`	`2025-06-18`, `2025-03-26`, `2024-11-05`
≤ 1.x	`2024-11-05`	—
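
Version negotiation for the 2.x row can be sketched as below. The fallback to the newest supported revision follows the MCP specification's guidance for initialize; whether the server implements it exactly this way is an assumption, and `negotiate` is a hypothetical name.

```rust
/// Revisions supported by the 2.x line, newest first (from the table above).
const SUPPORTED: [&str; 4] = ["2025-11-25", "2025-06-18", "2025-03-26", "2024-11-05"];

/// Echo the client's requested revision when it is supported;
/// otherwise answer with the newest revision the server speaks.
fn negotiate(requested: &str) -> &'static str {
    SUPPORTED
        .iter()
        .copied()
        .find(|v| *v == requested)
        .unwrap_or(SUPPORTED[0])
}
```

A client that receives a revision it cannot speak is expected to disconnect, so the fallback never forces an incompatible session.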

License

Licensed under the Apache License, Version 2.0.

mcp-memory 3.0.0