mcp-memory 3.0.0

MCP server for knowledge graph memory — entities, relations, and observations persisted in SQLite with FTS5 search
Documentation

mcp-memory

A Model Context Protocol (MCP) server providing LLM agents with a persistent knowledge graph memory — entities, relations, and observations stored in an embedded SQLite database with FTS5 full-text search.

Speaks MCP over stdio, TCP, and HTTP transports.

                    ┌──────────────────────────────────────────────┐
                    │              mcp-memory server               │
                    │                                              │
     ┌───────┐      │  ┌──────────┐   ┌───────────────────────┐   │
     │Claude │──────│─>│  stdio /  │──>│ GraphHandle           │   │
     │Desktop│      │  │  TCP /   │   │  ├ LRU entity cache    │   │
     └───────┘      │  │  HTTP    │   │  ├ FxHashMap name→ID   │   │
                    │  └──────────┘   │  ├ FTS5 full-text idx  │   │
                    │         │       │  └──→ SQLite ──→       │   │
                    │         v       └──────────┬──────────────┘   │
                    │  ┌─────────────────────────┴─────────────┐   │
                    │  │  SQLite (WAL mode, 16 KB pages)        │   │
                    │  │  entity, observation, relation,        │   │
                    │  │  name_fts, obs_fts, type_dict          │   │
                    │  └───────────────────────────────────────┘   │
                    └──────────────────────────────────────────────┘

Installation

cargo install mcp-memory

Quick start

mcp-memory --transport stdio

The database path is resolved in order:

  1. --memory-file / -f flag
  2. MEMORY_FILE_PATH environment variable
  3. Default: memory.mcpmem in the working directory

Transports

Transport Flag Description
stdio --transport stdio Newline-delimited JSON over stdin/stdout (default, for Claude Desktop / Claude Code)
tcp --transport tcp --bind 0.0.0.0:8080 Newline-delimited JSON over TCP, concurrent connections
http --transport http --bind 0.0.0.0:8080 MCP Streamable HTTP (POST/GET /mcp)

Claude Desktop config

{
  "mcpServers": {
    "memory": {
      "command": "mcp-memory"
    }
  }
}

Claude Code config

{
  "mcpServers": {
    "memory": {
      "command": "mcp-memory"
    }
  }
}

Authentication

The tcp and http transports accept an optional bearer token (stdio is never authenticated). Set it with --auth-token, --auth-token-file (trimmed; an empty file is rejected), or MCP_MEMORY_AUTH_TOKEN:

mcp-memory --transport tcp --bind 0.0.0.0:8080 --auth-token "s3cr3t"
mcp-memory --transport http --bind 0.0.0.0:8080 --auth-token "s3cr3t"

Binding a non-loopback address without a token exposes the entire graph to the network. Comparison is constant-time.

MCP Compliance

Implements the Model Context Protocol revision 2025-11-25 over JSON-RPC 2.0, via stdio, TCP, or HTTP.

Area Support
Transports stdio, TCP, Streamable HTTP (POST/GET /mcp, SSE)
Protocol version 2025-11-25, negotiates down to 2025-06-18 / 2025-03-26 / 2024-11-05
initialize version negotiation + instructions
tools/list, tools/call 26 tools
CallToolResult content[] + isError
Auth optional bearer token on TCP/HTTP (constant-time)
Capabilities advertised tools only

Tool failures are returned as CallToolResults with isError: true (not as JSON-RPC protocol errors) so the model can self-correct.

Data model

Entity(name, entityType, observations[])
  |                          |
  |  —— relationType ——→   |
  v                          v
Entity(name, entityType, observations[])
  • Entity — a named node with a type (e.g. person, company, project) and free-form observation strings.
  • Relation — a directed edge (from, to, relationType) between two entities. Relations are undirected in traversal (BFS follows both ways).
  • Observation — an unstructured fact attached to an entity.

Search uses FTS5 full-text indexing with unicode61 remove_diacritics 2 tokenization. Name and observation bodies live in separate FTS5 virtual tables (name_fts, obs_fts) with external content referencing the core tables.

Data structures & performance

Storage engine: SQLite (WAL mode)

A single SQLite database in WAL mode with the following schema:

Table Key Purpose
entity INTEGER PRIMARY KEY (rowid) Primary entity storage; materialized obs_count, out_deg, in_deg; name_hash for O(1) routing
observation entity_id (FK) + rowid 1:N observations per entity
relation composite indexes Directed edges; covering indexes rel_out(from_id,type_id,to_id) and rel_in(to_id,type_id,from_id) for index-only scans
name_fts content_rowid External-content FTS5 over entity.name
obs_fts content_rowid External-content FTS5 over observation.body
type_dict name Interned entity/relation types with live counts (loaded into RAM)
graph_stat key (singleton) WITHOUT ROWID counters: entities, relations, observations, entity_seq, obs_seq
hub_degree entity_id Degree spill for high-degree hubs
partition_map entity_id Reserved for future entity-type partitioning

Key SQLite pragmas: page_size=16384, journal_mode=WAL, synchronous=NORMAL, cache_size=-50000 (~50 MB), mmap_size=256 MB, temp_store=MEMORY, busy_timeout=5000.

In-memory caches (GraphHandle)

Cache Size Purpose
Entity LRU 10,000 entries Avoids deserializing hot entities; stores EntityMeta{id, type_id, obs_count, out_deg, in_deg}
Name hash FxHashMap all loaded O(1) name-to-ID resolution via 64-bit FNV-1a hash
Prepared statement cache SQLite internal Reuses compiled queries

Write batching

Every mutation goes through a layered write path:

  1. Existence checks — batch-read entity existence in one read transaction
  2. Batch commit — all new entities/relations written in one write transaction
  3. Batch index — all FTS entries updated in one write transaction
  4. Cache invalidation — LRU entries for affected names are evicted

This reduces transaction count from O(N) to O(1) per create_entities/create_relations call.

Durability

Mode Behavior Data loss window
async (default) Flush to kernel page cache, background sync Up to ~1 second on power failure
sync fsync before every write Zero

Set via MCP_MEMORY_DURABILITY=sync.

Background maintenance

A background tokio task runs every 5 minutes and performs WAL checkpointing (PRAGMA wal_checkpoint(TRUNCATE)), query planner analysis (PRAGMA optimize), and FTS optimization.

Benchmarks

Measured end-to-end via the bench binary. 1,000 entities + 200 relations pre-populated. MacBook Pro (M4 Pro, 24 GB).

Run cargo run --release --bin bench on your target hardware.

Operation Avg latency Notes
get_entity (cache hit) ~20 µs LRU hit; no SQLite I/O
search_nodes (name match) ~25 µs FTS5 query + entity lookup
open_nodes (single) ~30 µs LRU + SQLite
open_nodes (5 names) ~60 µs Batch fetch
neighbors depth=1 ~30 µs Index-only scan via covering index
neighbors depth=2 ~55 µs Two-hop traversal
find_path (BFS) ~650 µs Worst case: target not found, full BFS
describe_entity ~30 µs Entity + incident relations
graph_stats ~15 µs RAM counters (graph_stat table)
read_graph (all) ~1500 µs Full dump: all entities + relations
create_entities (1000) ~2000 µs Batch write + FTS index
create_relations (999) ~1200 µs Batch write + degree updates
find_all_paths (A→C, depth 5) ~100 µs Bounded DFS
export_graph (JSON) ~600 µs Serialize all entities + relations
entity_type_counts ~10 µs RAM-cached type dictionary
degree (cache hit) ~2 µs Materialized column
entities_exist (10 names) ~15 µs Hash lookup via FxHashMap

Tools

Write tools

  • create_entities — batch create, skips existing names
  • create_relations — batch create, skips missing entities and duplicates
  • add_observations — append to entity, deduplicates
  • delete_entities — cascade deletes incident relations
  • delete_observations — remove specific observations
  • delete_relations — remove exact (from, to, type) tuples
  • upsert_entities — create or merge (type preserved, observations unioned)
  • merge_entities — source → target redirect with full dedup
  • compact — trigger incremental vacuum + FTS optimize

Read tools

  • read_graph — dump all entities + relations (with optional type filter, offset, limit)
  • search_nodes — FTS5-ranked search over names, types, observations (with optional type filter)
  • open_nodes — fetch specific entities by name (with their relations)
  • batch_get_entities — bulk entity fetch (order preserved, null for missing)
  • get_entity — single entity by name
  • entity_exists — cheap existence check (hash lookup, no observation bodies fetched)
  • graph_stats — entity count, relation count, total observations
  • search_relations — filter by from/to/type
  • describe_entity — entity + incident relations + neighbors + degree
  • degree — number of incident relations by direction (outgoing / incoming / both)
  • find_path — BFS shortest path (undirected)
  • find_all_paths — DFS all simple paths (bounded by maxDepth, maxPaths)
  • extract_subgraph — BFS around seed entities to given depth
  • get_neighbors — entity neighbors with direction + type + depth filters
  • list_entity_types — type → count, ranked
  • list_relation_types — type → count, ranked
  • export_graph — JSON, Mermaid, or Graphviz DOT

Architecture

main.rs
  │
  ├── MCPServer::run_stdio()   — stdio transport (newline-delimited JSON-RPC)
  ├── MCPServer::run_tcp()     — TCP transport (same framing, concurrent conns)
  └── MCPServer::run_http()    — MCP Streamable HTTP (axum, POST/GET /mcp)
        │
        └── process_request()
              │
              ├── "initialize"     → protocol version + capabilities
              ├── "tools/list"     → cached from tools.rs
              ├── "tools/call"     → dispatches to handler by name
              ├── "ping"           → null
              └── "notifications/" → no reply

All three transports share process_value() / dispatch_line() / dispatch_http_body() — the dispatch core is transport-agnostic.

Locking

  • GraphHandle uses parking_lot::Mutex for the SQLite connection and LRU caches
  • All GraphHandle methods take &self — internal Mutex handles mutation
  • Tokio multi-thread runtime handles concurrent requests
  • SQLite WAL mode allows concurrent readers + one writer
  • Heavy dispatch (graph lock + optional fsync) is offloaded to tokio::task::spawn_blocking

Write path

create_entities([e1, e2, ...])
  1. Batch-check existence (FxHashMap hash lookup)
  2. Batch-insert entities (one write txn)
  3. Batch-index FTS (one write txn for name_fts)
  4. Invalidate LRU caches
  5. Update type_dict counts

The same batching pattern applies to create_relations (with degree updates).

Storage (SQLite)

SQLite provides the storage layer with:

  • WAL mode — concurrent readers + one writer without blocking readers
  • 16 KB pages — shallower B-trees for faster lookups
  • FTS5 — full-text search with unicode61 remove_diacritics 2 tokenization
  • mmap — up to 256 MB of the database mapped for faster reads
  • Covering indexesrel_out and rel_in enable index-only neighbor scans
  • Materialized countersobs_count, out_deg, in_deg, type_dict.count, graph_stat are writer-maintained for O(1) reads
  • External-content FTS5 — avoids duplicating text; stable INTEGER PRIMARY KEY ensures content_rowid correctness across VACUUM

Concurrency model

  • TCP connections limited to 128 concurrent connections
  • Mutating operations acquire GraphHandle lock and serialize through SQLite
  • Read operations can proceed concurrently under WAL mode
  • Background maintenance runs every 5 minutes as a tokio task

Request size limits

Parameter Limit
Max request body 16 MB
Name max bytes 1024
Observation max bytes 65,536
Max entities per request 1,000
Max relations per request 1,000
Max observations per entity 1,000
Max names per request 1,000
Max search limit 1,000
Max neighbor depth 16
Max relation search results 1,000
Max find_all_paths depth 10
Max find_all_paths results 100

Development

cargo test             # unit + integration + fuzzy (300+ tests)
cargo clippy --all-targets
cargo build --release  # LTO + fat, panic=abort, strip
cargo run --release --bin bench  # standalone benchmark

The test suite includes:

  • Unit tests — protocol, tools, config, error codes
  • Integration tests — CRUD persistence, search, paths, export, concurrency, all 26 tool handlers, invariants
  • Fuzzy tests — randomized CRUD sequences asserting graph invariants

Versioning & Compatibility

Follows Semantic Versioning. The current line is 2.x, targeting MCP revision 2025-11-25.

mcp-memory MCP revision (default) Negotiates
2.x 2025-11-25 2025-06-18, 2025-03-26, 2024-11-05
≤ 1.x 2024-11-05

License

Licensed under the Apache License, Version 2.0.