erinra 0.1.0-alpha.2

Memory MCP server for LLM coding assistants
docs.rs failed to build erinra-0.1.0-alpha.2
Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.

Erinra

CI Crates.io License: MIT

A memory MCP server for LLM coding assistants. Single Rust binary, local SQLite storage, local embeddings, zero external dependencies.

Erinra stores, indexes, and retrieves memories via the Model Context Protocol (MCP) over stdio transport. The server handles storage and retrieval; the calling LLM makes all semantic decisions (deduplication, merging, corrections).

Features

  • Hybrid search -- vector similarity (sqlite-vec) + full-text search (FTS5), merged via Reciprocal Rank Fusion
  • Cross-encoder reranking (experimental) -- optional second-stage reranking with JINA Reranker for improved search precision
  • Local embeddings -- fastembed with Nomic Embed Text v1.5, no external API calls
  • Non-destructive -- store always creates, archive instead of delete, update is the only destructive operation
  • Fat responses -- store returns top-3 similar memories so the LLM can react without follow-up calls
  • Web dashboard -- browser-based UI for browsing, searching, and inspecting memories
  • Shared daemon -- multiple MCP sessions share one embedding model and database via a background daemon, saving ~288 MB per additional session
  • Multi-machine sync -- JSONL export/import with background sync via filesystem watching or polling
  • Single file -- all data in one SQLite database with WAL mode for concurrent access

Quick Start

Install from crates.io

cargo install erinra

Install from GitHub

cargo install --git https://github.com/alphaleonis/erinra-mcp.git

Or build from source

Requires Node.js (for building the web dashboard frontend).

git clone https://github.com/alphaleonis/erinra-mcp.git
cd erinra-mcp
cargo build --release

Add to Claude Code

claude mcp add erinra -s user -- erinra serve

The first run downloads the Nomic embedding model (~137 MB) to ~/.erinra/models/. Subsequent starts are fast.

Verify it works

erinra status

Open the web dashboard

erinra dash

MCP Tools

Tool Description
store Save a memory. Returns similar existing memories for dedup detection.
search Find memories by semantic similarity and keyword matching.
get Fetch full details of specific memories by ID.
list Browse and filter memories without a search query.
discover View taxonomy (projects, types, tags) with usage counts and stats.
context Batched session-start search: run multiple queries in one call with a content budget.
update Modify content or metadata. Re-embeds automatically on content change.
archive Soft-delete a memory. Excluded from search/list by default.
merge Combine multiple memories into one. Archives sources with supersedes links.
link Create a directed relationship between memories.
unlink Remove a relationship.

CLI Commands

Global options

erinra [OPTIONS] <COMMAND>

Options:
    --data-dir <PATH>    Data directory [default: ~/.erinra] [env: ERINRA_DATA_DIR]
    -h, --help           Print help
    -V, --version        Print version

serve -- Start MCP server

erinra serve [OPTIONS]

Starts the MCP server on stdio transport. When a web dashboard daemon is running, serve automatically uses relay mode -- bridging stdio to the daemon's HTTP endpoint instead of loading models locally. This saves ~288 MB of memory per additional session. Falls back to standalone mode if the daemon is unavailable.

Option Description
--log-level <LEVEL> Override log level (e.g. debug, info, erinra=debug)
--log-file <PATH> Override log file path
--busy-timeout <MS> Override SQLite busy timeout in milliseconds
--embedding-model <NAME> Override embedding model name
--reranker-model <NAME> Override reranker model name (also enables reranking)
--web Also start the web dashboard (background daemon)
--port <PORT> Override web server port (requires --web)
--bind <ADDR> Override web server bind address (requires --web)

dash -- Open web dashboard

erinra dash [OPTIONS]

Starts (or joins) the background daemon and opens the dashboard in a browser. The daemon process is ref-counted -- it stays alive while any serve --web or dash process is running, and shuts down automatically when all clients exit.

Option Description
--port <PORT> Override web server port
--bind <ADDR> Override web server bind address
--no-open Don't open the browser automatically
--open-only Open the browser and exit immediately (requires a running daemon)

export -- Export memories

erinra export <OUTPUT> [OPTIONS]
Option Description
--gzip Compress output with gzip
--since <TIMESTAMP> Only export memories created/updated after this timestamp

import -- Import memories

erinra import <INPUT>

Imports from JSONL file. Auto-detects gzip compression.

sync -- Run sync cycle

erinra sync [OPTIONS]
Option Description
--force Run even if sync is not enabled in config

reembed -- Regenerate embeddings

erinra reembed [OPTIONS]
Option Description
--model <NAME> Override the embedding model

status -- Print database stats

erinra status

models -- List available models

erinra models

Configuration

Erinra reads configuration from ~/.erinra/config.toml. All settings have sensible defaults.

Precedence: CLI args > environment variables > config file > defaults.

Embedding

[embedding]
model = "NomicEmbedTextV15Q"        # Embedding model name (run `erinra models` to list)
Env var Config key
ERINRA_EMBEDDING_MODEL embedding.model

Reranker (experimental)

Note: Reranking is experimental and subject to change. Short or single-keyword queries may receive unexpectedly low scores from the cross-encoder, causing results to be filtered out. Consider setting a low threshold (e.g. -5) or disabling threshold filtering if you encounter missing results.

Optional cross-encoder reranking for improved search precision. Downloads the reranker model (~151 MB) on first use.

[reranker]
enabled = false                      # Enable cross-encoder reranking
model = "JINARerankerV1TurboEn"      # Reranker model name
threshold = 0.0                      # Minimum score to include (can be negative)
Env var Config key
ERINRA_RERANKER_ENABLED reranker.enabled
ERINRA_RERANKER_MODEL reranker.model
ERINRA_RERANKER_THRESHOLD reranker.threshold

Store

[store]
similar_limit = 3                    # Similar memories returned by store/merge
similar_threshold = 0.5              # Minimum cosine similarity for similar results (0.0-1.0)
content_max_length = 500             # Content truncation for search/list responses (chars)
max_content_size = 10240             # Maximum content size on store/update (bytes)
Env var Config key
ERINRA_STORE_SIMILAR_LIMIT store.similar_limit
ERINRA_STORE_SIMILAR_THRESHOLD store.similar_threshold
ERINRA_STORE_CONTENT_MAX_LENGTH store.content_max_length
ERINRA_STORE_MAX_CONTENT_SIZE store.max_content_size

Search

[search]
rrf_k = 60                          # Reciprocal Rank Fusion constant
Env var Config key
ERINRA_SEARCH_RRF_K search.rrf_k

Database

[database]
busy_timeout = 5000                  # SQLite busy timeout (ms)
Env var Config key
ERINRA_DATABASE_BUSY_TIMEOUT database.busy_timeout

Logging

[logging]
log_level = "info"                   # Tracing filter directive
# log_file = "/path/to/erinra.log"  # Optional file logging (must be absolute path)
Env var Config key
ERINRA_LOG_LEVEL logging.log_level
ERINRA_LOG_FILE logging.log_file

Web Dashboard

[web]
port = 9898                          # Dashboard + MCP HTTP port
bind = "127.0.0.1"                   # Bind address
Env var Config key
ERINRA_WEB_PORT web.port
ERINRA_WEB_BIND web.bind

Sync

[sync]
enabled = false                      # Enable background sync in serve mode
sync_dir = "~/.erinra/sync"         # Directory for sync exports/imports
filename = "{hostname}"              # Export filename template
format = "jsonl.gz"                  # Export format: jsonl, jsonl.gz, json, json.gz
export_interval = 900                # Seconds between periodic exports
poll_interval = 0                    # 0 = filesystem watching, >0 = polling interval (seconds)
restore_on_start = false             # Import peer exports before MCP server starts
export_on_exit = false               # Run final export on shutdown
tombstone_retention_days = 90        # Purge tombstones older than this during export (days)
Env var Config key
ERINRA_SYNC_ENABLED sync.enabled
ERINRA_SYNC_DIR sync.sync_dir
ERINRA_SYNC_FILENAME sync.filename
ERINRA_SYNC_FORMAT sync.format
ERINRA_SYNC_EXPORT_INTERVAL sync.export_interval
ERINRA_SYNC_POLL_INTERVAL sync.poll_interval
ERINRA_SYNC_RESTORE_ON_START sync.restore_on_start
ERINRA_SYNC_EXPORT_ON_EXIT sync.export_on_exit
ERINRA_SYNC_TOMBSTONE_RETENTION_DAYS sync.tombstone_retention_days

The filename template supports the following placeholders:

Placeholder Resolves to Example
{hostname} Machine hostname my-desktop
{os} Operating system (linux, macos, windows) linux
{platform} Like {os} but detects WSL as wsl wsl
{distro} Linux distro ID from /etc/os-release (falls back to OS) fedora
{user} Current username alice

Placeholders can be combined (e.g. {hostname}-{os}). The resolved filename is sanitized to remove characters unsafe for filenames. The format extension (e.g. .jsonl.gz) is appended automatically.

Multi-Machine Sync

Erinra supports syncing memories between machines via a shared folder (e.g., Syncthing).

  1. Enable sync in config.toml:

    [sync]
    enabled = true
    sync_dir = "~/.erinra/sync"
    export_on_exit = true
    restore_on_start = true
    
  2. Share the sync directory between machines using Syncthing, Dropbox, or similar.

  3. Each machine exports to a machine-specific file (named by hostname) and imports others' exports. Conflict resolution uses last-write-wins (newer updated_at wins, local wins on tie). Tombstones propagate archives and deletes across machines.

Shared Daemon

When the web dashboard daemon is running, erinra serve automatically connects to it via HTTP relay instead of loading its own embedding model and reranker. This saves ~288 MB of memory per additional concurrent session.

The daemon holds the embedding model, reranker, and database connection. MCP serve instances become thin stdio-to-HTTP bridges. Claude Code sees no difference.

# Start the daemon (first session, or standalone)
erinra serve --web

# Subsequent sessions automatically use relay mode
erinra serve          # detects daemon, skips model loading

If the daemon is unavailable, serve falls back to standalone mode transparently.

Data Storage

All data is stored in ~/.erinra/db.sqlite. The schema uses:

  • memories -- content, type, projects, tags, timestamps, access counts
  • links -- directed relationships between memories (e.g., supersedes, related_to, caused_by)
  • tombstones -- tracks archived/deleted entities for sync convergence
  • FTS5 index -- full-text search on memory content
  • sqlite-vec index -- vector similarity search on embeddings

Embeddings are generated locally using fastembed (ONNX runtime) and are never sent to external services.

Architecture

erinra/
  src/
    main.rs         # CLI entry point (clap): serve, export, import, sync, reembed, status, dash, models
    config.rs       # TOML config with env var overrides and validation
    relay.rs        # Stdio-to-HTTP relay for shared daemon mode
    mcp/            # MCP server, stdio JSON-RPC, tool handlers (11 tools)
    db/             # SQLite schema, queries, migrations, hybrid search + RRF merge
    embedding/      # fastembed wrapper, embedding + reranker model management
    sync/           # JSONL export/import, background sync (filesystem watching + polling)
    web/            # Axum web server, dashboard SPA, daemon coordination, auth, MCP HTTP endpoint
  web/              # SvelteKit 2 + Svelte 5 + Tailwind CSS 4 frontend SPA

Design Principles

  • No server-side reasoning -- the LLM decides what to store, merge, or archive. The server never deduplicates, detects contradictions, or merges automatically.
  • Non-destructive -- store always creates a new memory. archive soft-deletes. Only update overwrites content.
  • Offline-first -- local embeddings, local storage, no network calls. Sync is opt-in via shared filesystem.
  • Concurrency-safe -- SQLite WAL mode supports multiple concurrent readers. Background sync and MCP tool handlers share a mutex-protected database connection.

License

MIT