Please check the build logs for more information.
See Builds for ideas on how to fix a failed build, or Metadata for how to configure docs.rs builds.
If you believe this is docs.rs' fault, open an issue.
Erinra
A memory MCP server for LLM coding assistants. Single Rust binary, local SQLite storage, local embeddings, zero external dependencies.
Erinra stores, indexes, and retrieves memories via the Model Context Protocol (MCP) over stdio transport. The server handles storage and retrieval; the calling LLM makes all semantic decisions (deduplication, merging, corrections).
Features
- Hybrid search -- vector similarity (sqlite-vec) + full-text search (FTS5), merged via Reciprocal Rank Fusion
- Cross-encoder reranking (experimental) -- optional second-stage reranking with JINA Reranker for improved search precision
- Local embeddings -- fastembed with Nomic Embed Text v1.5, no external API calls
- Non-destructive --
storealways creates,archiveinstead of delete,updateis the only destructive operation - Fat responses --
storereturns top-3 similar memories so the LLM can react without follow-up calls - Web dashboard -- browser-based UI for browsing, searching, and inspecting memories
- Shared daemon -- multiple MCP sessions share one embedding model and database via a background daemon, saving ~288 MB per additional session
- Multi-machine sync -- JSONL export/import with background sync via filesystem watching or polling
- Single file -- all data in one SQLite database with WAL mode for concurrent access
Quick Start
Install from crates.io
Install from GitHub
Or build from source
Requires Node.js (for building the web dashboard frontend).
Add to Claude Code
The first run downloads the Nomic embedding model (~137 MB) to ~/.erinra/models/. Subsequent starts are fast.
Verify it works
Open the web dashboard
MCP Tools
| Tool | Description |
|---|---|
| store | Save a memory. Returns similar existing memories for dedup detection. |
| search | Find memories by semantic similarity and keyword matching. |
| get | Fetch full details of specific memories by ID. |
| list | Browse and filter memories without a search query. |
| discover | View taxonomy (projects, types, tags) with usage counts and stats. |
| context | Batched session-start search: run multiple queries in one call with a content budget. |
| update | Modify content or metadata. Re-embeds automatically on content change. |
| archive | Soft-delete a memory. Excluded from search/list by default. |
| merge | Combine multiple memories into one. Archives sources with supersedes links. |
| link | Create a directed relationship between memories. |
| unlink | Remove a relationship. |
CLI Commands
Global options
erinra [OPTIONS] <COMMAND>
Options:
--data-dir <PATH> Data directory [default: ~/.erinra] [env: ERINRA_DATA_DIR]
-h, --help Print help
-V, --version Print version
serve -- Start MCP server
erinra serve [OPTIONS]
Starts the MCP server on stdio transport. When a web dashboard daemon is running, serve automatically uses relay mode -- bridging stdio to the daemon's HTTP endpoint instead of loading models locally. This saves ~288 MB of memory per additional session. Falls back to standalone mode if the daemon is unavailable.
| Option | Description |
|---|---|
--log-level <LEVEL> |
Override log level (e.g. debug, info, erinra=debug) |
--log-file <PATH> |
Override log file path |
--busy-timeout <MS> |
Override SQLite busy timeout in milliseconds |
--embedding-model <NAME> |
Override embedding model name |
--reranker-model <NAME> |
Override reranker model name (also enables reranking) |
--web |
Also start the web dashboard (background daemon) |
--port <PORT> |
Override web server port (requires --web) |
--bind <ADDR> |
Override web server bind address (requires --web) |
dash -- Open web dashboard
erinra dash [OPTIONS]
Starts (or joins) the background daemon and opens the dashboard in a browser. The daemon process is ref-counted -- it stays alive while any serve --web or dash process is running, and shuts down automatically when all clients exit.
| Option | Description |
|---|---|
--port <PORT> |
Override web server port |
--bind <ADDR> |
Override web server bind address |
--no-open |
Don't open the browser automatically |
--open-only |
Open the browser and exit immediately (requires a running daemon) |
export -- Export memories
erinra export <OUTPUT> [OPTIONS]
| Option | Description |
|---|---|
--gzip |
Compress output with gzip |
--since <TIMESTAMP> |
Only export memories created/updated after this timestamp |
import -- Import memories
erinra import <INPUT>
Imports from JSONL file. Auto-detects gzip compression.
sync -- Run sync cycle
erinra sync [OPTIONS]
| Option | Description |
|---|---|
--force |
Run even if sync is not enabled in config |
reembed -- Regenerate embeddings
erinra reembed [OPTIONS]
| Option | Description |
|---|---|
--model <NAME> |
Override the embedding model |
status -- Print database stats
erinra status
models -- List available models
erinra models
Configuration
Erinra reads configuration from ~/.erinra/config.toml. All settings have sensible defaults.
Precedence: CLI args > environment variables > config file > defaults.
Embedding
[]
= "NomicEmbedTextV15Q" # Embedding model name (run `erinra models` to list)
| Env var | Config key |
|---|---|
ERINRA_EMBEDDING_MODEL |
embedding.model |
Reranker (experimental)
Note: Reranking is experimental and subject to change. Short or single-keyword queries may receive unexpectedly low scores from the cross-encoder, causing results to be filtered out. Consider setting a low threshold (e.g.
-5) or disabling threshold filtering if you encounter missing results.
Optional cross-encoder reranking for improved search precision. Downloads the reranker model (~151 MB) on first use.
[]
= false # Enable cross-encoder reranking
= "JINARerankerV1TurboEn" # Reranker model name
= 0.0 # Minimum score to include (can be negative)
| Env var | Config key |
|---|---|
ERINRA_RERANKER_ENABLED |
reranker.enabled |
ERINRA_RERANKER_MODEL |
reranker.model |
ERINRA_RERANKER_THRESHOLD |
reranker.threshold |
Store
[]
= 3 # Similar memories returned by store/merge
= 0.5 # Minimum cosine similarity for similar results (0.0-1.0)
= 500 # Content truncation for search/list responses (chars)
= 10240 # Maximum content size on store/update (bytes)
| Env var | Config key |
|---|---|
ERINRA_STORE_SIMILAR_LIMIT |
store.similar_limit |
ERINRA_STORE_SIMILAR_THRESHOLD |
store.similar_threshold |
ERINRA_STORE_CONTENT_MAX_LENGTH |
store.content_max_length |
ERINRA_STORE_MAX_CONTENT_SIZE |
store.max_content_size |
Search
[]
= 60 # Reciprocal Rank Fusion constant
| Env var | Config key |
|---|---|
ERINRA_SEARCH_RRF_K |
search.rrf_k |
Database
[]
= 5000 # SQLite busy timeout (ms)
| Env var | Config key |
|---|---|
ERINRA_DATABASE_BUSY_TIMEOUT |
database.busy_timeout |
Logging
[]
= "info" # Tracing filter directive
# log_file = "/path/to/erinra.log" # Optional file logging (must be absolute path)
| Env var | Config key |
|---|---|
ERINRA_LOG_LEVEL |
logging.log_level |
ERINRA_LOG_FILE |
logging.log_file |
Web Dashboard
[]
= 9898 # Dashboard + MCP HTTP port
= "127.0.0.1" # Bind address
| Env var | Config key |
|---|---|
ERINRA_WEB_PORT |
web.port |
ERINRA_WEB_BIND |
web.bind |
Sync
[]
= false # Enable background sync in serve mode
= "~/.erinra/sync" # Directory for sync exports/imports
= "{hostname}" # Export filename template
= "jsonl.gz" # Export format: jsonl, jsonl.gz, json, json.gz
= 900 # Seconds between periodic exports
= 0 # 0 = filesystem watching, >0 = polling interval (seconds)
= false # Import peer exports before MCP server starts
= false # Run final export on shutdown
= 90 # Purge tombstones older than this during export (days)
| Env var | Config key |
|---|---|
ERINRA_SYNC_ENABLED |
sync.enabled |
ERINRA_SYNC_DIR |
sync.sync_dir |
ERINRA_SYNC_FILENAME |
sync.filename |
ERINRA_SYNC_FORMAT |
sync.format |
ERINRA_SYNC_EXPORT_INTERVAL |
sync.export_interval |
ERINRA_SYNC_POLL_INTERVAL |
sync.poll_interval |
ERINRA_SYNC_RESTORE_ON_START |
sync.restore_on_start |
ERINRA_SYNC_EXPORT_ON_EXIT |
sync.export_on_exit |
ERINRA_SYNC_TOMBSTONE_RETENTION_DAYS |
sync.tombstone_retention_days |
The filename template supports the following placeholders:
| Placeholder | Resolves to | Example |
|---|---|---|
{hostname} |
Machine hostname | my-desktop |
{os} |
Operating system (linux, macos, windows) |
linux |
{platform} |
Like {os} but detects WSL as wsl |
wsl |
{distro} |
Linux distro ID from /etc/os-release (falls back to OS) |
fedora |
{user} |
Current username | alice |
Placeholders can be combined (e.g. {hostname}-{os}). The resolved filename is sanitized to remove characters unsafe for filenames. The format extension (e.g. .jsonl.gz) is appended automatically.
Multi-Machine Sync
Erinra supports syncing memories between machines via a shared folder (e.g., Syncthing).
-
Enable sync in
config.toml:[] = true = "~/.erinra/sync" = true = true -
Share the sync directory between machines using Syncthing, Dropbox, or similar.
-
Each machine exports to a machine-specific file (named by hostname) and imports others' exports. Conflict resolution uses last-write-wins (newer
updated_atwins, local wins on tie). Tombstones propagate archives and deletes across machines.
Shared Daemon
When the web dashboard daemon is running, erinra serve automatically connects to it via HTTP relay instead of loading its own embedding model and reranker. This saves ~288 MB of memory per additional concurrent session.
The daemon holds the embedding model, reranker, and database connection. MCP serve instances become thin stdio-to-HTTP bridges. Claude Code sees no difference.
# Start the daemon (first session, or standalone)
# Subsequent sessions automatically use relay mode
If the daemon is unavailable, serve falls back to standalone mode transparently.
Data Storage
All data is stored in ~/.erinra/db.sqlite. The schema uses:
- memories -- content, type, projects, tags, timestamps, access counts
- links -- directed relationships between memories (e.g.,
supersedes,related_to,caused_by) - tombstones -- tracks archived/deleted entities for sync convergence
- FTS5 index -- full-text search on memory content
- sqlite-vec index -- vector similarity search on embeddings
Embeddings are generated locally using fastembed (ONNX runtime) and are never sent to external services.
Architecture
erinra/
src/
main.rs # CLI entry point (clap): serve, export, import, sync, reembed, status, dash, models
config.rs # TOML config with env var overrides and validation
relay.rs # Stdio-to-HTTP relay for shared daemon mode
mcp/ # MCP server, stdio JSON-RPC, tool handlers (11 tools)
db/ # SQLite schema, queries, migrations, hybrid search + RRF merge
embedding/ # fastembed wrapper, embedding + reranker model management
sync/ # JSONL export/import, background sync (filesystem watching + polling)
web/ # Axum web server, dashboard SPA, daemon coordination, auth, MCP HTTP endpoint
web/ # SvelteKit 2 + Svelte 5 + Tailwind CSS 4 frontend SPA
Design Principles
- No server-side reasoning -- the LLM decides what to store, merge, or archive. The server never deduplicates, detects contradictions, or merges automatically.
- Non-destructive --
storealways creates a new memory.archivesoft-deletes. Onlyupdateoverwrites content. - Offline-first -- local embeddings, local storage, no network calls. Sync is opt-in via shared filesystem.
- Concurrency-safe -- SQLite WAL mode supports multiple concurrent readers. Background sync and MCP tool handlers share a mutex-protected database connection.
License
MIT