rmcp-memex
RAG/Memory MCP Server with LanceDB vector storage for AI agents.
Overview
rmcp-memex is an MCP (Model Context Protocol) server providing:
- RAG (Retrieval-Augmented Generation) - document indexing and semantic search
- Hybrid Search - BM25 keyword + semantic vector search (Tantivy-based)
- Vector Memory - semantic storage and retrieval of text chunks
- Namespace Isolation - data is partitioned into separate namespaces
- Security - token-based access control for protected namespaces
- Onion Slice Architecture - hierarchical embeddings (OUTER→MIDDLE→INNER→CORE)
- Preprocessing - automatic noise filtering from conversation exports (~36-40% reduction)
- Exact-Match Deduplication - SHA256-based dedup for overlapping exports
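The list above does not say how the BM25 and vector results are merged. As a rough illustration only, the sketch below uses reciprocal rank fusion (RRF), a common way to combine two ranked lists. RRF itself, the function name `rrf_fuse`, the constant `k=60`, and the sample chunk IDs are all assumptions, not rmcp-memex's confirmed behavior.

```python
from collections import defaultdict


def rrf_fuse(keyword_ranked, vector_ranked, k=60):
    """Fuse two ranked lists of chunk IDs with reciprocal rank fusion.

    Assumption: RRF is a stand-in; the README does not state which
    fusion method rmcp-memex actually uses.
    """
    scores = defaultdict(float)
    for ranking in (keyword_ranked, vector_ranked):
        for rank, chunk_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every ID it contains.
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


bm25_hits = ["c3", "c1", "c7"]    # hypothetical keyword results
vector_hits = ["c1", "c9", "c3"]  # hypothetical semantic results
fused = rrf_fuse(bm25_hits, vector_hits)
```

Chunks that appear in both lists accumulate two contributions, so they rise above chunks found by only one of the two searches.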
Architecture
┌─────────────────────────────────────────────────────────────┐
│ rmcp-memex │
├─────────────────────────────────────────────────────────────┤
│ MCP Server (JSON-RPC over stdio) │
│ ├── handlers/mod.rs - Request routing & validation │
│ ├── security/mod.rs - Namespace access control │
│ └── rag/mod.rs - RAG pipeline │
├─────────────────────────────────────────────────────────────┤
│ Storage Layer │
│ ├── LanceDB - Vector embeddings │
│ ├── Tantivy - BM25 keyword index │
│ └── moka - In-memory cache │
├─────────────────────────────────────────────────────────────┤
│ Embeddings (External Providers) │
│ ├── Ollama - Local models (recommended) │
│ ├── MLX Bridge - Apple Silicon acceleration │
│ └── OpenAI-compatible - Any compatible endpoint │
└─────────────────────────────────────────────────────────────┘
Features
RAG Tools
| Tool | Description |
|---|---|
| rag_index | Index document from file |
| rag_index_text | Index raw text |
| rag_search | Search documents semantically |
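Because the server speaks JSON-RPC over stdio, a client invokes these tools with an MCP `tools/call` request. The sketch below builds one such request as a single JSON line. The envelope follows the MCP specification; the argument names `query` and `limit` are hypothetical, since the tool schemas are not listed here.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build one MCP `tools/call` request as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # The stdio transport carries one JSON message per line.
    return json.dumps(message) + "\n"


# Argument names ("query", "limit") are hypothetical, not the documented schema.
line = build_tool_call(1, "rag_search", {"query": "vector storage", "limit": 5})
```

A client would write `line` to the server's stdin and read the matching response (same `id`) from its stdout.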
Memory Tools
| Tool | Description |
|---|---|
| memory_upsert | Add/update chunk in namespace |
| memory_get | Get chunk by ID |
| memory_search | Search semantically in namespace |
| memory_delete | Delete chunk |
| memory_purge_namespace | Delete all chunks in namespace |
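To make the namespace semantics of these tools concrete, here is a toy in-memory model of upsert, get, delete, and purge. It is only a sketch: the class `NamespaceStore` and its method signatures are hypothetical, and the real server persists chunks with embeddings in LanceDB rather than in a dictionary.

```python
class NamespaceStore:
    """Toy in-memory stand-in for namespace-isolated chunk storage."""

    def __init__(self):
        self._data = {}  # namespace -> {chunk_id: text}

    def upsert(self, namespace, chunk_id, text):
        # Insert a new chunk or overwrite an existing one with the same ID.
        self._data.setdefault(namespace, {})[chunk_id] = text

    def get(self, namespace, chunk_id):
        # Returns None when the namespace or the chunk does not exist.
        return self._data.get(namespace, {}).get(chunk_id)

    def delete(self, namespace, chunk_id):
        # True if a chunk was actually removed.
        return self._data.get(namespace, {}).pop(chunk_id, None) is not None

    def purge_namespace(self, namespace):
        # Drop every chunk in the namespace and report how many were removed.
        return len(self._data.pop(namespace, {}))


store = NamespaceStore()
store.upsert("notes", "a", "first draft")
store.upsert("notes", "a", "second draft")  # same ID: update, not duplicate
store.upsert("work", "a", "unrelated")      # same ID, different namespace
```

The same chunk ID can exist independently in two namespaces, and purging one namespace leaves the other untouched.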
Security Tools
| Tool | Description |
|---|---|
| namespace_create_token | Create access token for namespace |
| namespace_revoke_token | Revoke token (namespace becomes public) |
| namespace_list_protected | List protected namespaces |
| namespace_security_status | Security system status |
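The access rule implied by the table (a namespace is protected while it has a token and becomes public again once the token is revoked) can be sketched as follows. The class `NamespaceTokens`, its methods, and the choice to keep raw tokens in memory are assumptions made for illustration; the actual token format and storage are not described here.

```python
import hmac
import secrets


class NamespaceTokens:
    """Toy model: a namespace is protected only while it has a token."""

    def __init__(self):
        self._tokens = {}  # namespace -> token (kept raw here for simplicity)

    def create_token(self, namespace):
        token = secrets.token_hex(16)
        self._tokens[namespace] = token
        return token

    def revoke_token(self, namespace):
        # After revocation the namespace is public again.
        return self._tokens.pop(namespace, None) is not None

    def list_protected(self):
        return sorted(self._tokens)

    def is_allowed(self, namespace, token=None):
        expected = self._tokens.get(namespace)
        if expected is None:
            return True  # no token registered: public namespace
        # Constant-time comparison avoids leaking token contents via timing.
        return token is not None and hmac.compare_digest(expected, token)


tokens = NamespaceTokens()
secret = tokens.create_token("private")
```

With this model, `tokens.is_allowed("private", secret)` succeeds, a missing or wrong token is rejected, and any namespace without a token is open to all callers.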
Quick Start
Installation
Quick install (recommended):
From source:
Running
# Default mode (all features)
# Memory-only mode (no filesystem access)
# With security enabled
Configuration (TOML)
# ~/.rmcp-servers/config/rmcp-memex.toml
= "full"
= "~/.rmcp-servers/rmcp-memex/lancedb"
= 4096
= "info"
# Whitelist of allowed paths
= [
"~",
"/Volumes/ExternalDrive/data"
]
# Security
= true
= "~/.rmcp-servers/rmcp-memex/tokens.json"
Documentation
- 01_security.md - Security system (namespace tokens)
- 02_configuration.md - Configuration and CLI options
Onion Slice Architecture
Instead of traditional flat chunking, rmcp-memex offers hierarchical "onion slices":
┌─────────────────────────────────────────┐
│ OUTER (~100 chars) │ ← Minimum context, maximum navigation
│ Keywords + ultra-compression │
├─────────────────────────────────────────┤
│ MIDDLE (~300 chars) │ ← Key sentences + context
├─────────────────────────────────────────┤
│ INNER (~600 chars) │ ← Expanded content
├─────────────────────────────────────────┤
│ CORE (full text) │ ← Complete document
└─────────────────────────────────────────┘
Philosophy: "Minimum info → Maximum navigation paths"
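The layer sizes in the diagram can be illustrated with a small sketch that derives OUTER, MIDDLE, and INNER from the full text. Note the substitution: plain truncation at a word boundary stands in for the keyword extraction and compression that the diagram describes, so this shows only the size hierarchy. The names `LAYER_LIMITS`, `truncate`, and `onion_slices` are hypothetical.

```python
# Approximate character budgets taken from the diagram above.
LAYER_LIMITS = [("OUTER", 100), ("MIDDLE", 300), ("INNER", 600)]


def truncate(text, limit):
    """Cut text to at most `limit` characters, preferring a word boundary."""
    if len(text) <= limit:
        return text
    cut = text.rfind(" ", 0, limit + 1)
    # Fall back to a hard cut when there is no usable space.
    return text[: cut if cut > 0 else limit].rstrip()


def onion_slices(text):
    """Return {layer: text}, from the smallest layer to the full document.

    Assumption: truncation replaces the real keyword/compression step.
    """
    slices = {name: truncate(text, limit) for name, limit in LAYER_LIMITS}
    slices["CORE"] = text  # CORE always holds the complete document
    return slices


document = "word " * 200  # 1000-character placeholder document
layers = onion_slices(document)
```

A search can then match against the small OUTER layer first and expand toward CORE only for the hits that matter.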
CLI Commands
# Index with onion slicing (default)
# Index with progress bar and ETA
# Index with flat chunking (backward compatible)
# Search in namespace
# Search only in specific layer
# Drill down in hierarchy (expand children)
# Get chunk by ID
# RAG search (cross-namespace)
# List namespaces with stats
# Export namespace to JSON
Preprocessing (Noise Filtering)
Automatically removes noise from conversation exports, shrinking them by roughly 36-40%:
- MCP tool artifacts (<function_calls>, <invoke>, etc.)
- CLI output (git status, cargo build, npm install)
- Metadata (UUIDs, timestamps → placeholders)
- Empty/boilerplate content
# Index with preprocessing
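The filters listed above can be approximated with a few regular expressions. This is a minimal sketch, not the project's preprocessing module: the patterns, the placeholder strings `<UUID>` and `<TIMESTAMP>`, and the function name `preprocess` are assumptions, and CLI-output filtering is left out.

```python
import re

TAG = "function_calls"  # tool-call wrapper tag found in conversation exports
TOOL_BLOCK_RE = re.compile(rf"<{TAG}>.*?</{TAG}>", re.DOTALL)
UUID_RE = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)
TIMESTAMP_RE = re.compile(
    r"\b\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)


def preprocess(text):
    """Strip tool-call blocks, replace IDs and timestamps, drop empty lines."""
    text = TOOL_BLOCK_RE.sub("", text)
    text = UUID_RE.sub("<UUID>", text)            # placeholder name is hypothetical
    text = TIMESTAMP_RE.sub("<TIMESTAMP>", text)  # placeholder name is hypothetical
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(line for line in lines if line.strip())


raw = (
    "Session 123e4567-e89b-12d3-a456-426614174000 at 2025-01-15T10:30:00Z\n\n"
    f"<{TAG}>tool noise</{TAG}>\n"
    "Real content here."
)
cleaned = preprocess(raw)
```

Replacing volatile values with fixed placeholders also helps later deduplication, since two otherwise identical chunks no longer differ only by an ID or a timestamp.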
Exact-Match Deduplication
SHA256-based dedup for overlapping exports (e.g., quarterly exports that each contain the last 6 months of data):
# Dedup enabled (default)
# Disable dedup
Output with statistics:
Indexing complete:
New chunks: 234
Files indexed: 67
Skipped (duplicate): 33
Deduplication: enabled
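Exact-match deduplication of this kind can be sketched with Python's `hashlib`: hash each chunk's text with SHA256 and skip any chunk whose digest was already seen. The function names and the in-memory `set` are assumptions for illustration; the counters loosely mirror the "New chunks" and "Skipped (duplicate)" lines in the output above.

```python
import hashlib


def content_hash(text):
    """SHA256 hex digest of the chunk text, used as its identity."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def dedup_chunks(chunks, seen_hashes=None):
    """Return (new_chunks, skipped_count), updating seen_hashes in place."""
    seen = set() if seen_hashes is None else seen_hashes
    new_chunks, skipped = [], 0
    for chunk in chunks:
        digest = content_hash(chunk)
        if digest in seen:
            skipped += 1  # exact duplicate of something already indexed
            continue
        seen.add(digest)
        new_chunks.append(chunk)
    return new_chunks, skipped


seen = set()
first_new, first_skipped = dedup_chunks(["jan", "feb", "mar"], seen)
second_new, second_skipped = dedup_chunks(["feb", "mar", "apr"], seen)
```

Because the match is exact, a chunk that differs by a single character hashes differently and is indexed as new.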
Code Structure
rmcp-memex/
├── src/
│ ├── lib.rs # Public API & ServerConfig
│ ├── bin/
│ │ └── rmcp-memex.rs # CLI binary (serve, index, search, get, expand, etc.)
│ ├── handlers/
│ │ └── mod.rs # MCP request handlers
│ ├── security/
│ │ └── mod.rs # Namespace access control
│ ├── rag/
│ │ └── mod.rs # RAG pipeline + OnionSlice architecture
│ ├── preprocessing/
│ │ └── mod.rs # Noise filtering for conversation exports
│ ├── storage/
│ │ └── mod.rs # LanceDB + Tantivy (schema v3 with content_hash)
│ ├── embeddings/
│ │ └── mod.rs # MLX/FastEmbed bridge
│ └── tui/
│ └── mod.rs # Configuration wizard
└── Cargo.toml
Claude/MCP Integration
Add to ~/.claude.json:
Created by M&K (c) 2025 The LibraxisAI Team
Co-Authored-By: Maciej & Klaudiusz