# rmcp-memex
RAG/Memory MCP Server with LanceDB vector storage for AI agents.
## Overview
`rmcp-memex` is an MCP (Model Context Protocol) server providing:
- **RAG (Retrieval-Augmented Generation)** - document indexing and semantic search
- **Hybrid Search** - BM25 keyword + semantic vector search (Tantivy-based)
- **Vector Memory** - semantic storage and retrieval of text chunks
- **Namespace Isolation** - data isolation in namespaces
- **Security** - token-based access control for protected namespaces
- **Onion Slice Architecture** - hierarchical embeddings (OUTER→MIDDLE→INNER→CORE)
- **Preprocessing** - automatic noise filtering from conversation exports (~36-40% reduction)
- **Exact-Match Deduplication** - SHA256-based dedup for overlapping exports
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ rmcp-memex │
├─────────────────────────────────────────────────────────────┤
│ MCP Server (JSON-RPC over stdio) │
│ ├── handlers/mod.rs - Request routing & validation │
│ ├── security/mod.rs - Namespace access control │
│ └── rag/mod.rs - RAG pipeline │
├─────────────────────────────────────────────────────────────┤
│ Storage Layer │
│ ├── LanceDB - Vector embeddings │
│ ├── Tantivy - BM25 keyword index │
│ └── moka - In-memory cache │
├─────────────────────────────────────────────────────────────┤
│ Embeddings (External Providers) │
│ ├── Ollama - Local models (recommended) │
│ ├── MLX Bridge - Apple Silicon acceleration │
│ └── OpenAI-compatible - Any compatible endpoint │
└─────────────────────────────────────────────────────────────┘
```
## Features
### RAG Tools
| `rag_index` | Index document from file |
| `rag_index_text` | Index raw text |
| `rag_search` | Search documents semantically |
### Memory Tools
| `memory_upsert` | Add/update chunk in namespace |
| `memory_get` | Get chunk by ID |
| `memory_search` | Search semantically in namespace |
| `memory_delete` | Delete chunk |
| `memory_purge_namespace` | Delete all chunks in namespace |
### Security Tools
| `namespace_create_token` | Create access token for namespace |
| `namespace_revoke_token` | Revoke token (namespace becomes public) |
| `namespace_list_protected` | List protected namespaces |
| `namespace_security_status` | Security system status |
## Quick Start
### Installation
**Quick install (recommended):**
```bash
**From source:**
```bash
cargo install --path .
```
### Running
```bash
# Default mode (all features)
rmcp-memex serve
# Memory-only mode (no filesystem access)
rmcp-memex serve --mode memory
# With security enabled
rmcp-memex serve --security-enabled
```
### Configuration (TOML)
```toml
# ~/.rmcp-servers/config/rmcp-memex.toml
mode = "full"
db_path = "~/.rmcp-servers/rmcp-memex/lancedb"
cache_mb = 4096
log_level = "info"
# Whitelist of allowed paths
allowed_paths = [
"~",
"/Volumes/ExternalDrive/data"
]
# Security
security_enabled = true
token_store_path = "~/.rmcp-servers/rmcp-memex/tokens.json"
```
## Documentation
- [01_security.md](./01_security.md) - Security system (namespace tokens)
- [02_configuration.md](./02_configuration.md) - Configuration and CLI options
## Onion Slice Architecture
Instead of traditional flat chunking, rmcp-memex offers hierarchical "onion slices":
```
┌─────────────────────────────────────────┐
│ OUTER (~100 chars) │ ← Minimum context, maximum navigation
│ Keywords + ultra-compression │
├─────────────────────────────────────────┤
│ MIDDLE (~300 chars) │ ← Key sentences + context
├─────────────────────────────────────────┤
│ INNER (~600 chars) │ ← Expanded content
├─────────────────────────────────────────┤
│ CORE (full text) │ ← Complete document
└─────────────────────────────────────────┘
```
**Philosophy:** "Minimum info → Maximum navigation paths"
### CLI Commands
```bash
# Index with onion slicing (default)
rmcp-memex index -n memories /path/to/data/ --slice-mode onion
# Index with progress bar and ETA
rmcp-memex index -n memories /path/to/data/ --progress
# Index with flat chunking (backward compatible)
rmcp-memex index -n memories /path/to/data/ --slice-mode flat
# Search in namespace
rmcp-memex search -n memories -q "best moments" --limit 10
# Search only in specific layer
rmcp-memex search -n memories -q "query" --layer outer
# Drill down in hierarchy (expand children)
rmcp-memex expand -n memories -i "slice_id_here"
# Get chunk by ID
rmcp-memex get -n memories -i "chunk_abc123"
# RAG search (cross-namespace)
rmcp-memex rag-search -q "search term" --limit 5
# List namespaces with stats
rmcp-memex namespaces --stats
# Export namespace to JSON
rmcp-memex export -n memories -o backup.json --include-embeddings
```
### Preprocessing (Noise Filtering)
Automatic removal of ~36-40% noise from conversation exports:
- MCP tool artifacts (`<function_calls>`, `<invoke>`, etc.)
- CLI output (git status, cargo build, npm install)
- Metadata (UUIDs, timestamps → placeholders)
- Empty/boilerplate content
```bash
# Index with preprocessing
rmcp-memex index -n memories /path/to/export.json --preprocess
```
### Exact-Match Deduplication
SHA256-based dedup for overlapping exports (e.g., quarterly exports containing 6 months of data):
```bash
# Dedup enabled (default)
rmcp-memex index -n memories /path/to/data/
# Disable dedup
rmcp-memex index -n memories /path/to/data/ --no-dedup
```
**Output with statistics:**
```
Indexing complete:
New chunks: 234
Files indexed: 67
Skipped (duplicate): 33
Deduplication: enabled
```
## Code Structure
```
rmcp-memex/
├── src/
│ ├── lib.rs # Public API & ServerConfig
│ ├── bin/
│ │ └── rmcp-memex.rs # CLI binary (serve, index, search, get, expand, etc.)
│ ├── handlers/
│ │ └── mod.rs # MCP request handlers
│ ├── security/
│ │ └── mod.rs # Namespace access control
│ ├── rag/
│ │ └── mod.rs # RAG pipeline + OnionSlice architecture
│ ├── preprocessing/
│ │ └── mod.rs # Noise filtering for conversation exports
│ ├── storage/
│ │ └── mod.rs # LanceDB + Tantivy (schema v3 with content_hash)
│ ├── embeddings/
│ │ └── mod.rs # MLX/FastEmbed bridge
│ └── tui/
│ └── mod.rs # Configuration wizard
└── Cargo.toml
```
## Claude/MCP Integration
Add to `~/.claude.json`:
```json
{
"mcpServers": {
"rmcp-memex": {
"command": "rmcp-memex",
"args": ["serve", "--security-enabled"]
}
}
}
```
---
Created by M&K (c)2025 The LibraxisAI Team
Co-Authored-By: [Maciej](void@div0.space) & [Klaudiusz](the1st@whoai.am)