MemRL - Memory-Augmented Reinforcement Learning for Claude Code
MemRL gives Claude Code a persistent memory that learns from experience. Instead of starting fresh each session, Claude can recall past solutions, learn what works, and get smarter over time.
Why MemRL?
The Problem: Claude Code forgets everything between sessions. You solve the same problems repeatedly, and Claude can't learn from past successes or failures.
The Solution: MemRL captures coding sessions as "episodes", indexes them for semantic search, and uses reinforcement learning to surface the most valuable memories when relevant.
Without MemRL:                      With MemRL:
┌─────────────┐                     ┌─────────────┐
│  Session 1  │ ──forgotten──>      │  Session 1  │ ──captured──┐
└─────────────┘                     └─────────────┘             │
┌─────────────┐                     ┌─────────────┐             ▼
│  Session 2  │ ──forgotten──>      │  Session 2  │ ◄──recalls──┤
└─────────────┘                     └─────────────┘             │
┌─────────────┐                     ┌─────────────┐             │
│  Session 3  │ ──forgotten──>      │  Session 3  │ ◄──recalls──┘
└─────────────┘                     └─────────────┘
       │                                   │
       ▼                                   ▼
  No learning                   Continuous improvement
How It Works
The Learning Loop
┌────────────────────────────────────────────────────────────────┐
│  1. START TASK                                                 │
│     User: "Fix the login redirect bug"                         │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│  2. RETRIEVE MEMORIES                                          │
│     Claude searches: "login redirect bug"                      │
│     Finds: "Fixed similar issue by sanitizing return URLs"     │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│  3. SOLVE FASTER                                               │
│     Claude uses past experience to solve the problem           │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│  4. CAPTURE SESSION                                            │
│     Claude saves: what was done, what worked, what failed      │
└────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌────────────────────────────────────────────────────────────────┐
│  5. LEARN FROM FEEDBACK                                        │
│     User: "That memory was helpful!"                           │
│     → Episode utility increases                                │
│     → Similar episodes get boosted (Bellman propagation)       │
│     → Unhelpful memories fade over time                        │
└────────────────────────────────────────────────────────────────┘
What Makes It "Learn"
| Mechanism | What It Does |
|---|---|
| Feedback | Helpful episodes gain utility score |
| Bellman Propagation | Value spreads to semantically similar episodes |
| Temporal Credit | Episodes before successes get credit |
| Decay | Unused memories fade (1% per day) |
| Retrieval Ranking | High-utility episodes surface first |
Over time, frequently helpful knowledge rises to the top, while stale or unhelpful memories fade away.
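The mechanisms in the table can be illustrated with a small sketch. This is not MemRL's implementation: the Episode class, the function names, the exact update rules, and the similarity × utility ranking score are all assumptions made for illustration. Only the default values 0.1 (learning rate) and 0.01 (daily decay) come from this README.

```python
from dataclasses import dataclass

# Defaults taken from the parameter table in this README; the update rules
# below are illustrative assumptions, not MemRL's actual code.
LEARNING_RATE = 0.1
DECAY_RATE = 0.01


@dataclass
class Episode:
    episode_id: str
    utility: float  # assumed to lie in [0, 1]


def apply_feedback(episode: Episode, helpful: bool, alpha: float = LEARNING_RATE) -> None:
    """Move utility a fraction alpha toward 1.0 (helpful) or 0.0 (not helpful)."""
    target = 1.0 if helpful else 0.0
    episode.utility += alpha * (target - episode.utility)


def apply_decay(episode: Episode, idle_days: int, rate: float = DECAY_RATE) -> None:
    """Shrink utility by `rate` for each day the episode went unused."""
    episode.utility *= (1.0 - rate) ** idle_days


def rank(candidates: list[tuple[Episode, float]]) -> list[Episode]:
    """Order (episode, similarity) pairs by similarity * utility, best first."""
    ordered = sorted(candidates, key=lambda pair: pair[1] * pair[0].utility, reverse=True)
    return [episode for episode, _ in ordered]
```

Under these assumed rules, one "helpful" vote moves an episode at utility 0.5 to 0.55, and a day without use then scales it by 0.99. A highly similar episode with low utility can rank below a less similar one that has repeatedly proven useful.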
Installation
Build from Source
# Clone and build
# Two binaries are created:
# - target/release/memrl (CLI tool)
# - target/release/memrl-mcp (MCP server for Claude Code)
Install from crates.io
First Run - Model Download
On first use, MemRL downloads the BGE-Small embedding model (~128MB) for semantic search. This happens automatically and only once:
# Initialize and trigger model download
# Output:
# 🔄 Loading embedding model (this may download the model on first run)...
# ✅ Embedding model loaded
The model is cached globally at ~/.memrl/models/ and shared across all projects.
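To make "semantic search" concrete, here is a minimal sketch of ranking stored episodes by vector similarity. Toy 3-dimensional vectors stand in for real embedding output, and cosine similarity is an assumption: this README does not state which distance metric MemRL configures. The function names and episode IDs are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Return the k (episode ID, similarity) pairs closest to the query."""
    scored = [(episode_id, cosine_similarity(query, vec)) for episode_id, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Toy vectors; a real index would hold embedding-model output (hypothetical IDs).
toy_index = {
    "redirect-fix": [0.9, 0.1, 0.0],
    "css-tweak": [0.0, 0.2, 0.9],
    "auth-sanitize": [0.7, 0.3, 0.1],
}
results = top_k([1.0, 0.0, 0.0], toy_index)
```

Here `results` lists "redirect-fix" first and "auth-sanitize" second, because their vectors point in nearly the same direction as the query.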
Setup with Claude Code
1. Add the MCP Server
The --scope user flag makes it available across all your projects.
2. Restart Claude Code
Exit and restart Claude Code to load the new MCP server.
3. Verify
Run /mcp in Claude Code. You should see memrl with 7 tools.
MCP Tools
Once connected, Claude has access to these tools:
| Tool | Description | When to Use |
|---|---|---|
| memrl_retrieve | Search memories by query, list all, or show details | Start of session - always check first |
| memrl_capture | Save session as episode (auto-propagates utility) | End of task - capture successes proactively |
| memrl_feedback | Mark episodes as helpful/not helpful | After using retrieved memories |
| memrl_status | Check memory health for current project | Understand memory state |
| memrl_stats | View overall memory statistics | Analytics and monitoring |
| memrl_propagate | Spread value to similar episodes | Periodic maintenance |
| memrl_review | Consolidate and cleanup memories | After related task series |
Key Lifecycle Behaviors
Start of session: Claude should automatically check for relevant memories using memrl_retrieve before starting non-trivial tasks.
End of task: Claude should capture successful sessions with memrl_capture proactively, without waiting to be asked. The capture tool runs utility propagation automatically.
Cross-project insights: Use memrl_capture(project: "other-project") to tag memories for a different project.
memrl_retrieve Modes
Search: memrl_retrieve(query: "authentication bug")
List all: memrl_retrieve(all: true)
Show details: memrl_retrieve(query: "abc12345") # episode ID
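On the wire, an MCP client issues tool calls as JSON-RPC 2.0 requests with the method tools/call. The sketch below builds such requests for the three modes above. The argument names (query, all) are taken from this README; treating all as a JSON boolean and the tool_call helper itself are assumptions, and Claude Code normally constructs these messages for you.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests carry a unique id


def tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# The three memrl_retrieve modes (argument shapes assumed from the README).
search = tool_call("memrl_retrieve", {"query": "authentication bug"})
list_all = tool_call("memrl_retrieve", {"all": True})
details = tool_call("memrl_retrieve", {"query": "abc12345"})  # episode ID as the query
```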
Example Conversation
User: Fix the login redirect vulnerability
Claude: Let me check if we've solved similar problems before...
[Calls memrl_retrieve("login redirect vulnerability")]
Found 2 relevant past experiences:
1. "Fixed open redirect by validating return URLs against whitelist"
- 78% similarity, 85% utility
2. "Auth redirect issue - added URL sanitization"
- 65% similarity, 72% utility
Based on past experience, I'll implement URL validation...
[Works on the fix]
Done! Let me save this session.
[Calls memrl_capture(summary: "Fixed login redirect...", outcome: "success")]
User: That retrieved memory was really helpful!
Claude: [Calls memrl_feedback(helpful: true, episodes: ["abc123"])]
Marked as helpful - this will improve future retrievals!
CLI Commands
# Initialize MemRL
# Capture an episode manually
# Index episodes for semantic search
# Search memories
# Provide feedback
# Run utility propagation
# Prune old/low-value episodes
# View statistics
Data Storage
MemRL stores everything locally in ~/.memrl/ (shared across all projects):
~/.memrl/
├── config.toml              # Configuration
├── episodes/                # Episode JSON files
│   └── 2026-01-25/
│       └── session-abc123.json
├── vectors/                 # Vector database
│   └── episodes.lance/      # LanceDB embeddings
└── models/                  # Embedding model cache (~128MB)
    └── models--Xenova--bge-small-en-v1.5/
All projects share the same memory database, enabling cross-project learning.
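If you want to inspect episodes from a script, the layout above suggests paths of the form episodes/<date>/session-<id>.json. The following sketch builds and lists such paths. The helper names are hypothetical and the naming pattern is inferred from the single example in the tree, so verify it against your own ~/.memrl/ before relying on it.

```python
from datetime import date
from pathlib import Path


def episode_path(data_dir: Path, day: date, session_id: str) -> Path:
    """Build <data_dir>/episodes/<YYYY-MM-DD>/session-<id>.json."""
    return data_dir / "episodes" / day.isoformat() / f"session-{session_id}.json"


def list_episode_files(data_dir: Path) -> list[Path]:
    """Return every episode JSON file, oldest date directory first."""
    return sorted((data_dir / "episodes").glob("*/session-*.json"))


# Reproduces the example path shown in the directory tree.
path = episode_path(Path.home() / ".memrl", date(2026, 1, 25), "abc123")
```

Because the date directories use ISO format, a plain lexicographic sort also orders them chronologically.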
The RL Behind the Scenes
MemRL uses reinforcement learning concepts:
| Parameter | Default | Purpose |
|---|---|---|
| decay_rate | 0.01 | 1% utility decay per day |
| discount_factor | 0.9 | RL gamma for Bellman updates |
| learning_rate | 0.1 | Conservative alpha for updates |
| propagation_threshold | 0.5 | Min similarity for propagation |
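One way these parameters could interact in a single propagation step is sketched below. The update rule is an assumption chosen to resemble a Bellman-style backup (a neighbor moves a fraction alpha toward gamma × similarity × source utility, and only upward); this README does not specify MemRL's actual formula, and the function and argument names are hypothetical.

```python
DISCOUNT_FACTOR = 0.9        # gamma, from the table above
LEARNING_RATE = 0.1          # alpha, from the table above
PROPAGATION_THRESHOLD = 0.5  # minimum similarity, from the table above


def propagate(
    source_utility: float,
    neighbor_utilities: dict[str, float],
    similarities: dict[str, float],
    gamma: float = DISCOUNT_FACTOR,
    alpha: float = LEARNING_RATE,
    threshold: float = PROPAGATION_THRESHOLD,
) -> dict[str, float]:
    """Return neighbor utilities after one assumed propagation step."""
    updated = dict(neighbor_utilities)
    for episode_id, similarity in similarities.items():
        if episode_id not in updated or similarity < threshold:
            continue  # too dissimilar to receive any value
        target = gamma * similarity * source_utility
        current = updated[episode_id]
        if target > current:  # assumed: only boost, never pull a neighbor down
            updated[episode_id] = current + alpha * (target - current)
    return updated
```

With a source at utility 0.9, a neighbor at 0.5 with similarity 0.8 has a target of 0.648 and moves to 0.5148, while a neighbor with similarity 0.4 is left unchanged because it falls below the threshold.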
Episode Lifecycle:
Captured → Indexed → Retrieved → Feedback → Utility Updated → Propagated
↓
[Low utility + old]
↓
Pruned
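The pruning branch of the lifecycle ("low utility + old") can be read as a conjunction of two conditions. The sketch below expresses that reading. The cutoff values (utility below 0.1, older than 90 days) and the function name are hypothetical placeholders, since this README does not give the real thresholds.

```python
from datetime import date


def should_prune(
    utility: float,
    captured_on: date,
    today: date,
    min_utility: float = 0.1,  # hypothetical cutoff
    max_age_days: int = 90,    # hypothetical cutoff
) -> bool:
    """True only when an episode is both low in utility and old."""
    age_days = (today - captured_on).days
    return utility < min_utility and age_days > max_age_days
```

Requiring both conditions means a recent episode with no feedback yet is kept, as is an old episode that still carries high utility.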
Maintenance
Run periodically to keep memory healthy:
# Weekly: Propagate utility values
# Monthly: Clean up old/useless episodes
# As needed: Check health
Environment Variables
| Variable | Description |
|---|---|
| ANTHROPIC_API_KEY | For LLM-based intent extraction (--extract-intent) |
| MEMRL_DATA_DIR | Override default data directory |
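Scripts that read MemRL's files may want to honor the same override. A minimal sketch, assuming an unset or empty MEMRL_DATA_DIR falls back to ~/.memrl and that a leading ~ in the override should be expanded (both are assumptions, and resolve_data_dir is a hypothetical helper, not part of MemRL):

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the data directory, preferring MEMRL_DATA_DIR when it is set."""
    env = os.environ if env is None else env
    override = env.get("MEMRL_DATA_DIR")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".memrl"
```

Passing a plain dict as env makes the function easy to exercise without touching the real environment.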
Troubleshooting
MCP server not loading
- Check path: ls /path/to/memrl-mcp
- Check config: cat ~/.claude.json
- Restart Claude Code completely
- Run /mcp to verify
Embeddings slow on first run
The BGE-Small model (~128MB) downloads on first use from HuggingFace. This requires internet access. After download, the model is cached at ~/.memrl/models/ and works offline.
Vector search not finding anything
Run memrl index to create/update the vector database.
Model download fails
If behind a firewall or proxy, ensure access to huggingface.co. The model files are downloaded via HTTPS.
License
Apache 2.0
Contributing
Contributions welcome! Please open an issue or PR.