We built this because AI agents forget everything between sessions. They make the same mistakes, ask the same questions, lose context constantly.
Shodh-Memory fixes that. It's a cognitive memory system—Hebbian learning, activation decay, semantic consolidation—packed into a single 8MB binary that runs offline.
How it works:
Experiences flow through three tiers based on Cowan's working memory model [1]. New information enters capacity-limited working memory, overflows into session storage, and consolidates into long-term memory based on importance. When memories are retrieved together successfully, their connections strengthen—classic Hebbian learning [2]. After enough co-activations, those connections become permanent. Unused memories naturally fade.
Working Memory ──overflow──▶ Session Memory ──importance──▶ Long-Term Memory
(100 items) (500 MB) (RocksDB)
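The tier flow above can be sketched in a few lines. This is a toy illustration under stated assumptions, not Shodh-Memory's actual implementation: the names `Memory` and `TieredMemory`, the explicit `consolidate()` step, and the 0.5 importance cutoff are hypothetical, and the 500 MB session limit is left out for brevity.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Memory:
    content: str
    importance: float  # 0.0-1.0; how importance is scored is out of scope here


@dataclass
class TieredMemory:
    """Toy three-tier store: working -> session -> long-term."""
    working_capacity: int = 100        # from the diagram: 100 items
    importance_threshold: float = 0.5  # assumption: hypothetical consolidation cutoff
    working: deque = field(default_factory=deque)
    session: deque = field(default_factory=deque)
    long_term: list = field(default_factory=list)  # stands in for RocksDB

    def remember(self, memory: Memory) -> None:
        self.working.append(memory)
        # Overflow: the oldest working-memory items spill into session storage.
        while len(self.working) > self.working_capacity:
            self.session.append(self.working.popleft())

    def consolidate(self) -> None:
        # Important session items move to long-term; the rest stay in session.
        remaining = deque()
        for memory in self.session:
            if memory.importance >= self.importance_threshold:
                self.long_term.append(memory)
            else:
                remaining.append(memory)
        self.session = remaining
```

With `working_capacity=2`, storing four memories leaves the two newest in working memory and pushes the two oldest into session storage; calling `consolidate()` then promotes only those at or above the threshold.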
Architecture
Storage & Retrieval
- Vamana graph index for approximate nearest neighbor search [3]
- MiniLM-L6 embeddings (384-dim, 25MB) for semantic similarity
- TinyBERT NER (15MB) for entity extraction (Person, Organization, Location, Misc)
- RocksDB for durable persistence across restarts
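For intuition, the question an approximate nearest neighbor index answers is "which stored vectors are most similar to this query vector?". The sketch below answers it exactly by brute force. It is a stand-in for illustration, not the Vamana algorithm; it assumes cosine similarity as the metric, uses 3-dimensional toy vectors in place of 384-dimensional MiniLM embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query, vectors, k=2):
    """Return the ids of the k stored vectors most similar to query (exact scan)."""
    ranked = sorted(
        vectors,
        key=lambda vid: cosine_similarity(query, vectors[vid]),
        reverse=True,
    )
    return ranked[:k]


# Toy "embeddings" keyed by memory id.
vectors = {
    "deploy-notes": [0.9, 0.1, 0.0],
    "lunch-order": [0.0, 0.2, 0.9],
    "release-checklist": [0.8, 0.3, 0.1],
}
top = nearest([1.0, 0.2, 0.0], vectors, k=2)
```

An exact scan costs one similarity computation per stored vector; a graph index such as Vamana trades a small amount of recall for visiting far fewer vectors per query.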
Cognitive Processing
- Named entity recognition — TinyBERT extracts entities; boosts importance and enables graph relationships
- Spreading activation retrieval — queries activate related memories through semantic and graph connections [5]
- Activation decay — exponential decay A(t) = A₀ · e^(-λt) applied each maintenance cycle
- Hebbian strengthening — co-retrieved memories form graph edges; weight increases on co-activation
- Long-term potentiation — edges surviving threshold co-activations become permanent
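The decay, Hebbian strengthening, and long-term potentiation steps listed above can be sketched together. This is a minimal illustration, not the project's code: the learning rate, the bounded update rule `w += rate * (1 - w)`, the threshold of 5 co-activations, and all names are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass
from itertools import combinations

LEARNING_RATE = 0.1  # assumption: hypothetical Hebbian step size
LTP_THRESHOLD = 5    # assumption: co-activations needed before an edge is permanent


@dataclass
class Edge:
    weight: float = 0.0
    co_activations: int = 0
    permanent: bool = False


def decay(activation: float, decay_rate: float, elapsed: float) -> float:
    """Exponential decay A(t) = A0 * e^(-lambda * t)."""
    return activation * math.exp(-decay_rate * elapsed)


def strengthen(edges: dict, retrieved_ids: list) -> None:
    """Hebbian update: each pair of co-retrieved memories gets a stronger edge."""
    for a, b in combinations(sorted(retrieved_ids), 2):
        edge = edges.setdefault((a, b), Edge())
        # Move the weight a fraction of the way toward 1.0 so it stays bounded.
        edge.weight += LEARNING_RATE * (1.0 - edge.weight)
        edge.co_activations += 1
        if edge.co_activations >= LTP_THRESHOLD:
            edge.permanent = True  # long-term potentiation


edges = {}
for _ in range(5):
    strengthen(edges, ["m1", "m2"])  # retrieved together five times
strengthen(edges, ["m2", "m3"])      # retrieved together once
```

After this run the `("m1", "m2")` edge has reached the threshold and is marked permanent, while `("m2", "m3")` is still a weak, non-permanent edge.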
Semantic Consolidation
- Episodic memories older than 7 days compress into semantic facts
- Entity extraction preserves key information during compression
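As a rough sketch of the selection step, the code below partitions episodic memories by the 7-day cutoff and merges the old ones while keeping every extracted entity. How Shodh-Memory actually compresses episodes into semantic facts is not described here, so `compress` is only a placeholder that joins contents; `Episode` and both function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

CONSOLIDATION_AGE = timedelta(days=7)  # from the text: older than 7 days


@dataclass
class Episode:
    content: str
    entities: frozenset
    created_at: datetime


def split_for_consolidation(episodes, now):
    """Partition episodes into (old enough to consolidate, still recent)."""
    old = [e for e in episodes if now - e.created_at > CONSOLIDATION_AGE]
    recent = [e for e in episodes if now - e.created_at <= CONSOLIDATION_AGE]
    return old, recent


def compress(episodes):
    """Placeholder compression: keep every entity, join the contents."""
    entities = frozenset().union(*(e.entities for e in episodes))
    return {"fact": " / ".join(e.content for e in episodes), "entities": entities}
```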
Use cases
Local LLM memory: Give Claude, GPT, or a local model persistent memory across sessions.
Robotics & drones: On-device experience accumulation without cloud round-trips.
Edge AI: Runs on industrial PCs today; Jetson and Raspberry Pi (Linux ARM64) support is coming soon. Microsecond-scale graph lookups, zero network dependency.
Personal knowledge base: Your own searchable memory. Decisions, learnings, and discoveries stay private and local.
Compared to alternatives
| | Shodh-Memory | Mem0 | Cognee |
|---|---|---|---|
| Deployment | Single 8MB binary | Cloud API | Neo4j + Vector DB |
| Offline | 100% | No | Partial |
| Learning | Hebbian + decay + LTP | Vector similarity | Knowledge graphs |
| Latency | Sub-millisecond graph operations; 34-58ms semantic recall | Network-bound | Database-bound |
| Best for | Local-first, edge, privacy | Cloud scale | Enterprise ETL |
Performance
Measured on Intel i7-1355U (10 cores, 1.7GHz), release build.
API Latencies
| Endpoint | Operation | Latency |
|---|---|---|
| `POST /api/remember` | Store memory (existing user) | 55-60ms |
| `POST /api/recall` | Semantic search | 34-58ms |
| `POST /api/recall/tags` | Tag-based search | ~1ms |
| `GET /api/list` | List memories | ~1ms |
| `GET /health` | Health check | ~1ms |
Knowledge Graph (Criterion benchmarks)
| Operation | Latency |
|---|---|
| Entity lookup | 763ns |
| Relationship query | 2.2µs |
| Hebbian strengthen | 5.7µs |
| Graph traversal (3-hop) | 30µs |
Neural Models
| Model | Operation | Latency |
|---|---|---|
| MiniLM-L6-v2 (25MB) | Embedding (384-dim) | 33ms |
| TinyBERT-NER (15MB) | Entity extraction | 15ms |
Installation
Claude Code / Claude Desktop:
Add to your claude_desktop_config.json:
Config file locations:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json
- Linux: ~/.config/Claude/claude_desktop_config.json
Python:
pip install shodh-memory
From source:
cargo build --release
./target/release/shodh-memory-server
Usage
Python
# Store
# Search
# Get memory statistics
REST API
# Store
# Search
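The Store and Search calls can be sketched with Python's standard library alone. The endpoint paths, the `X-API-Key` header, and the default port 3030 come from this README; the JSON field names (`user_id`, `content`, `memory_type`, `query`, `limit`) are assumptions for illustration and may not match the real request schema. The requests are built but not sent, so the snippet runs without a server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:3030"  # default port listed under Configuration


def build_request(path, payload, api_key):
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


# Assumption: the field names below are illustrative, not the documented schema.
store = build_request(
    "/api/remember",
    {"user_id": "demo", "content": "Prefers dark mode", "memory_type": "Decision"},
    api_key="key1",
)
search = build_request(
    "/api/recall",
    {"user_id": "demo", "query": "UI preferences", "limit": 5},
    api_key="key1",
)

# To send a request against a running server:
# with urllib.request.urlopen(store) as response:
#     print(json.load(response))
```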
Memory types
Different types get different importance weights in the scoring model:
- Decision (+0.30) — choices, preferences, conclusions
- Learning (+0.25) — new knowledge, facts learned
- Error (+0.25) — mistakes, things to avoid
- Discovery, Pattern (+0.20) — findings, recurring behaviors
- Task (+0.15) — work items
- Context, Observation (+0.10) — general info
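One way to read these weights is as an additive bonus on top of a base score. The sketch below assumes exactly that, plus clamping to the range 0 to 1; the base score, the clamping, and the function name are assumptions, since the full scoring model is not described here. Only the bonus values come from the list above.

```python
# Bonus values copied from the list above.
TYPE_BONUS = {
    "Decision": 0.30,
    "Learning": 0.25,
    "Error": 0.25,
    "Discovery": 0.20,
    "Pattern": 0.20,
    "Task": 0.15,
    "Context": 0.10,
    "Observation": 0.10,
}


def importance(base_score: float, memory_type: str) -> float:
    """Add the type bonus to a base score and clamp the result to [0, 1]."""
    bonus = TYPE_BONUS.get(memory_type, 0.0)  # unknown types get no bonus
    return min(1.0, max(0.0, base_score + bonus))
```

Under these assumptions a Decision with a base score of 0.5 scores about 0.8, while the same content stored as an Observation scores about 0.6.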
API reference
Python client (API parity with REST)
| Method | What it does |
|---|---|
| Core Memory | |
| `remember(content, memory_type, tags, ...)` | Store a memory |
| `recall(query, limit, mode, ...)` | Semantic search |
| `list_memories(limit, memory_type)` | List all memories |
| `get_memory(memory_id)` | Get single memory by ID |
| `get_stats()` | Memory statistics |
| Forget Operations | |
| `forget(memory_id)` | Delete single memory by ID |
| `forget_by_age(days)` | Delete memories older than N days |
| `forget_by_importance(threshold)` | Delete low-importance memories |
| `forget_by_pattern(regex)` | Delete memories matching pattern |
| `forget_by_tags(tags)` | Delete memories by tags |
| `forget_by_date(start, end)` | Delete memories in date range |
| `forget_all()` | Delete ALL memories (GDPR) |
| Context & Introspection | |
| `context_summary(max_items, ...)` | Categorized context for LLM bootstrap |
| `brain_state(longterm_limit)` | 3-tier memory visualization |
| `flush()` | Flush data to disk |
REST endpoints
All protected endpoints require X-API-Key header.
| Endpoint | Method | Description | Avg Latency |
|---|---|---|---|
| Core Memory | | | |
| `/api/remember` | POST | Store memory (embedding + NER) | 55ms |
| `/api/recall` | POST | Semantic search | 45ms |
| `/api/recall/tags` | POST | Tag-based search (no embedding) | 1ms |
| `/api/recall/date` | POST | Date-range search | 5ms |
| `/api/list/{user_id}` | GET | List all memories | 1ms |
| `/api/context_summary` | POST | Categorized context for session bootstrap | 15ms |
| Forget Operations | | | |
| `/api/forget/age` | POST | Delete memories older than threshold | 5ms |
| `/api/forget/importance` | POST | Delete low-importance memories | 5ms |
| `/api/forget/pattern` | POST | Delete memories matching regex | 10ms |
| `/api/forget/tags` | POST | Delete memories by tags | 5ms |
| `/api/forget/date` | POST | Delete memories in date range | 5ms |
| Hebbian Learning | | | |
| `/api/retrieve/tracked` | POST | Search with feedback tracking | 45ms |
| `/api/reinforce` | POST | Hebbian reinforcement feedback | 10ms |
| Batch & Consolidation | | | |
| `/api/batch_remember` | POST | Store multiple memories | 55ms/item |
| `/api/consolidate` | POST | Trigger semantic consolidation | 250ms |
| Introspection | | | |
| `/api/memory/{id}` | GET/PUT/DELETE | Single memory operations | 10ms |
| `/api/users/{id}/stats` | GET | User statistics | 10ms |
| `/api/graph/{id}/stats` | GET | Knowledge graph statistics | 10ms |
| `/api/brain/{user_id}` | GET | 3-tier state visualization | 50ms |
| `/api/search/advanced` | POST | Multi-filter search | 50ms |
| Health & Metrics | | | |
| `/health` | GET | Health check (no auth) | <1ms |
| `/health/live` | GET | Kubernetes liveness (no auth) | <1ms |
| `/health/ready` | GET | Kubernetes readiness (no auth) | <1ms |
| `/metrics` | GET | Prometheus metrics (no auth) | <1ms |
Authentication
# Development mode (SHODH_API_KEYS not set)
# Production mode (required)
Configuration
SHODH_PORT=3030 # Default: 3030
SHODH_MEMORY_PATH=./data # Default: ./shodh_memory_data
SHODH_API_KEYS=key1,key2 # Required in production
SHODH_MAINTENANCE_INTERVAL=300 # Decay cycle (seconds)
SHODH_ACTIVATION_DECAY=0.95 # Decay factor per cycle
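To get a feel for these two decay settings, the sketch below converts a per-cycle factor into the continuous rate λ from A(t) = A₀ · e^(-λt) and into a half-life. It assumes the factor multiplies a memory's activation once per maintenance interval and that the memory is never reinforced in between; the function names are hypothetical.

```python
import math


def decay_rate(factor: float, interval_s: float) -> float:
    """Continuous rate lambda such that e^(-lambda * interval) == factor."""
    return -math.log(factor) / interval_s


def half_life_seconds(factor: float, interval_s: float) -> float:
    """Time for an un-reinforced activation to fall to half its value."""
    return math.log(2) / decay_rate(factor, interval_s)


# Values from the configuration above: factor 0.95 every 300 seconds.
lam = decay_rate(0.95, 300)
half_life = half_life_seconds(0.95, 300)
print(f"lambda = {lam:.6f} per second, half-life = {half_life / 60:.1f} minutes")
```

With the values shown, an untouched memory loses half its activation after roughly 13.5 maintenance cycles, a little over an hour.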
Platform support
| Platform | Status | Use case |
|---|---|---|
| Linux x86_64 | ✓ | Servers, workstations |
| macOS ARM64 | ✓ | Development (Apple Silicon) |
| Windows x86_64 | ✓ | Development, industrial PCs |
| Linux ARM64 | Coming soon | Jetson, Raspberry Pi, drones |
References
[1] Cowan, N. (2010). The Magical Mystery Four: How is Working Memory Capacity Limited, and Why? Current Directions in Psychological Science, 19(1), 51-57. https://pmc.ncbi.nlm.nih.gov/articles/PMC4207727/
[2] Magee, J.C., & Grienberger, C. (2020). Synaptic Plasticity Forms and Functions. Annual Review of Neuroscience, 43, 95-117. https://pmc.ncbi.nlm.nih.gov/articles/PMC10410470/
[3] Subramanya, S.J., et al. (2019). DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node. NeurIPS 2019. https://papers.nips.cc/paper/9527-diskann-fast-accurate-billion-point-nearest-neighbor-search-on-a-single-node
[4] Dudai, Y., Karni, A., & Born, J. (2015). The Consolidation and Transformation of Memory. Neuron, 88(1), 20-32. https://pmc.ncbi.nlm.nih.gov/articles/PMC4183265/
[5] Anderson, J.R. (1983). A Spreading Activation Theory of Memory. Journal of Verbal Learning and Verbal Behavior, 22(3), 261-295.
License
Apache 2.0
MCP Registry · PyPI · npm · GitHub