mnemefusion-core 0.1.4

Unified memory engine for AI applications - Core library
# MnemeFusion


**Atomic memory engine for AI applications — one database per entity.**

MnemeFusion gives each entity its own self-contained memory database. Five retrieval dimensions (semantic, keyword, temporal, causal, entity profile) are fused into a single ranked result, all in one portable `.mfdb` file with zero external dependencies.

Think SQLite for AI memory: one file per user, per contact, or per conversation — embedded in your application.

[![CI](https://github.com/gkanellopoulos/mnemefusion/actions/workflows/ci.yml/badge.svg)](https://github.com/gkanellopoulos/mnemefusion/actions/workflows/ci.yml)
[![crates.io](https://img.shields.io/crates/v/mnemefusion-core.svg)](https://crates.io/crates/mnemefusion-core)
[![PyPI CPU](https://img.shields.io/pypi/v/mnemefusion-cpu.svg?label=pypi%20cpu)](https://pypi.org/project/mnemefusion-cpu/)
[![PyPI GPU](https://img.shields.io/pypi/v/mnemefusion.svg?label=pypi%20gpu)](https://pypi.org/project/mnemefusion/)
[![docs.rs](https://docs.rs/mnemefusion-core/badge.svg)](https://docs.rs/mnemefusion-core)
[![License](https://img.shields.io/badge/license-MIT%2FApache--2.0-blue.svg)](LICENSE-MIT)

*MnemeFusion was designed and directed by [George Kanellopoulos](https://github.com/gkanellopoulos), with implementation substantially assisted by [Claude Code](https://docs.anthropic.com/en/docs/claude-code) (Anthropic). The project grew out of an exploration into building a complex, multi-dimensional AI memory engine through human-AI collaboration — the commit history reflects the authentic development process.*

## Atomic Architecture


MnemeFusion follows an **atomic design**: each entity (a user, a contact, a conversation) maps to its own `.mfdb` database file. This 1:1 mapping is the core architectural principle.

Memory retrieval degrades when unrelated conversations share a database — relevant memories get buried by noise from other entities. By scoping each database to a single entity, all five retrieval dimensions stay focused and retrieval stays precise, even as conversation history grows to thousands of turns.

This mirrors how production AI systems work: a personal assistant remembers *one user's* conversations, a CRM agent tracks *one contact's* history, a therapy bot maintains *one patient's* sessions. Each gets its own `.mfdb` file.
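In application code, the 1:1 entity-to-file mapping usually reduces to a small path helper. The sketch below is one way to do it — the directory layout and sanitization rule are application-level conventions, not anything MnemeFusion prescribes:

```python
from pathlib import Path

def mfdb_path(root: str, entity_id: str) -> Path:
    """Map an entity ID to its own .mfdb file under a common root.

    Sanitizes the ID so arbitrary strings (emails, UUIDs) produce
    valid filenames. The layout is an app-level convention.
    """
    safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in entity_id)
    path = Path(root) / f"{safe}.mfdb"
    path.parent.mkdir(parents=True, exist_ok=True)
    return path

# Each entity then gets its own database:
# mem = mnemefusion.Memory(str(mfdb_path("./memories", "alice")), {"embedding_dim": 768})
```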

## Features


- **Five Retrieval Pathways**: Semantic vector search, BM25 keyword matching, temporal range queries, causal graph traversal, entity profile scoring
- **Reciprocal Rank Fusion**: Fuses all five dimensions into a single ranked result set
- **Entity Profiles**: LLM-powered entity extraction builds structured knowledge graphs from unstructured text
- **Single File Storage**: All data in one portable `.mfdb` file with ACID transactions (redb)
- **Intent Classification**: Automatic query routing (temporal, causal, entity, factual)
- **Namespace Isolation**: Multi-user memory separation
- **Rust Core**: Memory-safe, high-performance embedded library
- **Python Bindings**: First-class Python API via PyO3
- **Optional GPU Acceleration**: CUDA-accelerated entity extraction via llama-cpp
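Reciprocal Rank Fusion, the second bullet above, is simple enough to sketch in a few lines: each retrieval pathway contributes `1/(k + rank)` per result, and the per-pathway scores are summed. The constant `k = 60` below is the value from the original RRF literature — MnemeFusion's internal constant and tie-breaking may differ:

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60):
    """Fuse several ranked lists of memory IDs into one ranking.

    Each list contributes 1 / (k + rank) for every ID it contains,
    so IDs ranked highly by multiple pathways rise to the top.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, mem_id in enumerate(ranking, start=1):
            scores[mem_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Each pathway returns its own ranking of memory IDs:
fused = rrf_fuse([
    ["m3", "m1", "m7"],   # semantic
    ["m1", "m3", "m2"],   # keyword (BM25)
    ["m1", "m9"],         # temporal
])
print(fused[0])  # m1 — ranked near the top by all three pathways
```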

## Benchmarks


Evaluated on two established conversational memory benchmarks ([LoCoMo](evals/locomo/), [LongMemEval](evals/longmemeval/)) using standard protocols. The LongMemEval results validate the atomic architecture — per-entity databases maintain high accuracy where a shared database collapses:

| Benchmark | Mode | What it tests | Score |
|-----------|------|---------------|-------|
| [LoCoMo](evals/locomo/) | Standard | Overall accuracy across 10 conversations (1,540 questions) | **69.9% ± 0.4%** |
| [LongMemEval](evals/longmemeval/) | Oracle | Pipeline quality — extraction + RAG + scoring (500 questions) | **91.4%** |
| [LongMemEval](evals/longmemeval/) | Per-entity | Production pattern — one DB per conversation, ~500 turns each (176 questions) | **67.6%** |
| [LongMemEval](evals/longmemeval/) | Shared DB | All conversations in one DB — the anti-pattern (500 questions) | 37.2% |

**Reading the numbers:** The oracle result (91.4%) proves the pipeline works when given the right evidence. The per-entity result (67.6%) shows production performance with the recommended atomic architecture. The shared-DB result (37.2%) demonstrates why per-entity scoping matters — accuracy drops 30 points relative to the per-entity result (and 54 relative to the oracle) when unrelated conversations compete for retrieval slots.

See [evals/](evals/) for full methodology, per-category breakdowns, datasets, and reproduction instructions.

## Quick Start


For a complete runnable example, see [`examples/minimal.py`](examples/minimal.py) — no GPU or GGUF model required. For an interactive demo, see the [Chat Demo](apps/) (Streamlit).

### Python


```bash
# CPU-only (development / experimentation)

pip install mnemefusion-cpu sentence-transformers

# GPU with CUDA (production — Linux x86_64, requires NVIDIA driver 525+)

pip install mnemefusion sentence-transformers
```

```python
import mnemefusion
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

# Open or create a database (768 = BGE-base embedding dimension)

mem = mnemefusion.Memory("./brain.mfdb", {"embedding_dim": 768})

# Set embedding function for automatic vectorization

mem.set_embedding_fn(lambda text: model.encode(text).tolist())

# Add memories

mem.add("Alice loves hiking in the mountains", metadata={"speaker": "narrator"})
mem.add("Bob started learning piano last month", metadata={"speaker": "narrator"})

# Multi-dimensional query — returns (intent, results, profile_context)

intent, results, profiles = mem.query("What are Alice's hobbies?", limit=10)

print(f"Intent: {intent['intent']} (confidence: {intent['confidence']:.2f})")
for memory_dict, scores_dict in results:
    print(f"  [{scores_dict['fused_score']:.3f}] {memory_dict['content']}")

# Profile context contains entity facts for RAG augmentation

for fact_str in profiles:
    print(f"  Profile: {fact_str}")
```

### With User Identity


```python
# Namespace isolation + first-person pronoun resolution

mem = mnemefusion.Memory("./brain.mfdb", {"embedding_dim": 768}, user="alice")
mem.set_embedding_fn(lambda text: model.encode(text).tolist())

# Memories are namespaced to "alice"

mem.add("I love hiking in the mountains")

# Map "I"/"me"/"my" → "alice" entity profile at query time

mem.set_user_entity("alice")

# "my hobbies" resolves to alice's profile

intent, results, profiles = mem.query("What are my hobbies?")
```

### With LLM Entity Extraction


Entity extraction uses a local GGUF model (no cloud API needed). Download a supported model:

```bash
pip install huggingface-hub

# Recommended: Phi-4-mini (3.8B, ~2.3GB, best accuracy)*
# Requires Hugging Face authentication: huggingface-cli login
huggingface-cli download microsoft/Phi-4-mini-instruct-gguf Phi-4-mini-instruct-Q4_K_M.gguf --local-dir models/

# Alternative (no auth required): Qwen2.5-3B (~2GB)
huggingface-cli download Qwen/Qwen2.5-3B-Instruct-GGUF qwen2.5-3b-instruct-q4_k_m.gguf --local-dir models/
```

*\*MnemeFusion's extraction prompts have been tested and tuned with Phi-4-mini. Other models may work but with reduced extraction quality.*

```python
mem = mnemefusion.Memory("./brain.mfdb", {"embedding_dim": 768})
mem.set_embedding_fn(lambda text: model.encode(text).tolist())
mem.enable_llm_entity_extraction("models/Phi-4-mini-instruct-Q4_K_M.gguf", tier="balanced")

# Entity extraction runs automatically on add()
mem.add("Caroline studies marine biology at Stanford")

# Entity profiles are built incrementally
profile = mem.get_entity_profile("caroline")
# {'name': 'caroline', 'entity_type': 'person', 'facts': {...}, 'summary': '...'}
```

Requires a GPU with 4GB+ VRAM for reasonable speed. CPU-only works but is ~10x slower. For GPU acceleration, install the GPU package: `pip install mnemefusion`.

### Rust


```toml
[dependencies]
mnemefusion-core = "0.1"
```

```rust
use mnemefusion_core::{MemoryEngine, Config};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let engine = MemoryEngine::open("./brain.mfdb", Config::default())?;

    // Add a memory with embedding vector
    let embedding = vec![0.1; 384]; // From your embedding model
    engine.add(
        "Project deadline moved to March 15th".to_string(),
        embedding,
        None, // metadata
        None, // timestamp
        None, // source
        None, // namespace
    )?;

    // Query with multi-dimensional fusion
    let query_embedding = vec![0.1; 384];
    let (_intent, results, _profiles) = engine.query(
        "When is the project deadline?",
        query_embedding,
        10,    // limit
        None,  // namespace
        None,  // filters
    )?;

    for (memory, scores) in &results {
        println!("[{:.3}] {}", scores.fused_score, memory.content);
    }

    engine.close()?;
    Ok(())
}
```

## Architecture


![MnemeFusion Architecture](mnemefusion_architecture_v2.svg)

## Python API Reference


### Core Operations


| Method | Description |
|--------|-------------|
| `Memory(path, config=None, user=None)` | Open or create a database |
| `add(content, embedding=None, metadata=None, timestamp=None, source=None, namespace=None)` | Add a memory |
| `query(query_text, query_embedding=None, limit=10, namespace=None, filters=None)` | Multi-dimensional query returning `(intent, results, profiles)` |
| `search(query_embedding, top_k, namespace=None, filters=None)` | Pure semantic similarity search |
| `get(memory_id)` | Retrieve memory by ID |
| `delete(memory_id)` | Delete memory by ID |
| `close()` | Close database and save indexes |

### Batch Operations


| Method | Description |
|--------|-------------|
| `add_batch(memories, namespace=None)` | Bulk insert (10x+ faster) |
| `add_with_dedup(content, embedding, ...)` | Add with duplicate detection |
| `upsert(key, content, embedding, ...)` | Insert or update by logical key |
| `delete_batch(memory_ids)` | Bulk delete |
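The idea behind `add_with_dedup` — compare the new embedding against existing ones and skip the insert when similarity crosses a threshold — can be sketched independently of the library. The cosine measure and the 0.95 threshold below are illustrative assumptions, not the library's documented internals:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def is_duplicate(new_emb, existing_embs, threshold=0.95):
    """True if new_emb is near-identical to any stored embedding."""
    return any(cosine(new_emb, e) >= threshold for e in existing_embs)

store = [[1.0, 0.0, 0.0]]
print(is_duplicate([0.99, 0.05, 0.0], store))  # True — nearly the same vector
print(is_duplicate([0.0, 1.0, 0.0], store))    # False — orthogonal
```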

### Entity & Profile Management


| Method | Description |
|--------|-------------|
| `enable_llm_entity_extraction(model_path, tier="balanced", extraction_passes=1)` | Enable LLM extraction |
| `set_user_entity(name)` | Map first-person pronouns to user entity |
| `list_entity_profiles()` | List all entity profiles |
| `get_entity_profile(name)` | Get profile by name (case-insensitive) |
| `consolidate_profiles()` | Remove noise from profiles |
| `summarize_profiles()` | Generate profile summaries |

### Diagnostics


| Method | Description |
|--------|-------------|
| `last_query_trace()` | Step-by-step trace of the most recent `query()` call (requires `enable_trace=True` in config) |

### Metadata Filtering


```python
# Filter by metadata key-value pairs (AND logic)

filters = [
    {"metadata_key": "speaker", "metadata_value": "Alice"},
    {"metadata_key": "session", "metadata_value": "2024-01-15"},
]
intent, results, profiles = mem.query("hiking plans", filters=filters)
```
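The AND semantics can be stated in plain Python: a memory matches only if every pair in the filter list is present in its metadata. This re-implements the behavior for illustration — the engine applies it internally during retrieval:

```python
def matches_filters(metadata, filters):
    """AND logic: every filter pair must match the memory's metadata."""
    return all(
        metadata.get(f["metadata_key"]) == f["metadata_value"]
        for f in filters
    )

memory = {"speaker": "Alice", "session": "2024-01-15", "topic": "hiking"}
filters = [
    {"metadata_key": "speaker", "metadata_value": "Alice"},
    {"metadata_key": "session", "metadata_value": "2024-01-15"},
]
print(matches_filters(memory, filters))  # True — both pairs match
print(matches_filters(memory, filters + [
    {"metadata_key": "speaker", "metadata_value": "Bob"},
]))  # False — one pair fails, so the AND fails
```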

### Namespace System


```python
# Add to specific namespace

mem.add("secret note", namespace="alice")

# Query within namespace

intent, results, profiles = mem.query("notes", namespace="alice")

# Or use the user= constructor shortcut

mem = mnemefusion.Memory("brain.mfdb", user="alice")
# All add/query calls default to the "alice" namespace

```

## Configuration


```python
config = {
    "embedding_dim": 384,              # Must match your embedding model
    "entity_extraction_enabled": True,  # Enable built-in entity extraction
    "llm_model": "path/to/model.gguf", # Auto-enables LLM extraction
    "extraction_passes": 3,             # Multi-pass diverse extraction
    "async_extraction_threshold": 500,  # Defer extraction for large docs
    "enable_trace": True,               # Record step-by-step query traces
}
mem = mnemefusion.Memory("brain.mfdb", config=config)
```

```rust
use mnemefusion_core::Config;

let config = Config::new()
    .with_embedding_dim(384)
    .with_entity_extraction(true);

let engine = MemoryEngine::open("./brain.mfdb", config)?;
```

## Error Handling


All errors surface as standard Python exceptions — no custom exception types.

| Exception | When | Recoverable |
|-----------|------|-------------|
| `IOError` | Database open/close fails, disk full, file not found, concurrent open of same file | Usually yes (fix path, free disk, close other instance) |
| `ValueError` | Wrong embedding dimension, invalid memory ID, bad config | Yes (fix input) |
| `RuntimeError` | Calling methods after `close()` | Reopen with a new `Memory()` instance |

```python
import mnemefusion

mem = mnemefusion.Memory("brain.mfdb")

# After close(), all operations raise RuntimeError

mem.close()
try:
    mem.add("text")
except RuntimeError as e:
    print(e)  # "Database is closed"

# Each .mfdb file supports one open instance at a time

mem1 = mnemefusion.Memory("brain.mfdb")
try:
    mem2 = mnemefusion.Memory("brain.mfdb")  # Same file
except IOError as e:
    print(e)  # File lock error
```

## Building from Source


### Prerequisites


- Rust 1.75+
- Python 3.9+ (for Python bindings)

### Build


```bash
git clone https://github.com/gkanellopoulos/mnemefusion.git
cd mnemefusion

# Build core library

cargo build --release

# Run tests (520+ tests)

cargo test -p mnemefusion-core --lib

# Build Python bindings

cd mnemefusion-python
maturin develop --release

# With CUDA GPU support (requires CUDA toolkit)

maturin develop --release --features entity-extraction-cuda
```

## Testing


```bash
# All library unit tests

cargo test -p mnemefusion-core --lib

# With output

cargo test -p mnemefusion-core --lib -- --nocapture

# Run specific test module

cargo test -p mnemefusion-core profile
```

## Language Support


MnemeFusion's core search works with any language via multilingual embeddings. Entity extraction and intent classification are currently English-optimized.

| Feature | Language Support |
|---------|-----------------|
| Vector search | All languages (use multilingual embeddings) |
| BM25 keyword search | English-optimized (Porter stemming) |
| Temporal indexing | All languages |
| Causal links | All languages |
| Entity extraction | English (optional, can be disabled) |
| Metadata filtering | All languages |

For non-English use, disable entity extraction:

```python
config = {"entity_extraction_enabled": False, "embedding_dim": 768}
mem = mnemefusion.Memory("brain.mfdb", config=config)
```

## API Stability


MnemeFusion is pre-1.0. The following APIs are considered **stable** and will not change without a version bump:

| API | Stable Since |
|-----|-------------|
| `Memory(path, config, user)` | 0.1.0 |
| `add(content, embedding, metadata, timestamp)` | 0.1.0 |
| `query(query_text, query_embedding, limit, namespace, filters)` | 0.1.0 |
| `search(query_embedding, top_k, namespace, filters)` | 0.1.0 |
| `get(memory_id)` / `delete(memory_id)` | 0.1.0 |
| `close()` | 0.1.0 |
| `add_batch(memories, namespace)` | 0.1.0 |
| `set_embedding_fn(fn)` | 0.1.0 |

Everything else (entity extraction API, profile management, config keys) may change between minor versions. The `.mfdb` file format includes embedded version metadata — format-breaking changes will be documented in the [CHANGELOG](CHANGELOG.md).

## Performance Characteristics


| Operation | Complexity | Typical Latency |
|-----------|-----------|-----------------|
| `add()` | O(log n) HNSW insertion + O(n) BM25 update | <5ms without entity extraction |
| `add()` with LLM extraction | Same + LLM inference | ~3-9s depending on GPU |
| `query()` | O(k·log n) across all dimensions + RRF fusion | ~50ms at 5K memories, ~200ms at 50K |
| `search()` | O(k·log n) vector-only | <10ms |
| `get()` / `delete()` | O(1) key lookup | <1ms |
| Storage overhead | n/a | ~1.5-2x raw content size (384-dim embeddings) |

Tested with up to 10K memories in a single `.mfdb` file. MnemeFusion is designed for per-entity databases — each user, contact, or conversation gets its own `.mfdb` file, typically containing 1K-10K memories. This atomic pattern keeps retrieval precise and scales horizontally.
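The ~1.5-2x overhead figure is dominated by the embedding vectors. A rough back-of-envelope — assuming 4-byte f32 embeddings, ~1.5 KB of content per memory, and ignoring index structures — shows why it lands in that range:

```python
# Embedding storage per memory (f32 = 4 bytes per component):
embedding_dim = 384
embedding_bytes = embedding_dim * 4
print(embedding_bytes)  # 1536 bytes per vector

# With ~1.5 KB of content per memory (an assumed average), embedding
# plus content alone already roughly doubles the raw content size:
content_bytes = 1536
print((embedding_bytes + content_bytes) / content_bytes)  # 2.0
```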

## Contributing


Contributions are welcome! See [CONTRIBUTING.md](CONTRIBUTING.md) for build instructions, test commands, and PR guidelines.

## License


Licensed under either of:

- [Apache License, Version 2.0](LICENSE-APACHE)
- [MIT License](LICENSE-MIT)

at your option.

## Acknowledgments


Built on excellent open-source libraries:
- [redb](https://github.com/cberner/redb) — Embedded key-value store
- [usearch](https://github.com/unum-cloud/usearch) — HNSW vector search
- [petgraph](https://github.com/petgraph/petgraph) — Graph algorithms
- [llama-cpp-2](https://github.com/utilityai/llama-cpp-rs) — Rust bindings for llama.cpp
- [PyO3](https://github.com/PyO3/pyo3) — Rust-Python interop
- [Claude Code](https://docs.anthropic.com/en/docs/claude-code) — AI-assisted development

---

**"SQLite for AI memory"** — One entity, one file. Five dimensions. Zero complexity.