engramai 0.2.3

Neuroscience-grounded memory system for AI agents. ACT-R activation, Hebbian learning, Ebbinghaus forgetting, cognitive consolidation.
# Engram Embedding Protocol Specification v2

**Status:** Canonical  
**Version:** 2.0  
**Last Updated:** 2026-04-02

---

## 1. Overview

This document defines the **Engram Embedding Protocol v2**, the authoritative specification for storing and managing vector embeddings across all Engram implementations (Rust, Python, and future languages).

All implementations **MUST** conform to this specification to ensure compatibility and data portability.

### 1.1 Key Changes from v1

- **Multi-model support:** Primary key changed from `(memory_id)` to `(memory_id, model)`
- **Binary-only storage:** BLOB format required; TEXT/JSON storage is prohibited
- **Strict validation:** Finite value enforcement and dimension checking
- **Model namespacing:** Structured provider/model naming convention

---

## 2. Database Schema

### 2.1 Table Definition

```sql
CREATE TABLE IF NOT EXISTS memory_embeddings (
    memory_id   TEXT NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    model       TEXT NOT NULL,
    embedding   BLOB NOT NULL,
    dimensions  INTEGER NOT NULL,
    created_at  TEXT NOT NULL,
    PRIMARY KEY (memory_id, model)
);

CREATE INDEX IF NOT EXISTS idx_embeddings_model ON memory_embeddings(model);
```

### 2.2 Column Specifications

| Column | Type | Constraints | Description |
|--------|------|-------------|-------------|
| `memory_id` | TEXT | NOT NULL, FK to memories(id) | UUID or unique identifier of the memory |
| `model` | TEXT | NOT NULL | Namespaced model identifier (see §3) |
| `embedding` | BLOB | NOT NULL | Raw binary f32 vector (see §4) |
| `dimensions` | INTEGER | NOT NULL | Vector dimension count (must match blob size) |
| `created_at` | TEXT | NOT NULL | ISO-8601 UTC timestamp |

### 2.3 Constraints

- **Primary Key:** Composite `(memory_id, model)` enables multiple embeddings per memory
- **Foreign Key:** `memory_id` references `memories(id)` with `ON DELETE CASCADE`
- **Index:** `idx_embeddings_model` optimizes model-scoped queries

---

## 3. Model Naming Convention

### 3.1 Format

Model identifiers **MUST** follow the pattern:

```
{provider}/{model_name}
```

### 3.2 Rules

1. **Immutability:** Once written, model strings are immutable (never update in place)
2. **Case-sensitivity:** Model names are case-sensitive
3. **No whitespace:** Use hyphens or underscores for multi-word names
4. **Provider namespace:** Prevents collisions across different embedding services
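A format check following these rules can be sketched in a few lines. This is illustrative, not normative: the regular expression and the 256-character cap (drawn from the security guidance in §14.1) are one possible encoding of the rules, and the function name is hypothetical.

```python
import re

# One provider segment, one '/', one model segment; no whitespace anywhere.
# The 256-character cap follows the security guidance in §14.1.
MODEL_PATTERN = re.compile(r'[^/\s]+/[^/\s]+')

def is_valid_model(model: str) -> bool:
    """Check a model identifier against the {provider}/{model_name} rules."""
    return len(model) <= 256 and MODEL_PATTERN.fullmatch(model) is not None
```

Under this check, `ollama/nomic-embed-text` is accepted, while `nomic-embed-text` (no provider), `a/b/c` (two separators), and names containing whitespace are rejected.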

### 3.3 Canonical Models

| Model String | Dimensions | Provider |
|--------------|------------|----------|
| `ollama/nomic-embed-text` | 768 | Ollama (local) |
| `ollama/mxbai-embed-large` | 1024 | Ollama (local) |
| `openai/text-embedding-3-small` | 1536 | OpenAI |
| `openai/text-embedding-3-large` | 3072 | OpenAI |
| `openai/text-embedding-ada-002` | 1536 | OpenAI (legacy) |
| `local/minilm-l6-v2` | 384 | Local transformer |
| `cohere/embed-english-v3.0` | 1024 | Cohere |
| `voyage/voyage-2` | 1024 | Voyage AI |

### 3.4 Unknown/Legacy Models

Use `unknown/legacy` for migrated embeddings with unspecified models.

---

## 4. BLOB Format Specification

### 4.1 Binary Layout

Embeddings are stored as **raw little-endian IEEE 754 binary32 (f32) arrays**:

- **No header, no wrapper, no JSON**
- Byte length: `dimensions × 4`
- Endianness: Little-endian on all platforms

### 4.2 Serialization

**Rust:**
```rust
let bytes: Vec<u8> = embedding.iter().flat_map(|f| f.to_le_bytes()).collect();
```

**Python:**
```python
import struct
blob = struct.pack(f'<{len(embedding)}f', *embedding)
```

### 4.3 Deserialization

**Rust:**
```rust
fn bytes_to_f32_vec(bytes: &[u8]) -> Vec<f32> {
    bytes.chunks_exact(4)
        .map(|c| f32::from_le_bytes(c.try_into().unwrap()))
        .collect()
}
```

**Python:**
```python
import struct
n = len(blob) // 4
embedding = list(struct.unpack(f'<{n}f', blob))
```

### 4.4 Validation Rules

Implementations **MUST** validate blobs before writing and after reading:

1. **Alignment:** `len(blob) % 4 == 0`
2. **Dimension match:** `len(blob) / 4 == dimensions`
3. **Finite values:** All f32 values must be finite (no NaN, no Inf)

**Validation pseudocode:**
```
function validate_blob(blob, expected_dimensions):
    assert len(blob) % 4 == 0, "Blob length must be multiple of 4"
    actual_dimensions = len(blob) / 4
    assert actual_dimensions == expected_dimensions, "Dimension mismatch"
    
    values = deserialize(blob)
    for value in values:
        assert is_finite(value), "Non-finite value detected"
```
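The pseudocode above translates directly to runnable Python; a minimal sketch, raising `ValueError` on each failure mode:

```python
import math
import struct

def validate_blob(blob: bytes, expected_dimensions: int) -> None:
    """Apply the three validation rules above; raise ValueError on failure."""
    if len(blob) % 4 != 0:
        raise ValueError("Blob length must be multiple of 4")
    if len(blob) // 4 != expected_dimensions:
        raise ValueError("Dimension mismatch")
    # Deserialize as little-endian f32 and reject NaN/Inf.
    values = struct.unpack(f'<{expected_dimensions}f', blob)
    if not all(math.isfinite(v) for v in values):
        raise ValueError("Non-finite value detected")
```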

---

## 5. Write Protocol

### 5.1 Insert/Update Query

```sql
INSERT OR REPLACE INTO memory_embeddings 
    (memory_id, model, embedding, dimensions, created_at) 
VALUES (?, ?, ?, ?, ?);
```

### 5.2 Parameter Binding

1. `memory_id`: String (UUID or identifier from `memories.id`)
2. `model`: String (following §3 naming convention)
3. `embedding`: BLOB (validated per §4.4)
4. `dimensions`: Integer (must equal `len(embedding) / 4`)
5. `created_at`: String (ISO-8601 UTC, e.g., `2026-04-02T05:26:34.123Z`)

### 5.3 Semantics

- `INSERT OR REPLACE` is correct: re-embedding the same (memory_id, model) pair updates the row
- Each write **MUST** validate the blob before insertion
- The `created_at` timestamp reflects the embedding generation time, not the memory creation time

### 5.4 Example

```rust
// Rust example
let timestamp = Utc::now().to_rfc3339();
let blob = serialize_embedding(&vector); // Per §4.2

db.execute(
    "INSERT OR REPLACE INTO memory_embeddings VALUES (?, ?, ?, ?, ?)",
    params![memory_id, "ollama/nomic-embed-text", blob, 768, timestamp]
)?;
```

---

## 6. Read Protocol

### 6.1 Single Memory Query

```sql
SELECT embedding, dimensions 
FROM memory_embeddings 
WHERE memory_id = ? AND model = ?;
```

Returns the embedding for a specific memory using a specific model.

### 6.2 All Embeddings for Model

```sql
SELECT memory_id, embedding 
FROM memory_embeddings 
WHERE model = ?;
```

Returns all embeddings generated with a specific model (used for recall).

### 6.3 Post-Read Validation

After reading, implementations **MUST** validate:

```
assert len(blob) == dimensions * 4
```

### 6.4 Cross-Model Comparison

**UNDEFINED BEHAVIOR:** Never compute cosine similarity between vectors from different models.

Implementations **MUST** filter by model before similarity computations.

---

## 7. Recall Protocol

### 7.1 Query Embedding

1. Embed the query text using the **currently configured model**
2. Retrieve all embeddings for that model only

### 7.2 Similarity Computation

```sql
SELECT memory_id, embedding 
FROM memory_embeddings 
WHERE model = ?;  -- Current model only
```

3. Deserialize each embedding
4. Compute cosine similarity against the query vector
5. Rank by similarity score
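Steps 3–5, together with the model-scoped query, can be sketched as a single helper. This assumes a `sqlite3`-style connection; the function names and the `top_k` parameter are illustrative, not part of the protocol:

```python
import math
import struct

def cosine_similarity(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def recall(conn, query_vector: list[float], model: str, top_k: int = 10):
    """Rank memories by similarity to the query, current model only."""
    rows = conn.execute(
        "SELECT memory_id, embedding FROM memory_embeddings WHERE model = ?",
        (model,),
    ).fetchall()
    scored = []
    for memory_id, blob in rows:
        # Deserialize per §4.3; production code should also validate per §4.4.
        vector = list(struct.unpack(f'<{len(blob) // 4}f', blob))
        scored.append((memory_id, cosine_similarity(query_vector, vector)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Filtering in SQL by `model` ensures the cross-model comparison prohibited in §6.4 can never occur inside the ranking loop.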

### 7.3 Coverage Check

Before recall, implementations **SHOULD** check embedding coverage:

```sql
SELECT 
    (SELECT COUNT(*) FROM memory_embeddings WHERE model = ?) AS embedded_count,
    (SELECT COUNT(*) FROM memories) AS total_count;
```

**Warning condition:** If `embedded_count / total_count < 0.5`, log:
```
WARNING: Only {percentage}% of memories have embeddings for model {model}.
Consider running backfill to improve recall quality.
```
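A coverage check wrapping this query might look like the following sketch; `print` stands in for the implementation's logger, and the function name is illustrative:

```python
def check_coverage(conn, model: str) -> float:
    """Return the fraction of memories embedded for `model`; warn below 0.5."""
    embedded, total = conn.execute(
        "SELECT (SELECT COUNT(*) FROM memory_embeddings WHERE model = ?), "
        "(SELECT COUNT(*) FROM memories)",
        (model,),
    ).fetchone()
    coverage = embedded / total if total else 1.0
    if coverage < 0.5:
        print(f"WARNING: Only {coverage:.0%} of memories have embeddings "
              f"for model {model}. Consider running backfill.")
    return coverage
```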

### 7.4 Empty Result Handling

If no embeddings exist for the current model, return empty results (do not fall back to other models).

---

## 8. Migration Protocol (v1 → v2)

### 8.1 Detection

Detect v1 schema by inspecting table structure:
- Primary key is only `(memory_id)`
- May have `embedding` column of type TEXT

### 8.2 Migration Steps

```sql
-- Step 1: Create new table
CREATE TABLE memory_embeddings_v2 (
    memory_id   TEXT NOT NULL REFERENCES memories(id) ON DELETE CASCADE,
    model       TEXT NOT NULL,
    embedding   BLOB NOT NULL,
    dimensions  INTEGER NOT NULL,
    created_at  TEXT NOT NULL,
    PRIMARY KEY (memory_id, model)
);

-- Step 2: Migrate BLOB rows (if model column exists)
INSERT INTO memory_embeddings_v2 
    (memory_id, model, embedding, dimensions, created_at)
SELECT memory_id, model, embedding, dimensions, created_at 
FROM memory_embeddings 
WHERE typeof(embedding) = 'blob';

-- Step 3: Handle TEXT/JSON rows
-- (Implementation-specific: parse JSON, convert to BLOB)

-- Step 4: Handle missing metadata
-- For rows without model: use 'unknown/legacy'
-- For rows without dimensions: infer from blob size

-- Step 5: Replace old table
DROP TABLE memory_embeddings;
ALTER TABLE memory_embeddings_v2 RENAME TO memory_embeddings;

-- Step 6: Create index
CREATE INDEX idx_embeddings_model ON memory_embeddings(model);
```

### 8.3 TEXT to BLOB Conversion

**Python example:**
```python
import json
import struct

# Read TEXT embedding
text_embedding = row['embedding']
vector = json.loads(text_embedding)  # Parse JSON array

# Convert to BLOB
blob = struct.pack(f'<{len(vector)}f', *vector)
dimensions = len(vector)

# Insert into v2 table
cursor.execute(
    "INSERT INTO memory_embeddings_v2 VALUES (?, ?, ?, ?, ?)",
    (memory_id, model or 'unknown/legacy', blob, dimensions, created_at)
)
```

### 8.4 Version Tracking

After migration, update protocol version:

```sql
CREATE TABLE IF NOT EXISTS engram_meta (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL
);

INSERT OR REPLACE INTO engram_meta (key, value) 
VALUES ('embedding_protocol_version', '2');
```

---

## 9. Model Switching Protocol

### 9.1 Principles

- **New memories:** Use currently configured model
- **Old memories:** Retain original model embeddings (immutable)
- **Multi-model storage:** Each memory can have embeddings from multiple models

### 9.2 Configuration Change

When the active model changes from `model_A` to `model_B`:

1. **Immediate effect:** New memories are embedded with `model_B`
2. **Existing embeddings:** Remain unchanged (both `model_A` and `model_B` rows can coexist)
3. **Recall:** Uses only `model_B` embeddings

### 9.3 Backfill Process

To improve recall coverage after model switch:

```python
# Query memories without embeddings for new model
missing_memories = db.execute("""
    SELECT id, content 
    FROM memories 
    WHERE id NOT IN (
        SELECT memory_id 
        FROM memory_embeddings 
        WHERE model = ?
    )
""", [new_model]).fetchall()

# Embed and insert
for memory in missing_memories:
    vector = embed(memory.content, new_model)
    blob = serialize(vector)
    db.execute(
        "INSERT INTO memory_embeddings VALUES (?, ?, ?, ?, ?)",
        [memory.id, new_model, blob, len(vector), now()]
    )
```

### 9.4 Cleanup (Optional)

After full re-embedding, old model embeddings can be removed:

```sql
-- WARNING: This is irreversible
DELETE FROM memory_embeddings WHERE model = ?;  -- Old model
```

**Recommendation:** Retain old embeddings for audit/rollback purposes unless storage is constrained.

---

## 10. Reference Implementations

### 10.1 Rust

**Serialization:**
```rust
use std::convert::TryInto;

fn serialize_embedding(embedding: &[f32]) -> Vec<u8> {
    embedding.iter()
        .flat_map(|f| f.to_le_bytes())
        .collect()
}
```

**Deserialization:**
```rust
fn deserialize_embedding(bytes: &[u8]) -> Result<Vec<f32>, String> {
    if bytes.len() % 4 != 0 {
        return Err("Invalid blob length".to_string());
    }
    
    let vec: Vec<f32> = bytes
        .chunks_exact(4)
        .map(|chunk| f32::from_le_bytes(chunk.try_into().unwrap()))
        .collect();
    
    // Validate finite
    if vec.iter().any(|f| !f.is_finite()) {
        return Err("Non-finite value in embedding".to_string());
    }
    
    Ok(vec)
}
```

**Write:**
```rust
use chrono::Utc;
use rusqlite::{params, Connection, Result};

fn store_embedding(
    conn: &Connection,
    memory_id: &str,
    model: &str,
    embedding: &[f32],
) -> Result<()> {
    let blob = serialize_embedding(embedding);
    let dimensions = embedding.len() as i64;
    let created_at = Utc::now().to_rfc3339();
    
    conn.execute(
        "INSERT OR REPLACE INTO memory_embeddings VALUES (?, ?, ?, ?, ?)",
        params![memory_id, model, blob, dimensions, created_at],
    )?;
    
    Ok(())
}
```

### 10.2 Python

**Serialization:**
```python
import struct

def serialize_embedding(embedding: list[float]) -> bytes:
    """Convert float list to little-endian f32 BLOB."""
    return struct.pack(f'<{len(embedding)}f', *embedding)
```

**Deserialization:**
```python
import struct
import math

def deserialize_embedding(blob: bytes) -> list[float]:
    """Convert BLOB to float list with validation."""
    if len(blob) % 4 != 0:
        raise ValueError("Invalid blob length")
    
    n = len(blob) // 4
    embedding = list(struct.unpack(f'<{n}f', blob))
    
    # Validate finite
    if not all(math.isfinite(v) for v in embedding):
        raise ValueError("Non-finite value in embedding")
    
    return embedding
```

**Write:**
```python
from datetime import datetime, timezone

def store_embedding(
    conn,
    memory_id: str,
    model: str,
    embedding: list[float]
) -> None:
    blob = serialize_embedding(embedding)
    dimensions = len(embedding)
    created_at = datetime.now(timezone.utc).isoformat()
    
    conn.execute(
        "INSERT OR REPLACE INTO memory_embeddings VALUES (?, ?, ?, ?, ?)",
        (memory_id, model, blob, dimensions, created_at)
    )
    conn.commit()
```

---

## 11. Versioning and Compatibility

### 11.1 Current Version

**Protocol Version:** 2  
**Metadata Storage:**

```sql
CREATE TABLE IF NOT EXISTS engram_meta (
    key TEXT PRIMARY KEY,
    value TEXT NOT NULL
);

INSERT OR REPLACE INTO engram_meta (key, value) 
VALUES ('embedding_protocol_version', '2');
```

### 11.2 Version Detection

Implementations **MUST** check protocol version on initialization:

```sql
SELECT value FROM engram_meta WHERE key = 'embedding_protocol_version';
```

- If missing or `< 2`: Trigger migration (see §8)
- If `> 2`: Log warning about newer protocol version
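An initialization check following these rules might look like this sketch. It assumes `sqlite3`; the `migrate` callable stands in for the §8 migration (passed in rather than invented here), and `print` stands in for the implementation's logger:

```python
import sqlite3

def check_protocol_version(conn: sqlite3.Connection, migrate) -> int:
    """Read the stored protocol version; run `migrate` if pre-v2.

    Returns the version found (0 if engram_meta is missing entirely).
    """
    try:
        row = conn.execute(
            "SELECT value FROM engram_meta "
            "WHERE key = 'embedding_protocol_version'"
        ).fetchone()
    except sqlite3.OperationalError:  # engram_meta missing: pre-v2 database
        row = None
    version = int(row[0]) if row else 0
    if version < 2:
        migrate(conn)
    elif version > 2:
        print(f"WARNING: database reports newer protocol version {version}")
    return version
```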

### 11.3 Future Changes

Future protocol updates will:
1. Increment version number
2. Document breaking changes
3. Provide automated migration paths
4. Maintain backward-read compatibility where possible

---

## 12. Error Handling

### 12.1 Validation Failures

| Error | Cause | Action |
|-------|-------|--------|
| `BLOB_LENGTH_INVALID` | `len(blob) % 4 != 0` | Reject write, log error |
| `DIMENSION_MISMATCH` | `len(blob) / 4 != dimensions` | Reject write, log error |
| `NON_FINITE_VALUE` | NaN or Inf detected | Reject write, log error |
| `MODEL_NAME_INVALID` | Missing `/` separator | Reject write, suggest format |

### 12.2 Migration Failures

If migration encounters unrecoverable data:
1. Log detailed error with `memory_id`
2. Skip corrupted row
3. Continue migration
4. Report summary of skipped rows

### 12.3 Read Failures

If blob validation fails on read:
1. Log error with `memory_id` and `model`
2. Return error (do not silently skip)
3. Consider database corruption

---

## 13. Performance Considerations

### 13.1 Indexing

The `idx_embeddings_model` index is **critical** for recall performance:
- Enables O(log n) lookup of a model's rows instead of a full table scan
- Required for efficient backfill queries

### 13.2 Blob Size

Typical embedding sizes:
- 384D (MiniLM): 1.5 KB per embedding
- 768D (Nomic): 3.1 KB per embedding
- 1536D (OpenAI): 6.1 KB per embedding

For 10,000 memories with 768D embeddings: ~31 MB storage.

### 13.3 Batch Operations

For backfill, use transactions:

```python
with conn:  # Transaction context
    for memory in batch:
        store_embedding(conn, memory.id, model, vector)
```

Batch size recommendation: 100-1000 embeddings per transaction.

---

## 14. Security Considerations

### 14.1 Model String Injection

Model strings are user-controlled. Implementations **MUST**:
- Use parameterized queries (never string concatenation)
- Validate format (contains exactly one `/`)
- Limit length (e.g., 256 characters)

### 14.2 Blob Validation

Always validate blobs before deserialization to prevent:
- Buffer overflows (Rust: use `chunks_exact`)
- Memory exhaustion (check dimensions are reasonable)
- NaN/Inf propagation in similarity computations

---

## 15. Compliance

### 15.1 Implementation Checklist

- [ ] Schema matches §2 exactly
- [ ] Model naming follows §3 convention
- [ ] BLOB serialization per §4.2
- [ ] BLOB validation per §4.4
- [ ] Write protocol per §5
- [ ] Read protocol per §6
- [ ] Recall filters by model per §7
- [ ] Migration from v1 supported per §8
- [ ] Version tracking in `engram_meta`
- [ ] Reference implementation tests pass

### 15.2 Testing Requirements

Implementations **MUST** pass:
1. Round-trip serialization (serialize → deserialize → compare)
2. Cross-implementation compatibility (Rust ↔ Python)
3. Validation rejection (NaN, Inf, wrong dimensions)
4. Multi-model storage (same memory, different models)
5. Migration from v1 schema
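Requirement 1 (round-trip serialization) can be expressed as a short self-contained check; the serializers are restated inline so the test stands alone, and the comparison allows for f32 precision loss (a Python `float` is binary64, so values like 0.1 change slightly on the trip through binary32):

```python
import math
import struct

def serialize_embedding(embedding: list[float]) -> bytes:
    return struct.pack(f'<{len(embedding)}f', *embedding)

def deserialize_embedding(blob: bytes) -> list[float]:
    return list(struct.unpack(f'<{len(blob) // 4}f', blob))

def test_round_trip() -> None:
    original = [0.1, -2.5, 3.75, 0.0]
    restored = deserialize_embedding(serialize_embedding(original))
    assert len(restored) == len(original)
    # Compare within f32 precision, not exact equality.
    assert all(math.isclose(a, b, rel_tol=1e-6, abs_tol=1e-6)
               for a, b in zip(original, restored))
```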

---

## 16. Changelog

### Version 2.0 (2026-04-02)
- Initial formal specification
- Multi-model support via composite primary key
- Binary-only BLOB format requirement
- Structured model naming convention
- Migration protocol from v1

---

## 17. Authors and Governance

**Maintainer:** Engram Core Team  
**License:** MIT  
**Feedback:** Submit issues to engram-ai-rust repository

All implementations are expected to contribute test cases demonstrating compliance with this specification.

---

**End of Specification**