tldr-core 0.1.6

Core analysis engine for TLDR code analysis tool
# Premortem Analysis - Pass 3: Subtle Failure Modes

**Generated:** 2026-02-03
**Focus:** Silent failures, stale data, resource leaks, scaling failures, embedding quality

---

## 1. Silent Failures

### 1.1 Embedding Model Silently Returns Garbage on Truncation

**Scenario:** When text exceeds `max_context()` (512 tokens for most models, 8192 for ArcticMLong), the model truncates without warning. User searches for "authentication flow in the complex_auth_handler function" but the function body is 2000 tokens. Only the first 512 tokens are embedded, missing the critical auth logic at the end.

**Likelihood:** HIGH
**Impact:** CRITICAL - Users get irrelevant results and don't know why

**Mitigation:**
```rust
// In embedder.rs
pub fn embed_text(&self, text: &str) -> TldrResult<EmbeddingResult> {
    let token_count = estimate_tokens(text);
    let truncated = token_count > self.model_type.max_context();
    
    let embedding = self.model.embed(text)?;
    
    Ok(EmbeddingResult {
        embedding,
        truncated,
        original_tokens: token_count,
        embedded_tokens: token_count.min(self.model_type.max_context()),
    })
}

// In SemanticSearchReport, add:
pub truncated_chunks: usize,  // Number of chunks that were truncated
pub truncation_warning: Option<String>,  // "15 chunks exceeded 512 token limit"
```
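The mitigation above calls `estimate_tokens` without defining it. A minimal sketch of what that helper and the truncation bookkeeping could look like, assuming a rough heuristic of about 4 bytes of source text per token (an assumption; only the model's own tokenizer gives exact counts). `TruncationInfo`, `check_truncation`, and `truncation_warning` are hypothetical names for illustration.

```rust
/// Rough token estimate: assumes about 4 bytes of text per token.
/// This is a heuristic stand-in, not the model's tokenizer.
pub fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

/// Hypothetical per-chunk truncation record.
#[derive(Debug, PartialEq)]
pub struct TruncationInfo {
    pub truncated: bool,
    pub original_tokens: usize,
    pub embedded_tokens: usize,
}

/// Compares the estimated token count against the model's context limit.
pub fn check_truncation(text: &str, max_context: usize) -> TruncationInfo {
    let original_tokens = estimate_tokens(text);
    TruncationInfo {
        truncated: original_tokens > max_context,
        original_tokens,
        embedded_tokens: original_tokens.min(max_context),
    }
}

/// Builds the report-level warning, or None when no chunk was truncated.
pub fn truncation_warning(infos: &[TruncationInfo], max_context: usize) -> Option<String> {
    let n = infos.iter().filter(|i| i.truncated).count();
    if n == 0 {
        None
    } else {
        Some(format!("{} chunks exceeded {} token limit", n, max_context))
    }
}
```

Because the estimate is approximate, a chunk near the limit may be flagged or missed incorrectly; a conservative margin below `max_context` may be worth considering.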

### 1.2 Hash Collision in Cache Leading to Wrong Embeddings

**Scenario:** Cache key uses MD5/SHA256 of `file_path + content_hash + model`. If two functions have identical content but different names (copy-paste code), they share cache entries. When one is updated, the other still serves stale embedding.

**Likelihood:** MEDIUM
**Impact:** MAJOR - Similarity search returns wrong results

**Mitigation:**
```rust
// Cache key must include function identity, not just content
fn cache_key(chunk: &CodeChunk, model: EmbeddingModel) -> String {
    let mut hasher = Sha256::new();
    hasher.update(chunk.file_path.to_string_lossy().as_bytes());
    hasher.update(chunk.function_name.as_deref().unwrap_or("").as_bytes());
    hasher.update(chunk.class_name.as_deref().unwrap_or("").as_bytes());
    hasher.update(&chunk.line_start.to_le_bytes());
    hasher.update(chunk.content_hash.as_bytes());
    hasher.update(model.as_str().as_bytes());
    format!("sha256:{}", hex::encode(hasher.finalize()))
}
```
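One further detail worth considering: feeding several variable-length fields into a hasher back to back leaves the field boundaries ambiguous, so different field tuples can produce the same byte stream. The sketch below shows the problem and a length-prefix fix on raw bytes. The function names are hypothetical, and the resulting bytes would then be passed to whichever hash the cache uses.

```rust
/// Concatenates fields with no separators (the ambiguous form).
pub fn naive_key_bytes(fields: &[&str]) -> Vec<u8> {
    fields.iter().flat_map(|f| f.bytes()).collect()
}

/// Prefixes each field with its length as 8 little-endian bytes,
/// so no two distinct field tuples map to the same byte stream.
pub fn framed_key_bytes(fields: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    for f in fields {
        out.extend_from_slice(&(f.len() as u64).to_le_bytes());
        out.extend_from_slice(f.as_bytes());
    }
    out
}
```

With the naive form, `["src/a.rs", "parse", "r"]` and `["src/a.rs", "pars", "er"]` yield identical bytes; the framed form keeps them distinct.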

### 1.3 Parse Errors Silently Skipped Without Aggregation

**Scenario:** When chunking 1000 files, 50 have syntax errors and are silently skipped. User doesn't know 5% of their codebase is not indexed. They search for a function in a broken file and find nothing.

**Likelihood:** HIGH
**Impact:** MAJOR - Silent data loss

**Mitigation:**
```rust
// In chunk_code() return type, include skipped files
pub struct ChunkResult {
    pub chunks: Vec<CodeChunk>,
    pub skipped: Vec<SkippedFile>,
    pub stats: ChunkStats,
}

pub struct SkippedFile {
    pub path: PathBuf,
    pub reason: SkipReason,
}

pub enum SkipReason {
    ParseError(String),
    BinaryFile,
    UnsupportedLanguage(String),
    TooLarge(usize),
}

// CLI should warn: "Warning: 50 files skipped due to parse errors. Use --verbose for details."
```
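To turn the skipped-file list into the one-line CLI warning, the reasons can be grouped and counted. A minimal sketch, assuming the `SkipReason` enum above; the `label` and `skip_summary` helpers and the exact message wording are hypothetical.

```rust
use std::collections::BTreeMap;

#[derive(Debug)]
pub enum SkipReason {
    ParseError(String),
    BinaryFile,
    UnsupportedLanguage(String),
    TooLarge(usize),
}

impl SkipReason {
    /// Short category label used for grouping.
    fn label(&self) -> &'static str {
        match self {
            SkipReason::ParseError(_) => "parse error",
            SkipReason::BinaryFile => "binary file",
            SkipReason::UnsupportedLanguage(_) => "unsupported language",
            SkipReason::TooLarge(_) => "too large",
        }
    }
}

/// Returns None when nothing was skipped, otherwise a one-line warning
/// with per-category counts (categories in alphabetical order).
pub fn skip_summary(reasons: &[SkipReason]) -> Option<String> {
    if reasons.is_empty() {
        return None;
    }
    let mut counts: BTreeMap<&'static str, usize> = BTreeMap::new();
    for r in reasons {
        *counts.entry(r.label()).or_insert(0) += 1;
    }
    let parts: Vec<String> = counts.iter().map(|(k, v)| format!("{} {}", v, k)).collect();
    Some(format!(
        "Warning: {} files skipped ({}). Use --verbose for details.",
        reasons.len(),
        parts.join(", ")
    ))
}
```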

### 1.4 Cosine Similarity Silently Returns NaN for Zero Vectors

**Scenario:** Empty functions or whitespace-only content produce zero vectors. `cosine_similarity(zero, any)` involves division by zero, returning NaN. This NaN propagates through `top_k_similar`, corrupting results.

**Likelihood:** MEDIUM
**Impact:** MAJOR - Entire search returns NaN scores

**Mitigation:**
```rust
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f64 {
    assert_eq!(a.len(), b.len(), "Vectors must have same length");
    
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    
    // Handle zero vectors explicitly
    if norm_a < 1e-9 || norm_b < 1e-9 {
        return 0.0;  // Zero vector has 0 similarity with everything
    }
    
    (dot / (norm_a * norm_b)) as f64
}
```
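As a second line of defense, the ranking step can refuse to propagate non-finite scores even if one slips through. This is a sketch under the assumption that ranking works on a slice of precomputed scores; `top_k` is a hypothetical helper, not the signature of `top_k_similar`.

```rust
/// Ranks (index, score) pairs in descending score order, dropping NaN and
/// infinite scores, and keeps at most `k` results.
pub fn top_k(scores: &[f64], k: usize) -> Vec<(usize, f64)> {
    let mut ranked: Vec<(usize, f64)> = scores
        .iter()
        .copied()
        .enumerate()
        .filter(|(_, s)| s.is_finite())
        .collect();
    // total_cmp gives a total order on f64, so the sort cannot panic or misorder
    ranked.sort_by(|a, b| b.1.total_cmp(&a.1));
    ranked.truncate(k);
    ranked
}
```

Filtering before sorting means a single bad score costs one result rather than the whole ranking.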

---

## 2. Stale Data

### 2.1 Cache TTL Not Checked on Read

**Scenario:** Cache has `ttl_days: 30` but `get()` never checks timestamps. 6-month-old embeddings are served. Meanwhile, the embedding model was updated (ArcticM v1 -> v2), producing incompatible vectors. Old and new embeddings are compared, yielding meaningless scores.

**Likelihood:** HIGH
**Impact:** CRITICAL - Completely wrong search results

**Mitigation:**
```rust
impl EmbeddingCache {
    pub fn get(&self, chunk: &CodeChunk, model: EmbeddingModel) -> Option<Vec<f32>> {
        let key = cache_key(chunk, model);
        let entry = self.entries.get(&key)?;
        
        // Check TTL
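        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0);  // Clock set before the epoch: treat as 0 rather than panic
        // saturating_sub avoids u64 underflow if cached_at is in the future (clock skew)
        let age_days = now.saturating_sub(entry.cached_at) / 86400;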
        if age_days > self.config.ttl_days as u64 {
            return None;  // Stale, force re-embed
        }
        
        // Check model version (must match exactly)
        if entry.model != model || entry.model_version != model.version() {
            return None;
        }
        
        Some(entry.embedding.clone())
    }
}

// Add version tracking to EmbeddingModel
impl EmbeddingModel {
    pub fn version(&self) -> &'static str {
        // Track fastembed model versions
        match self {
            Self::ArcticM => "5.8.0",  // Update when fastembed updates
            // ...
        }
    }
}
```
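The age arithmetic can be isolated into a pure function so the boundary cases are easy to test. A minimal sketch, assuming timestamps are stored as seconds since the Unix epoch; `is_expired` is a hypothetical helper.

```rust
const SECS_PER_DAY: u64 = 86_400;

/// True when the entry is older than `ttl_days` whole days.
/// A `cached_at` in the future (clock skew) counts as age zero
/// instead of underflowing.
pub fn is_expired(cached_at: u64, now: u64, ttl_days: u64) -> bool {
    let age_days = now.saturating_sub(cached_at) / SECS_PER_DAY;
    age_days > ttl_days
}
```

Note that with whole-day truncation an entry exactly `ttl_days` old is still served; it expires once a further full day has passed.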

### 2.2 File Modification Not Detected Between Runs

**Scenario:** User runs `tldr embed src/`, gets cached embeddings. They modify `src/config.rs`, then run `tldr semantic "config parser"`. The search uses stale embeddings because only `content_hash` is checked, not file modification time.

**Likelihood:** HIGH (common workflow)
**Impact:** MAJOR - Search misses recent changes

**Mitigation:**
```rust
// In chunker.rs, include mtime in chunk
pub struct CodeChunk {
    // ... existing fields
    pub file_mtime: u64,  // File modification time
}

// In cache.rs, validate mtime
impl EmbeddingCache {
    pub fn get(&self, chunk: &CodeChunk, model: EmbeddingModel) -> Option<Vec<f32>> {
        let key = cache_key(chunk, model);
        let entry = self.entries.get(&key)?;
        
        // If content hash differs OR file is newer, invalidate
        if entry.content_hash != chunk.content_hash {
            return None;
        }
        if entry.file_mtime < chunk.file_mtime {
            // File was modified, content hash might be stale
            // (edge case: file touched but not changed)
            return None;
        }
        
        Some(entry.embedding.clone())
    }
}
```
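The "file touched but not changed" edge case noted above does not have to force a re-embed if the content hash is recomputed from the current file contents. A sketch of that decision as a pure function, assuming `current_hash` is freshly computed; `CacheDecision` and `decide` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub enum CacheDecision {
    /// Content and mtime both match: serve the cached embedding.
    Hit,
    /// mtime advanced but content is identical: serve the cached
    /// embedding and update the stored mtime.
    HitRefreshMtime,
    /// Content changed: re-embed.
    Miss,
}

/// Decides how to treat a cache entry given the file's current state.
pub fn decide(
    entry_hash: &str,
    entry_mtime: u64,
    current_hash: &str,
    current_mtime: u64,
) -> CacheDecision {
    if entry_hash != current_hash {
        CacheDecision::Miss
    } else if current_mtime > entry_mtime {
        CacheDecision::HitRefreshMtime
    } else {
        CacheDecision::Hit
    }
}
```

The trade-off is one extra hash of the file on each touched-file lookup, which is typically far cheaper than recomputing an embedding.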

### 2.3 Index Not Invalidated When Underlying Files Change

**Scenario:** User builds `SemanticIndex`, keeps it in memory, edits files, then searches. Index has stale embeddings because it was built from a snapshot.

**Likelihood:** MEDIUM (programmatic use)
**Impact:** MINOR - Mostly affects library users, not CLI

**Mitigation:**
```rust
impl SemanticIndex {
    /// Check if any indexed file has changed since index was built
    pub fn is_stale(&self) -> bool {
        for chunk in &self.chunks {
            let path = self.root.join(&chunk.chunk.file_path);
            if let Ok(meta) = fs::metadata(&path) {
                if let Ok(mtime) = meta.modified() {
                    let file_mtime = mtime.duration_since(UNIX_EPOCH).unwrap().as_secs();
                    if file_mtime > chunk.embedded_at {
                        return true;
                    }
                }
            }
        }
        false
    }
    
    /// Rebuild only changed files
    pub fn refresh(&mut self, cache: Option<&mut EmbeddingCache>) -> TldrResult<RefreshStats>;
}
```
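For `refresh` to rebuild only what changed, it needs the set of affected files rather than a single boolean. A minimal sketch with the filesystem lookup passed in as a closure so it can be tested without touching disk. `IndexedChunk` and `stale_files` are hypothetical, and treating a missing file as changed is a design choice of this sketch (the `is_stale` code above skips unreadable files instead).

```rust
use std::collections::BTreeSet;

/// Hypothetical minimal view of an indexed chunk.
pub struct IndexedChunk {
    pub file_path: String,
    pub embedded_at: u64,
}

/// Returns the distinct files whose current mtime is newer than the time a
/// chunk from that file was embedded. `current_mtime` returns None when the
/// file no longer exists, which this sketch treats as changed.
pub fn stale_files<F>(chunks: &[IndexedChunk], current_mtime: F) -> BTreeSet<String>
where
    F: Fn(&str) -> Option<u64>,
{
    chunks
        .iter()
        .filter(|c| match current_mtime(&c.file_path) {
            Some(m) => m > c.embedded_at,
            None => true,
        })
        .map(|c| c.file_path.clone())
        .collect()
}
```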

---

## 3. Resource Leaks

### 3.1 Embedder Model Not Released After Use

**Scenario:** Each `Embedder::new()` loads a 110MB model into memory. CLI creates a new Embedder per command invocation. In long-running processes or repeated calls, memory grows unbounded.

**Likelihood:** MEDIUM
**Impact:** MAJOR - OOM in server/daemon modes

**Mitigation:**
```rust
// Use lazy_static or OnceCell for model singleton
use once_cell::sync::OnceCell;

static EMBEDDER: OnceCell<Embedder> = OnceCell::new();

impl Embedder {
    /// Get or initialize the global embedder.
    /// Note: after the first successful call, the `model` argument is ignored.
    pub fn global(model: EmbeddingModel) -> TldrResult<&'static Embedder> {
        EMBEDDER.get_or_try_init(|| Embedder::new(model))
    }
    
    /// Clear the global embedder (for testing or model switching)
    pub fn clear_global() {
        // Note: OnceCell doesn't support this; use parking_lot::RwLock<Option<>>
    }
}

// Alternative: Add Drop impl with explicit cleanup
impl Drop for Embedder {
    fn drop(&mut self) {
        // fastembed should handle this, but verify
        tracing::debug!("Embedder dropped, model released");
    }
}
```
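Since a write-once cell cannot be cleared, one way to get a clearable, model-aware singleton with only the standard library is a `Mutex<Option<Arc<_>>>`. The sketch below uses a stand-in `Embedder` struct holding just a model name; the real type, its loading cost, and error handling are outside what this illustrates.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for the real Embedder; only the sharing logic matters here.
pub struct Embedder {
    pub model: String,
}

static GLOBAL: Mutex<Option<Arc<Embedder>>> = Mutex::new(None);

/// Returns the shared embedder, creating it on first use or when the
/// requested model differs from the one currently held.
pub fn global(model: &str) -> Arc<Embedder> {
    let mut slot = GLOBAL.lock().unwrap();
    if let Some(existing) = slot.as_ref() {
        if existing.model == model {
            return Arc::clone(existing);
        }
    }
    let fresh = Arc::new(Embedder { model: model.to_string() });
    *slot = Some(Arc::clone(&fresh));
    fresh
}

/// Drops the shared instance. The memory is freed once every outstanding
/// Arc handed out earlier has also been dropped.
pub fn clear_global() {
    *GLOBAL.lock().unwrap() = None;
}
```

Holding the lock during construction serializes first-use loading, which is usually the desired behavior for a large model but does block other callers while it loads.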

### 3.2 Cache File Left Locked on Panic

**Scenario:** `EmbeddingCache::flush()` acquires file lock, starts writing, then panics mid-write (e.g., disk full). Lock is never released. Future runs hang waiting for lock.

**Likelihood:** LOW
**Impact:** MAJOR - Complete system hang

**Mitigation:**
```rust
impl EmbeddingCache {
    pub fn flush(&mut self) -> TldrResult<()> {
        if !self.dirty {
            return Ok(());
        }
        
        // Use temp file + atomic rename
        let temp_path = self.path.with_extension("tmp");
        
        // Lock scope ensures release even on panic
        {
            let file = File::create(&temp_path)?;
            let _lock = file.try_lock_exclusive()
                .map_err(|_| TldrError::CacheLocked)?;
            
            serde_json::to_writer(&file, &self.as_cache_file())?;
            file.sync_all()?;
        }  // Lock released here
        
        // Atomic rename (safe even if we panic after)
        fs::rename(&temp_path, &self.path)?;
        self.dirty = false;
        Ok(())
    }
}
```
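The temp-file-plus-rename part of this mitigation can be shown with the standard library alone. A minimal sketch without the advisory lock, assuming the temp file sits next to the target so both are on the same filesystem; `write_atomic` is a hypothetical helper.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `bytes` to `path` via a sibling temp file followed by a rename.
/// Readers see either the previous file or the complete new one.
pub fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    let temp_path = path.with_extension("tmp");
    {
        let mut file = File::create(&temp_path)?;
        file.write_all(bytes)?;
        file.sync_all()?; // flush to disk before the rename makes it visible
    } // file handle closed here
    fs::rename(&temp_path, path)
}
```

If the process dies before the rename, the target is untouched and only a stray `.tmp` file remains, which a startup sweep can remove.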

### 3.3 Temporary Files Not Cleaned Up on Error

**Scenario:** `chunk_code()` writes intermediate results to temp files. If embedding fails mid-batch, temp files remain in `/tmp/tldr-embed-*`.

**Likelihood:** LOW
**Impact:** MINOR - Disk space leak over time

**Mitigation:**
```rust
// Use tempfile crate with auto-cleanup
use tempfile::NamedTempFile;

fn process_batch(chunks: &[CodeChunk]) -> TldrResult<Vec<EmbeddedChunk>> {
    // NamedTempFile automatically deletes on drop
    let temp = NamedTempFile::new()?;
    
    // Even if we return Err, temp is cleaned up
    // ...
}

// Add cleanup on startup
impl EmbeddingCache {
    pub fn open(config: CacheConfig) -> TldrResult<Self> {
        // Clean up orphaned temp files
        for entry in fs::read_dir(&config.cache_dir)? {
            let entry = entry?;
            if entry.path().extension() == Some("tmp".as_ref()) {
                let _ = fs::remove_file(entry.path());
            }
        }
        // ...
    }
}
```
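The cleanup-on-drop behavior can also be sketched with the standard library only, which makes the mechanism explicit. `TempGuard` and `failing_batch` are hypothetical names for illustration.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Removes the wrapped path when dropped, including on early return via `?`.
pub struct TempGuard {
    path: PathBuf,
}

impl TempGuard {
    /// Creates an empty file at `path` and takes responsibility for deleting it.
    pub fn create(path: PathBuf) -> io::Result<Self> {
        fs::write(&path, b"")?;
        Ok(Self { path })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TempGuard {
    fn drop(&mut self) {
        // Best effort: ignore errors such as the file already being gone
        let _ = fs::remove_file(&self.path);
    }
}

/// Hypothetical batch step that fails after writing partial results.
pub fn failing_batch(temp_path: PathBuf) -> io::Result<()> {
    let guard = TempGuard::create(temp_path)?;
    fs::write(guard.path(), b"partial results")?;
    Err(io::Error::new(io::ErrorKind::Other, "embedding failed mid-batch"))
}
```

Drop-based cleanup does not run if the process is killed or aborts, which is why the startup sweep for orphaned files is still useful.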

---

## 4. Scaling Failures

### 4.1 Linear Scan Becomes Unusable at Scale

**Scenario:** At 100K functions, each search does 100K * 768 = 76.8M float operations. At 1M functions, search takes 10+ seconds.

**Likelihood:** HIGH (inevitable at scale)
**Impact:** CRITICAL - Unusable for large codebases

**Mitigation:**
```rust
// Add approximate nearest neighbor (ANN) index for large codebases
pub struct SemanticIndex {
    chunks: Vec<EmbeddedChunk>,
    
    // For small indexes: linear scan
    // For large indexes: HNSW graph
    ann_index: Option<HnswIndex>,
}

impl SemanticIndex {
    // Threshold for switching to ANN
    // (associated constants belong in the impl block, not the struct body)
    const ANN_THRESHOLD: usize = 10_000;

    pub fn search(&self, query: &str, options: SearchOptions) -> TldrResult<SemanticSearchReport> {
        if self.chunks.len() > Self::ANN_THRESHOLD {
            self.search_ann(query, options)
        } else {
            self.search_linear(query, options)
        }
    }
    
    fn search_ann(&self, query: &str, options: SearchOptions) -> TldrResult<SemanticSearchReport> {
        // Use HNSW for O(log n) search
        // Consider: usearch-rs, hora, hnswlib-rs
        todo!("Add ANN support for large codebases")
    }
}

// Spec addition: Performance guarantees
// - < 10K chunks: Linear scan, <100ms
// - 10K-100K chunks: HNSW, <200ms
// - > 100K chunks: Warn and suggest filtering, or use external vector DB
```
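The cost figure in the scenario and the threshold switch can be written down as two small functions. This is a sketch of the arithmetic only; `linear_scan_ops`, `SearchStrategy`, and `choose_strategy` are hypothetical names, and the threshold is passed in rather than fixed.

```rust
/// Multiply-add operations for one brute-force query:
/// one per dimension per stored chunk.
pub fn linear_scan_ops(chunks: u64, dims: u64) -> u64 {
    chunks * dims
}

#[derive(Debug, PartialEq)]
pub enum SearchStrategy {
    Linear,
    Ann,
}

/// Picks a strategy from the chunk count. Counts strictly above the
/// threshold use ANN, matching the `>` comparison in `search` above.
pub fn choose_strategy(chunks: usize, ann_threshold: usize) -> SearchStrategy {
    if chunks > ann_threshold {
        SearchStrategy::Ann
    } else {
        SearchStrategy::Linear
    }
}
```

For 100K chunks at 768 dimensions this gives 76.8M operations per query, and ten times that at 1M chunks.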

### 4.2 Memory Exhaustion Building Index

**Scenario:** `SemanticIndex::build()` loads all chunks into memory, then embeds all, then builds index. For 1M functions with 768-dim embeddings: 1M * 768 * 4 bytes = 3GB just for embeddings.

**Likelihood:** HIGH at scale
**Impact:** CRITICAL - OOM crash

**Mitigation:**
```rust
impl SemanticIndex {
    pub fn build(
        root: &Path,
        chunk_options: ChunkOptions,
        embed_options: EmbedOptions,
        cache: Option<&mut EmbeddingCache>,
    ) -> TldrResult<Self> {
        // Check estimated memory requirement
        let chunk_count = estimate_chunk_count(root, &chunk_options)?;
        let estimated_memory = chunk_count * embed_options.model.dimensions() * 4;
        
        if estimated_memory > MAX_INDEX_MEMORY {
            return Err(TldrError::IndexTooLarge {
                chunks: chunk_count,
                estimated_mb: estimated_memory / 1_000_000,
                suggestion: "Use --filter to reduce scope, or consider external vector DB".into(),
            });
        }
        
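        // Stream chunks instead of loading all at once
        let chunks_iter = chunk_code_streaming(root, chunk_options);
        
        // Embed in batches
        // (`.chunks()` on an iterator is assumed to come from itertools::Itertools)
        let mut cache = cache;  // Rebind so the Option can be reborrowed per batch
        let mut embedded = Vec::with_capacity(chunk_count);
        for batch in &chunks_iter.chunks(embed_options.batch_size) {
            let batch_embedded = embed_batch_with_cache(batch, cache.as_deref_mut())?;
            embedded.extend(batch_embedded);
        }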
        
        // ...
    }
}

const MAX_INDEX_MEMORY: usize = 500 * 1024 * 1024;  // 500MB default
```
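The memory estimate in the scenario, and the chunk count that fits a given budget, follow from simple arithmetic. A sketch counting only the raw `f32` vectors, with no allowance for chunk text or metadata; both function names are hypothetical.

```rust
/// Bytes needed for `chunks` embeddings of `dims` f32 components
/// (vectors only, no metadata).
pub fn embedding_bytes(chunks: u64, dims: u64) -> u64 {
    chunks * dims * std::mem::size_of::<f32>() as u64
}

/// Largest chunk count whose vectors fit in `budget_bytes`.
pub fn max_chunks_for(budget_bytes: u64, dims: u64) -> u64 {
    budget_bytes / (dims * std::mem::size_of::<f32>() as u64)
}
```

At 768 dimensions, 1M chunks need 3,072,000,000 bytes, while the 500 MiB default above holds about 170K chunks.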

### 4.3 Cache File Grows Unbounded

**Scenario:** Cache never evicts. After indexing 100 projects over 6 months, cache is 10GB. Parsing the JSON on startup takes 30 seconds.

**Likelihood:** HIGH
**Impact:** MAJOR - Degraded startup performance

**Mitigation:**
```rust
impl EmbeddingCache {
    pub fn open(config: CacheConfig) -> TldrResult<Self> {
        let path = config.cache_dir.join("embeddings.json");
        
        // Check cache size before loading
        if let Ok(meta) = fs::metadata(&path) {
            let size_mb = meta.len() / 1_000_000;
            if size_mb > config.max_size_mb as u64 {
                tracing::warn!(
                    "Cache exceeds max size ({} MB > {} MB), running eviction",
                    size_mb, config.max_size_mb
                );
                Self::evict_to_size(&path, config.max_size_mb)?;
            }
        }
        
        // Load with size limit
        let entries = Self::load_entries(&path, config.max_entries)?;
        
        Ok(Self { entries, dirty: false, config, path })
    }
    
    fn evict_to_size(path: &Path, max_mb: usize) -> TldrResult<()> {
        // LRU eviction: remove least recently used entries until under limit
        // ...
        todo!("Implement LRU eviction")
    }
}

// Also: Use sharded cache files by project root hash
// ~/.cache/tldr/embeddings/abc123.json (per-project)
```
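The eviction step itself can be sketched on an in-memory list of entries. This assumes each entry records a last-used timestamp and its serialized size, which the snippets above do not show; `Entry` and `evict_lru` are hypothetical and operate on a `Vec` rather than on the cache file.

```rust
/// Hypothetical cache entry metadata used for eviction decisions.
pub struct Entry {
    pub key: String,
    pub last_used: u64, // seconds since the Unix epoch
    pub size_bytes: u64,
}

/// Keeps the most recently used entries whose combined size fits in
/// `max_bytes` and returns the keys of everything evicted.
pub fn evict_lru(entries: &mut Vec<Entry>, max_bytes: u64) -> Vec<String> {
    // Most recently used first
    entries.sort_by(|a, b| b.last_used.cmp(&a.last_used));
    let mut total = 0u64;
    let mut keep = 0usize;
    for e in entries.iter() {
        if total + e.size_bytes > max_bytes {
            break;
        }
        total += e.size_bytes;
        keep += 1;
    }
    // Everything after the kept prefix is evicted
    entries.drain(keep..).map(|e| e.key).collect()
}
```

Tracking `last_used` means reads must update metadata, so the cache becomes dirty on hits as well as on writes.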

---

## 5. Embedding Quality Failures

### 5.1 Code Embeddings Don't Capture Semantic Intent

**Scenario:** User searches "function that validates email addresses". The codebase has `fn check_email_format(s: &str) -> bool` but the embedding model (trained on natural language) doesn't associate "validate" with "check" or "email addresses" with "email_format".

**Likelihood:** MEDIUM (model-dependent)
**Impact:** MAJOR - Poor search relevance

**Mitigation:**
```rust
// Augment code with natural language descriptions before embedding
fn prepare_text_for_embedding(chunk: &CodeChunk) -> String {
    let mut text = String::new();
    
    // Add function signature in natural language form
    if let Some(func) = &chunk.function_name {
        // Split camelCase/snake_case into words
        let words = split_identifier(func);
        text.push_str(&format!("Function: {}\n", words.join(" ")));
    }
    
    // Add docstring if available (already in spec: include_docs option)
    if let Some(doc) = extract_docstring(&chunk.content, chunk.language) {
        text.push_str(&format!("Description: {}\n", doc));
    }
    
    // Add the code
    text.push_str("Code:\n");
    text.push_str(&chunk.content);
    
    text
}

// Test embedding quality with gold-standard queries
#[cfg(test)]
mod embedding_quality_tests {
    #[test]
    fn semantic_similar_names_rank_higher() {
        // "validate email" should rank "check_email_format" above "process_data"
    }
    
    #[test]
    fn docstring_improves_ranking() {
        // Function with "validates email" in docstring should rank higher
    }
}
```
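The `split_identifier` helper used above is not defined in this document. A minimal sketch of one possible implementation, assuming it should split on underscores, hyphens, and lowercase-to-uppercase transitions and return lowercase words. Runs of capitals such as "HTTP" are not handled specially here.

```rust
/// Splits snake_case and camelCase identifiers into lowercase words.
pub fn split_identifier(ident: &str) -> Vec<String> {
    let mut words = Vec::new();
    let mut current = String::new();
    let mut prev_lower = false;
    for ch in ident.chars() {
        if ch == '_' || ch == '-' {
            // Separator: close the current word, if any
            if !current.is_empty() {
                words.push(std::mem::take(&mut current));
            }
            prev_lower = false;
            continue;
        }
        // A capital following a lowercase letter or digit starts a new word
        if ch.is_uppercase() && prev_lower && !current.is_empty() {
            words.push(std::mem::take(&mut current));
        }
        prev_lower = ch.is_lowercase() || ch.is_ascii_digit();
        current.extend(ch.to_lowercase());
    }
    if !current.is_empty() {
        words.push(current);
    }
    words
}
```

With this, `check_email_format` becomes the words "check email format", which gives a natural-language model more to match against a query like "validate email".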

### 5.2 Different Languages Produce Incompatible Embeddings

**Scenario:** User searches across Python and Rust code. The same algorithm in both languages produces very different embeddings because the embedding model was not trained on code. Python `def foo():` and Rust `fn foo() {}` have different surface forms.

**Likelihood:** MEDIUM
**Impact:** MINOR - Cross-language search less effective

**Mitigation:**
```rust
// Option 1: Normalize code syntax before embedding
fn normalize_code_for_embedding(content: &str, language: Language) -> String {
    // Pseudocode normalization (strip syntax-specific tokens)
    // "fn foo() -> i32 { return 42; }" -> "function foo returns integer: return 42"
    // This is complex; may not be worth it
    todo!("Consider for future enhancement")
}

// Option 2: Use code-specific embedding model
pub enum EmbeddingModel {
    // General text models
    ArcticM,
    
    // Code-specific models (future)
    CodeBERT,
    StarCoder,
    
    // Hybrid: Use code model for code, text model for queries
}

// Option 3: Document limitation and suggest language filtering
// CLI: "For best cross-language results, use --lang to filter"
```

### 5.3 Short Functions Have Poor Embeddings

**Scenario:** One-liner functions like `fn is_empty(&self) -> bool { self.len() == 0 }` produce poor embeddings because there's not enough context. Search for "check if empty" misses these.

**Likelihood:** HIGH
**Impact:** MINOR - One-liners often found by name search anyway

**Mitigation:**
```rust
// Augment short functions with context
fn prepare_text_for_embedding(chunk: &CodeChunk) -> String {
    let content_len = chunk.content.len();
    
    if content_len < 100 {  // Short function
        // Add class context if available
        let context = if let Some(class) = &chunk.class_name {
            format!("Method of class {}\n", class)
        } else {
            String::new()
        };
        
        // Add expanded description
        let expanded = expand_short_function(&chunk.content, chunk.language);
        
        format!("{}{}\nCode: {}", context, expanded, chunk.content)
    } else {
        chunk.content.clone()
    }
}

fn expand_short_function(content: &str, lang: Language) -> String {
    // "fn is_empty(&self) -> bool { self.len() == 0 }"
    // -> "Checks if empty by comparing length to zero. Returns boolean."
    // Use simple heuristics, not LLM
    todo!("Implement short function expansion")
}
```

### 5.4 Query-Document Mismatch

**Scenario:** Queries and documents are embedded identically (symmetric use). Retrieval-tuned models often expect asymmetric input, where queries and documents are encoded differently, for example with distinct prefixes. Without that, user queries like "how to authenticate" don't match the document embeddings of the actual auth code.

**Likelihood:** MEDIUM
**Impact:** MINOR - Arctic models handle this reasonably well

**Mitigation:**
```rust
// Use distinct query and document prefixes where the model calls for them
impl Embedder {
    pub fn embed_query(&self, query: &str) -> TldrResult<Vec<f32>> {
        // Prefix strings are model-specific. "query: " / "passage: " follow the
        // E5 convention; check the Snowflake Arctic model card for the exact
        // query prefix it expects before adopting these values.
        let prefixed = format!("query: {}", query);
        self.embed_text(&prefixed)
    }
    
    pub fn embed_document(&self, doc: &str) -> TldrResult<Vec<f32>> {
        // Documents get a different prefix (or none, depending on the model)
        let prefixed = format!("passage: {}", doc);
        self.embed_text(&prefixed)
    }
}

// Update all call sites:
// - Index building: embed_document()
// - Search: embed_query()
```
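Because the right prefix strings depend on the model, it may be cleaner to keep them as data attached to each model rather than hard-coding them in the embed functions. A minimal sketch; `PrefixScheme` is a hypothetical type, and any concrete strings should be taken from the model card of whichever model is configured.

```rust
/// Per-model text prefixes applied before embedding.
/// An empty string means "no prefix".
pub struct PrefixScheme {
    pub query: &'static str,
    pub document: &'static str,
}

impl PrefixScheme {
    /// Text to embed for a search query.
    pub fn for_query(&self, text: &str) -> String {
        format!("{}{}", self.query, text)
    }

    /// Text to embed for an indexed document or code chunk.
    pub fn for_document(&self, text: &str) -> String {
        format!("{}{}", self.document, text)
    }
}
```

Storing the scheme alongside the model version in the cache key would also ensure that changing a prefix invalidates previously cached embeddings.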

---

## Summary: Critical Mitigations to Add to Spec

### Must Have (P0)

1. **Truncation warning** - Tell users when chunks exceed token limit
2. **Cache TTL enforcement** - Check timestamps on every read
3. **Zero vector handling** - Return 0.0 similarity instead of NaN
4. **Skipped files reporting** - Surface parse errors to user
5. **Memory limit check** - Fail fast before OOM

### Should Have (P1)

6. **File mtime validation** - Detect changes between runs
7. **Atomic cache writes** - Prevent corruption on crash
8. **Index staleness check** - Warn when files changed since build
9. **Query/document prefixes** - Use proper Arctic embedding format
10. **Cache size limit** - LRU eviction when over limit

### Nice to Have (P2)

11. **ANN index** - HNSW for >10K chunks
12. **Short function augmentation** - Better embeddings for one-liners
13. **Model singleton** - Avoid repeated model loading
14. **Per-project cache sharding** - Faster cache operations

---

## Spec Additions Required

Add to Section 6 (Error Handling):
```markdown
### 6.4 Graceful Handling of Edge Cases

| Edge Case | Behavior |
|-----------|----------|
| Text exceeds max_context | Truncate + set `truncated=true` in result |
| Zero/empty embedding | Return zero vector, similarity=0.0 |
| Cache entry expired | Cache miss, re-embed |
| File modified since index | Warning, suggest rebuild |
| Index exceeds memory limit | Error with suggestion to filter |
```

Add to Section 8 (Performance):
```markdown
### 8.5 Scaling Limits

| Scale | Search Method | Expected Latency |
|-------|---------------|------------------|
| < 10K chunks | Linear scan | < 100ms |
| 10K-100K chunks | Linear scan + warning | < 1s |
| > 100K chunks | Error, suggest filtering | N/A |

Future: Add HNSW index for O(log n) search at scale.
```

Add to Section 5 (Behavioral Contracts):
```markdown
### 5.5 Embedding Contracts

| Input | Output | Metadata |
|-------|--------|----------|
| Text > max_context | Truncated embedding | `truncated: true` |
| Empty text | Zero vector | `is_zero: true` |
| Query text | Embedding with "query:" prefix | - |
| Document text | Embedding with "passage:" prefix | - |
```