embeddenator 0.20.0-alpha.1

Sparse ternary VSA holographic computing substrate
# Retrieval Index (Semantic Search)

This document defines the project’s first *robust, elegant* semantic retrieval index for sparse ternary VSA vectors.

## Why cosine similarity is not enough
Cosine similarity is a scoring function, not an index. Used on its own, it requires comparing the query against **every** stored vector (a linear scan), so query cost grows linearly with corpus size.

To scale, we separate retrieval into:
1. **Candidate generation** (sub-linear indexing)
2. **Exact reranking** (cosine / dot on a small candidate set)

## Current implementation
- Module: `src/retrieval.rs`
- Type: `TernaryInvertedIndex`

### Data structure
For each dimension $d \in [0,\mathrm{DIM})$:
- `pos_postings[d]`: IDs with $+1$ at dimension $d$
- `neg_postings[d]`: IDs with $-1$ at dimension $d$
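
As a rough illustration of the layout described above, the following sketch builds two posting-list tables, one per sign. The names `SketchVec`, `SketchIndex`, `add`, and `postings` are hypothetical and chosen only for this example; they are not the API of `TernaryInvertedIndex`, and the real type may store IDs and dimensions differently.

```rust
/// Hypothetical stand-in for a sparse ternary vector: the dimensions that
/// hold +1 and the dimensions that hold -1 (every other dimension is 0).
#[derive(Debug, Clone, Default)]
pub struct SketchVec {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Illustrative inverted index (an assumption, not the crate's actual type).
#[derive(Debug)]
pub struct SketchIndex {
    /// pos_postings[d] = IDs of vectors with +1 at dimension d
    pos_postings: Vec<Vec<usize>>,
    /// neg_postings[d] = IDs of vectors with -1 at dimension d
    neg_postings: Vec<Vec<usize>>,
}

impl SketchIndex {
    /// Creates an empty index over `dim` dimensions.
    pub fn new(dim: usize) -> Self {
        Self {
            pos_postings: vec![Vec::new(); dim],
            neg_postings: vec![Vec::new(); dim],
        }
    }

    /// Appends `id` to the posting list of every non-zero dimension of `v`.
    /// Panics if a dimension in `v` is outside `[0, dim)`.
    pub fn add(&mut self, id: usize, v: &SketchVec) {
        for &d in &v.pos {
            self.pos_postings[d].push(id);
        }
        for &d in &v.neg {
            self.neg_postings[d].push(id);
        }
    }

    /// Returns the (+1, -1) posting lists for dimension `d`.
    pub fn postings(&self, d: usize) -> (&[usize], &[usize]) {
        (&self.pos_postings[d], &self.neg_postings[d])
    }
}
```

Memory use in this sketch is one ID per non-zero entry across all stored vectors, plus one (possibly empty) list per dimension and sign.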

### Query scoring
For a query vector $q$:
- Iterate postings for every $d \in q.pos$ and $d \in q.neg$
- Accumulate sparse ternary dot-product contributions into integer scores: $+1$ where the stored sign matches the query sign at $d$, $-1$ where it is opposite
- Return top-$k$ by score

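Candidate generation cost is therefore proportional to the number of posting entries touched rather than to the total corpus size. For sparse vectors this is typically far less work than a full scan, although individual posting lists still lengthen as the corpus grows.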
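
The scoring loop above can be sketched as a free function over the two posting tables. This is a minimal illustration under stated assumptions: the function name `top_k`, the use of a `HashMap` accumulator, and the tie-break by ascending ID are all choices made for this example, not details confirmed for `TernaryInvertedIndex`. It also assumes that opposite-sign matches subtract from the score, as an exact ternary dot product would.

```rust
use std::collections::HashMap;

/// Scores stored vectors against a query by sparse ternary dot product and
/// returns the `k` highest-scoring `(id, score)` pairs.
/// Only IDs that share at least one non-zero dimension with the query appear.
pub fn top_k(
    pos_postings: &[Vec<usize>],
    neg_postings: &[Vec<usize>],
    q_pos: &[usize],
    q_neg: &[usize],
    k: usize,
) -> Vec<(usize, i32)> {
    let mut scores: HashMap<usize, i32> = HashMap::new();

    // Query has +1 at d: stored +1 agrees (+1), stored -1 disagrees (-1).
    for &d in q_pos {
        for &id in &pos_postings[d] {
            *scores.entry(id).or_insert(0) += 1;
        }
        for &id in &neg_postings[d] {
            *scores.entry(id).or_insert(0) -= 1;
        }
    }
    // Query has -1 at d: stored -1 agrees (+1), stored +1 disagrees (-1).
    for &d in q_neg {
        for &id in &neg_postings[d] {
            *scores.entry(id).or_insert(0) += 1;
        }
        for &id in &pos_postings[d] {
            *scores.entry(id).or_insert(0) -= 1;
        }
    }

    // Highest score first; ties broken by ascending ID for determinism.
    let mut ranked: Vec<(usize, i32)> = scores.into_iter().collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(&b.0)));
    ranked.truncate(k);
    ranked
}
```

The work done is the sum of the lengths of the posting lists visited, so a stored vector that shares no non-zero dimension with the query is never touched. A full sort is used here for brevity; a bounded heap would avoid sorting every touched ID when `k` is small.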

## Next steps (planned)
- Rerank stage:
  - Use exact cosine similarity on candidates (`SparseVec::cosine`) after candidate generation.
- Add optional signatures (ternary LSH / multi-probe) for further speedups.
- Integrate with EmbrFS:
  - Index `Engram.codebook` and/or hierarchical sub-engrams.
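
To make the planned rerank stage concrete, here is one way it could look. This is a sketch only: `TernaryVec`, `cosine`, and `rerank` are hypothetical names for this example and do not reproduce `SparseVec::cosine`, whose actual signature is not shown in this document. The sketch assumes each vector stores its `pos` and `neg` dimensions as sorted, duplicate-free lists, and that a candidate ID is an index into an in-memory corpus slice.

```rust
use std::cmp::Ordering;

/// Hypothetical sparse ternary vector with sorted, duplicate-free index lists.
#[derive(Debug, Clone)]
pub struct TernaryVec {
    pub pos: Vec<usize>,
    pub neg: Vec<usize>,
}

/// Counts the elements common to two sorted, duplicate-free slices.
fn overlap(a: &[usize], b: &[usize]) -> i64 {
    let (mut i, mut j, mut n) = (0usize, 0usize, 0i64);
    while i < a.len() && j < b.len() {
        match a[i].cmp(&b[j]) {
            Ordering::Less => i += 1,
            Ordering::Greater => j += 1,
            Ordering::Equal => {
                n += 1;
                i += 1;
                j += 1;
            }
        }
    }
    n
}

/// Exact cosine similarity of two ternary vectors.
/// The squared norm of a ternary vector equals its number of non-zeros.
pub fn cosine(a: &TernaryVec, b: &TernaryVec) -> f64 {
    let dot = overlap(&a.pos, &b.pos) + overlap(&a.neg, &b.neg)
        - overlap(&a.pos, &b.neg)
        - overlap(&a.neg, &b.pos);
    let na = (a.pos.len() + a.neg.len()) as f64;
    let nb = (b.pos.len() + b.neg.len()) as f64;
    if na == 0.0 || nb == 0.0 {
        return 0.0; // convention chosen here for all-zero vectors
    }
    dot as f64 / (na * nb).sqrt()
}

/// Reranks candidate IDs by exact cosine and keeps the best `k`.
/// IDs that are out of range for `corpus` are skipped.
pub fn rerank(
    query: &TernaryVec,
    candidates: &[usize],
    corpus: &[TernaryVec],
    k: usize,
) -> Vec<(usize, f64)> {
    let mut scored: Vec<(usize, f64)> = candidates
        .iter()
        .filter_map(|&id| corpus.get(id).map(|v| (id, cosine(query, v))))
        .collect();
    // Highest cosine first; ties broken by ascending ID.
    scored.sort_by(|x, y| y.1.total_cmp(&x.1).then(x.0.cmp(&y.0)));
    scored.truncate(k);
    scored
}
```

Because the rerank only touches the candidate set, its cost depends on the number of candidates and their sparsity, not on the corpus size. Normalizing by vector length can change the ordering produced by the integer dot scores, which is the reason for running it as a separate exact pass.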