semantic-memory 0.5.1

Local-first hybrid semantic search (SQLite + FTS5 + usearch 2.25) with bitemporal truth and typed receipts
# Phase 3 - Vector codec abstraction and artifact profiles

## Objective

Decouple vector representation from memory truth. This creates the seam where SQ8, raw f32, and TurboQuant can coexist without corrupting authority semantics.

## P1 issues covered

- F-008: avoid overloading `embedding_q8`
- F-009: make search receipts codec-aware
- TurboQuant precondition: codec metadata and profile digest

## Design target

Introduce a `VectorCodec` trait or equivalent service boundary.

Suggested trait:

```rust
pub trait VectorCodec {
    type Code;

    fn profile(&self) -> VectorCodecProfileV1;
    fn encode(&self, raw: &[f32]) -> Result<Self::Code>;
    fn score_inner_product(&self, code: &Self::Code, query: &[f32]) -> Result<f32>;
    fn score_l2(&self, code: &Self::Code, query: &[f32]) -> Result<f32>;
    fn decode_approx(&self, code: &Self::Code) -> Result<Option<Vec<f32>>>;
}
```

Names can differ. The semantic boundary cannot.

## Required codec families

Implement at least:

```text
RawF32Codec - exact/reference path
Sq8Codec    - wrapper around existing semantic-memory quantize module
```
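
As a rough illustration of the reference path, the sketch below implements a trimmed version of the suggested trait for raw f32 vectors. It is a sketch under assumptions, not the crate's implementation: `CodecError`, the local `Result` alias, and the `dim` field are hypothetical, `profile()` is omitted for brevity, and `score_l2` is assumed to return the Euclidean distance (the source does not say whether distance or squared distance is intended).

```rust
// Sketch only: `CodecError`, the `Result` alias, and `RawF32Codec::dim` are hypothetical names.
#[derive(Debug, PartialEq)]
pub enum CodecError {
    DimMismatch { expected: usize, got: usize },
    NonFinite,
}

pub type Result<T> = std::result::Result<T, CodecError>;

// Trimmed copy of the suggested trait; `profile()` is left out to keep the sketch short.
pub trait VectorCodec {
    type Code;
    fn encode(&self, raw: &[f32]) -> Result<Self::Code>;
    fn score_inner_product(&self, code: &Self::Code, query: &[f32]) -> Result<f32>;
    fn score_l2(&self, code: &Self::Code, query: &[f32]) -> Result<f32>;
    fn decode_approx(&self, code: &Self::Code) -> Result<Option<Vec<f32>>>;
}

pub struct RawF32Codec {
    pub dim: usize,
}

impl RawF32Codec {
    // Reject vectors of the wrong length or with NaN/infinite components.
    fn check(&self, v: &[f32]) -> Result<()> {
        if v.len() != self.dim {
            return Err(CodecError::DimMismatch { expected: self.dim, got: v.len() });
        }
        if v.iter().any(|x| !x.is_finite()) {
            return Err(CodecError::NonFinite);
        }
        Ok(())
    }
}

impl VectorCodec for RawF32Codec {
    // The "code" for the reference codec is simply the unmodified vector.
    type Code = Vec<f32>;

    fn encode(&self, raw: &[f32]) -> Result<Self::Code> {
        self.check(raw)?;
        Ok(raw.to_vec())
    }

    fn score_inner_product(&self, code: &Self::Code, query: &[f32]) -> Result<f32> {
        self.check(code)?;
        self.check(query)?;
        Ok(code.iter().zip(query).map(|(a, b)| a * b).sum())
    }

    // Assumption: Euclidean distance, not squared distance.
    fn score_l2(&self, code: &Self::Code, query: &[f32]) -> Result<f32> {
        self.check(code)?;
        self.check(query)?;
        let sq: f32 = code.iter().zip(query).map(|(a, b)| (a - b) * (a - b)).sum();
        Ok(sq.sqrt())
    }

    // Decoding is exact for raw f32, so this always returns `Some`.
    fn decode_approx(&self, code: &Self::Code) -> Result<Option<Vec<f32>>> {
        self.check(code)?;
        Ok(Some(code.clone()))
    }
}
```

An `Sq8Codec` would implement the same trait with a byte-oriented `Code` type and delegate to the existing quantize module, which is what lets both paths be exercised through one interface.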

Reserve but do not implement yet:

```text
TurboQuantCodec - Phase 4
PolarQuantCodec - optional
QjlSketchCodec  - optional
```

## Vector codec profile

Suggested struct:

```rust
pub struct VectorCodecProfileV1 {
    pub profile_id: String,
    pub codec_family: String,
    pub codec_version: String,
    pub dim: u32,
    pub bits: Option<u8>,
    pub projections: Option<u32>,
    pub seed: Option<u64>,
    pub score_semantics: String,
    pub normalization: String,
    pub canonical_json: String,
    pub profile_digest: String,
}
```

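The profile digest must be deterministic: identical field values must always produce the same `canonical_json` and therefore the same `profile_digest`.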
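
One way to get a deterministic digest is to serialize the profile fields in a fixed key order and hash the resulting bytes. The sketch below is illustrative only: `ProfileFields`, the sorted-key JSON layout, and the `fnv1a64:` prefix are hypothetical, string fields are assumed to be plain ASCII tokens that need no JSON escaping, and FNV-1a stands in for whatever cryptographic hash the project actually uses, purely to keep the example free of dependencies.

```rust
// Sketch only: struct name, JSON layout, and the FNV-1a stand-in hash are assumptions.
#[derive(Clone, Copy)]
pub struct ProfileFields<'a> {
    pub codec_family: &'a str,
    pub codec_version: &'a str,
    pub dim: u32,
    pub bits: Option<u8>,
    pub projections: Option<u32>,
    pub seed: Option<u64>,
    pub score_semantics: &'a str,
    pub normalization: &'a str,
}

// Render an optional number as a JSON number or `null`.
fn opt<T: std::fmt::Display>(v: Option<T>) -> String {
    v.map_or_else(|| "null".to_string(), |x| x.to_string())
}

// Keys are emitted in sorted order with no whitespace, so equal fields give equal bytes.
// Assumes string values contain no quotes, backslashes, or control characters.
pub fn canonical_json(p: &ProfileFields<'_>) -> String {
    format!(
        "{{\"bits\":{},\"codec_family\":\"{}\",\"codec_version\":\"{}\",\"dim\":{},\
         \"normalization\":\"{}\",\"projections\":{},\"score_semantics\":\"{}\",\"seed\":{}}}",
        opt(p.bits),
        p.codec_family,
        p.codec_version,
        p.dim,
        p.normalization,
        opt(p.projections),
        p.score_semantics,
        opt(p.seed),
    )
}

// 64-bit FNV-1a; a non-cryptographic placeholder for a real digest such as SHA-256.
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

pub fn profile_digest(p: &ProfileFields<'_>) -> String {
    format!("fnv1a64:{:016x}", fnv1a64(canonical_json(p).as_bytes()))
}
```

Because the digest is derived only from the canonical bytes, any change to a field such as `dim` or `seed` changes the canonical form, while field reordering in the struct or in storage has no effect.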

## Vector artifact table/schema

Add storage only if the current DB architecture can absorb it cleanly. Otherwise, add in-memory structs and a TODO migration doc.

Preferred minimal table:

```sql
CREATE TABLE IF NOT EXISTS vector_codec_profiles (
  profile_id TEXT PRIMARY KEY,
  codec_family TEXT NOT NULL,
  codec_version TEXT NOT NULL,
  dim INTEGER NOT NULL,
  bits INTEGER,
  projections INTEGER,
  seed TEXT, -- TEXT rather than INTEGER: a u64 seed can exceed SQLite's signed 64-bit range
  score_semantics TEXT NOT NULL,
  normalization TEXT NOT NULL,
  canonical_json TEXT NOT NULL,
  profile_digest TEXT NOT NULL UNIQUE,
  created_at TEXT NOT NULL
);

CREATE TABLE IF NOT EXISTS vector_artifacts (
  artifact_id TEXT PRIMARY KEY,
  source_kind TEXT NOT NULL,
  source_id TEXT NOT NULL,
  source_embedding_digest TEXT NOT NULL,
  profile_id TEXT NOT NULL,
  encoded_bytes BLOB NOT NULL,
  encoded_digest TEXT NOT NULL,
  generation INTEGER NOT NULL,
  created_at TEXT NOT NULL,
  FOREIGN KEY(profile_id) REFERENCES vector_codec_profiles(profile_id)
);
```

## Profile mismatch behavior

Scoring must reject mismatches:

- wrong dim
- wrong codec family
- wrong profile digest
- incompatible score semantics
- malformed code bytes
- non-finite query values
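
The rejection rules above can be sketched as a single pre-scoring check that returns a typed error per failure mode. Everything named here is hypothetical (`MismatchError`, `ProfileRef`, `check_scoring`, and the `bytes_per_dim` field); in particular, treating a code as malformed when its length is not `dim * bytes_per_dim` is an assumption that ignores any header or scale/offset bytes a real SQ8 layout may carry.

```rust
// Sketch only: all type, field, and function names are hypothetical.
#[derive(Debug, PartialEq)]
pub enum MismatchError {
    CodecFamily,
    ProfileDigest,
    ScoreSemantics,
    Dim { expected: u32, got: usize },
    MalformedCode { expected_bytes: usize, got: usize },
    NonFiniteQuery,
}

#[derive(Clone, Copy)]
pub struct ProfileRef<'a> {
    pub codec_family: &'a str,
    pub profile_digest: &'a str,
    pub score_semantics: &'a str,
    pub dim: u32,
    pub bytes_per_dim: usize, // assumption: fixed-width codes with no header
}

pub fn check_scoring(
    codec: &ProfileRef<'_>,    // profile the scorer was built for
    artifact: &ProfileRef<'_>, // profile recorded alongside the stored code
    code: &[u8],
    query: &[f32],
) -> Result<(), MismatchError> {
    if codec.codec_family != artifact.codec_family {
        return Err(MismatchError::CodecFamily);
    }
    if codec.profile_digest != artifact.profile_digest {
        return Err(MismatchError::ProfileDigest);
    }
    if codec.score_semantics != artifact.score_semantics {
        return Err(MismatchError::ScoreSemantics);
    }
    if query.len() != codec.dim as usize {
        return Err(MismatchError::Dim { expected: codec.dim, got: query.len() });
    }
    let expected_bytes = codec.dim as usize * codec.bytes_per_dim;
    if code.len() != expected_bytes {
        return Err(MismatchError::MalformedCode { expected_bytes, got: code.len() });
    }
    if query.iter().any(|x| !x.is_finite()) {
        return Err(MismatchError::NonFiniteQuery);
    }
    Ok(())
}
```

The family and semantics checks are technically redundant if the digest covers those fields, but keeping them separate gives callers a more specific error than a bare digest mismatch.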

## Tests to add

1. RawF32 profile digest is deterministic.
2. SQ8 profile digest is deterministic.
3. Same profile + same vector = same encoded digest.
4. Wrong dimension rejects scoring.
5. Wrong profile rejects scoring.
6. Malformed code rejects scoring.
7. Raw reference score matches existing cosine/dot expectations.
8. SQ8 path works through the same codec interface.
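
Test 3 depends on the encoded bytes being a pure function of the profile and the input vector. A minimal sketch of an encoded-digest helper for the raw f32 path follows; the little-endian byte layout, the function names, and the FNV-1a stand-in hash are all assumptions made for illustration, not the crate's actual helpers.

```rust
// Sketch only: byte layout, function names, and the FNV-1a stand-in hash are assumptions.

// Serialize each component as 4 little-endian bytes so the output is platform-independent.
pub fn encode_raw_f32_le(v: &[f32]) -> Vec<u8> {
    v.iter().flat_map(|x| x.to_le_bytes()).collect()
}

// 64-bit FNV-1a over the encoded bytes; a placeholder for a real cryptographic digest.
pub fn encoded_digest(encoded: &[u8]) -> String {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in encoded {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x100000001b3);
    }
    format!("fnv1a64:{:016x}", h)
}
```

A determinism test can then encode the same vector twice and compare both the bytes and the digest, for example `assert_eq!(encoded_digest(&encode_raw_f32_le(&v)), encoded_digest(&encode_raw_f32_le(&v)))`.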

## Acceptance criteria

- Codec profiles exist and are deterministic.
- Existing SQ8 behavior is not overloaded or reinterpreted.
- Raw f32 reference codec is available.
- Profile mismatch fails loudly.
- Search receipt can reference codec profile even before TurboQuant.

## Codex prompt

```text
Run Phase 3: vector codec abstraction and artifact profiles.

Add a VectorCodec boundary with RawF32 and SQ8/current-quantize implementations. Add deterministic VectorCodecProfileV1 metadata and encoded digest helpers. Do not implement TurboQuant yet. Do not rename embedding_q8 to mean TurboQuant. Keep raw/reference scoring available for conformance.

If a storage migration is safe, add vector_codec_profiles and vector_artifacts tables. If not, add typed structs and a migration plan doc, but still implement deterministic profile digests and codec mismatch rejection.

Add tests for deterministic profile digests, same-vector same-code digest, wrong-dimension/profile rejection, malformed code rejection, and raw/SQ8 scoring through the same interface.

Run targeted codec tests plus cargo fmt/check. Report exact commands and remaining risks.
```

---