Axis 1: final-output semantic similarity.
Two paths are supported:
- TF-IDF cosine (default, no extra deps) — smoothed sklearn-style TF-IDF over the corpus of response texts being compared. Lexical: word-level overlap weighted by token rarity. Fast, deterministic, blind to paraphrase (“yes” vs “I agree” score 0).
- Pluggable `Embedder` — any backend that produces dense vectors per text. Use `compute_with_embedder` and pass an `Embedder` impl. Suitable for ONNX runtimes, HF Inference API clients, OpenAI/Cohere embeddings, in-house services, or a PyO3 callback into Python `sentence-transformers`.
Both paths use the same downstream cosine + paired-CI machinery, so reports from either path are directly comparable.
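As a sketch of what the default path computes (illustrative only, not this crate's actual implementation), smoothed sklearn-style TF-IDF means `idf = ln((1 + n) / (1 + df)) + 1`, with l2-normalized vectors so cosine reduces to a dot product:

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Build smoothed TF-IDF vectors over a small corpus.
/// IDF is sklearn-style: ln((1 + n_docs) / (1 + df)) + 1.
fn tfidf_vectors(docs: &[&str]) -> Vec<BTreeMap<String, f64>> {
    let tokenized: Vec<Vec<String>> = docs
        .iter()
        .map(|d| d.split_whitespace().map(|t| t.to_lowercase()).collect())
        .collect();
    let n = tokenized.len() as f64;
    // Document frequency per term.
    let mut df: BTreeMap<String, f64> = BTreeMap::new();
    for toks in &tokenized {
        for t in toks.iter().collect::<BTreeSet<_>>() {
            *df.entry(t.clone()).or_insert(0.0) += 1.0;
        }
    }
    tokenized
        .iter()
        .map(|toks| {
            let mut v: BTreeMap<String, f64> = BTreeMap::new();
            for t in toks {
                *v.entry(t.clone()).or_insert(0.0) += 1.0; // raw term count
            }
            for (t, w) in v.iter_mut() {
                *w *= ((1.0 + n) / (1.0 + df[t])).ln() + 1.0; // smoothed idf
            }
            // l2-normalize so cosine is a plain dot product.
            let norm = v.values().map(|w| w * w).sum::<f64>().sqrt();
            if norm > 0.0 {
                for w in v.values_mut() {
                    *w /= norm;
                }
            }
            v
        })
        .collect()
}

/// Dot product of two l2-normalized sparse vectors = cosine similarity.
fn cosine(a: &BTreeMap<String, f64>, b: &BTreeMap<String, f64>) -> f64 {
    a.iter().filter_map(|(t, w)| b.get(t).map(|u| w * u)).sum()
}

fn main() {
    let vecs = tfidf_vectors(&["yes", "I agree", "yes"]);
    println!("{:.3}", cosine(&vecs[0], &vecs[1])); // no shared tokens -> 0.000
    println!("{:.3}", cosine(&vecs[0], &vecs[2])); // identical texts  -> 1.000
}
```

This also makes the paraphrase blindness concrete: "yes" and "I agree" share no tokens, so their cosine is exactly zero.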
Coverage cross-references
What this axis catches:
- Final-text similarity drops (lexical with TF-IDF; paraphrase-robust with a neural `Embedder`).
What it does NOT catch:
- Wrong answer with similar words — TF-IDF cosine measures token overlap; a numeric value flip (“$99 → $9”) barely moves the cosine. The alignment module’s W_ARGS component catches tool-arg value flips; numeric content drift surfaces on the v2.7+ `numeric_token_density` fingerprint dimension.
- Empty-response regressions — empty-vs-empty scores 1.0 (vacuous match). The verbosity axis (axis 4) catches the collapse to empty.
- Tone shifts with same content — embeddings only carry semantic meaning; the Judge axis (axis 8) with a tone rubric is the right surface.
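The value-flip blind spot is easy to reproduce with a plain term-count cosine (a simplified stand-in for the TF-IDF path, ignoring IDF weighting):

```rust
use std::collections::BTreeMap;

/// Raw term counts per whitespace-separated token.
fn counts(text: &str) -> BTreeMap<&str, f64> {
    let mut c = BTreeMap::new();
    for t in text.split_whitespace() {
        *c.entry(t).or_insert(0.0) += 1.0;
    }
    c
}

/// Cosine similarity over sparse term-count vectors.
fn cosine(a: &BTreeMap<&str, f64>, b: &BTreeMap<&str, f64>) -> f64 {
    let dot: f64 = a.iter().filter_map(|(t, w)| b.get(t).map(|u| w * u)).sum();
    let na: f64 = a.values().map(|w| w * w).sum::<f64>().sqrt();
    let nb: f64 = b.values().map(|w| w * w).sum::<f64>().sqrt();
    dot / (na * nb)
}

fn main() {
    // Only one token of eleven differs, but the answer is wrong by 10x.
    let a = counts("the total price for your order comes to $99 including shipping");
    let b = counts("the total price for your order comes to $9 including shipping");
    println!("{:.3}", cosine(&a, &b)); // 10 of 11 tokens shared -> 0.909
}
```

The cosine stays above 0.9 despite the answer being wrong, which is why this failure mode is routed to the alignment and fingerprint surfaces instead.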
Functions
- `compute` — Compute the semantic-similarity axis using TF-IDF cosine.
- `compute_with_embedder` — Compute the semantic-similarity axis using a caller-supplied dense `Embedder`.
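The crate's real `Embedder` trait and `compute_with_embedder` signatures are not reproduced here. As a hypothetical sketch, assume the trait boils down to one method turning texts into dense vectors; a toy deterministic backend is enough to show the wiring a real ONNX, HTTP-API, or PyO3 backend would follow:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical trait shape -- the crate's actual `Embedder` may differ.
trait Embedder {
    fn embed(&self, texts: &[&str]) -> Vec<Vec<f32>>;
}

/// Toy deterministic backend: hashes tokens into a fixed-width
/// bag-of-words vector. A real impl would call an ONNX runtime,
/// an embeddings HTTP API, or a PyO3 callback into Python.
struct HashingEmbedder {
    dims: usize,
}

impl Embedder for HashingEmbedder {
    fn embed(&self, texts: &[&str]) -> Vec<Vec<f32>> {
        texts
            .iter()
            .map(|t| {
                let mut v = vec![0.0f32; self.dims];
                for tok in t.split_whitespace() {
                    let mut h = DefaultHasher::new();
                    tok.hash(&mut h);
                    v[(h.finish() as usize) % self.dims] += 1.0;
                }
                v
            })
            .collect()
    }
}

/// Cosine over the dense vectors the backend produced.
fn dense_cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

fn main() {
    let e = HashingEmbedder { dims: 64 };
    let vecs = e.embed(&["hello world", "hello world"]);
    println!("{:.3}", dense_cosine(&vecs[0], &vecs[1])); // identical -> 1.000
}
```

Because both paths feed the same cosine + paired-CI machinery, swapping this stub for a neural backend changes only where the vectors come from, not how the axis is scored.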