Semantic policy contract for cass hybrid search.
This module is the single source of truth for all semantic search policy decisions. Downstream beads (asset manifests, backfill scheduler, model acquisition, configuration surfaces, capability reporting) implement against the types and constants defined here rather than guessing or hardcoding their own values.
§Product contract
Ordinary search always works lexically. Semantic quality improves opportunistically: when model files are present, vectors are built in the background and hybrid results are blended in. A missing or broken semantic tier never blocks or degrades lexical search.
§Precedence (lowest to highest)
- Compiled defaults: SemanticPolicy::compiled_defaults
- Persisted config: ~/.config/cass/semantic.toml (planned)
- Environment variables: CASS_SEMANTIC_*
- CLI flags: --semantic-mode, --semantic-budget-mb, etc.
Higher layers override lower layers field-by-field; unset fields inherit.
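The field-by-field layering can be sketched as below. This is a minimal illustration, not the module's actual code: `PolicySketch`, `OverridesSketch`, `apply`, and `resolve` are hypothetical names, and the two fields stand in for whatever fields the real `SemanticPolicy` carries.

```rust
/// Hypothetical two-field stand-in for the resolved policy.
#[derive(Debug, Clone, PartialEq)]
pub struct PolicySketch {
    pub mode: String,
    pub budget_mb: u64,
}

/// One override layer (config, env, or CLI).
/// `None` means "inherit from the lower layer".
#[derive(Debug, Clone, Default)]
pub struct OverridesSketch {
    pub mode: Option<String>,
    pub budget_mb: Option<u64>,
}

impl PolicySketch {
    /// Apply one layer on top of `self`, field by field.
    /// Unset fields leave the lower layer's value untouched.
    pub fn apply(mut self, layer: &OverridesSketch) -> Self {
        if let Some(mode) = &layer.mode {
            self.mode = mode.clone();
        }
        if let Some(mb) = layer.budget_mb {
            self.budget_mb = mb;
        }
        self
    }
}

/// Fold the layers over the defaults, lowest precedence first.
pub fn resolve(defaults: PolicySketch, layers: &[OverridesSketch]) -> PolicySketch {
    layers.iter().fold(defaults, |acc, layer| acc.apply(layer))
}
```

Passing the layers as `[config, env, cli]` reproduces the documented order: a CLI value wins over an environment value, and a field that no layer sets keeps its compiled default.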
§Behaviour modes
| Mode | Lexical | Fast-tier semantic | Quality-tier semantic |
|---|---|---|---|
| HybridPreferred (default) | always | if available | if model present |
| LexicalOnly | always | never | never |
| StrictSemantic | always (floor) | required | required |
StrictSemantic is for callers that want hard guarantees about semantic quality (e.g., a bake-off). It is never the default.
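One way to read the table is as a function from mode and semantic readiness to a query plan. The sketch below is illustrative only: it collapses the fast and quality tiers into a single `semantic_ready` flag, and `ModeSketch`, `PlanSketch`, and `plan_query` are hypothetical names. What cass does when a StrictSemantic requirement is unmet is not specified here, so the sketch only reports that case to the caller.

```rust
/// Mirrors the three modes in the table above (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModeSketch {
    HybridPreferred,
    LexicalOnly,
    StrictSemantic,
}

/// Outcome of planning a single query (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PlanSketch {
    /// Lexical results only.
    LexicalOnly,
    /// Lexical results blended with semantic results.
    Hybrid,
    /// The mode requires semantic search but it is not ready.
    /// How to react is left to the caller (assumption).
    RequirementUnmet,
}

/// Decide how to serve a query given the mode and whether the
/// semantic tiers are currently usable.
pub fn plan_query(mode: ModeSketch, semantic_ready: bool) -> PlanSketch {
    match (mode, semantic_ready) {
        // LexicalOnly never touches the semantic tiers.
        (ModeSketch::LexicalOnly, _) => PlanSketch::LexicalOnly,
        // HybridPreferred uses semantic results opportunistically.
        (ModeSketch::HybridPreferred, true) => PlanSketch::Hybrid,
        (ModeSketch::HybridPreferred, false) => PlanSketch::LexicalOnly,
        // StrictSemantic requires the semantic tiers.
        (ModeSketch::StrictSemantic, true) => PlanSketch::Hybrid,
        (ModeSketch::StrictSemantic, false) => PlanSketch::RequirementUnmet,
    }
}
```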
§Storage budget
Semantic artifacts are derivative — they can always be rebuilt from the canonical SQLite database. They must never crowd out the DB or the required lexical index.
Eviction order (first to go → last to go):
- HNSW accelerator indices (.chsw)
- Quality-tier vector index (.fsvi)
- Fast-tier vector index
- Downloaded model files
The lexical index and SQLite DB are never evicted.
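The eviction order above could drive a budget-enforcement pass roughly as follows. This is a sketch under stated assumptions: `ArtifactKindSketch`, `EVICTION_ORDER_SKETCH`, and `plan_eviction` are hypothetical names, and evicting a whole category at a time is a simplification; the real `SemanticArtifactKind` and `EVICTION_ORDER` may differ.

```rust
/// Hypothetical artifact categories, named after the list above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ArtifactKindSketch {
    HnswAccelerator,
    QualityVectorIndex,
    FastVectorIndex,
    ModelFiles,
}

/// First-to-evict first. The lexical index and SQLite DB are
/// deliberately absent: they are never evicted.
pub const EVICTION_ORDER_SKETCH: [ArtifactKindSketch; 4] = [
    ArtifactKindSketch::HnswAccelerator,
    ArtifactKindSketch::QualityVectorIndex,
    ArtifactKindSketch::FastVectorIndex,
    ArtifactKindSketch::ModelFiles,
];

/// Given on-disk sizes per category (MB) and a budget (MB), return the
/// categories to evict, in order, until the total fits the budget.
pub fn plan_eviction(
    sizes_mb: &[(ArtifactKindSketch, u64)],
    budget_mb: u64,
) -> Vec<ArtifactKindSketch> {
    let mut total: u64 = sizes_mb.iter().map(|(_, mb)| *mb).sum();
    let mut evicted = Vec::new();
    for kind in EVICTION_ORDER_SKETCH.iter().copied() {
        if total <= budget_mb {
            break;
        }
        // Sum everything stored under this category.
        let freed: u64 = sizes_mb
            .iter()
            .filter(|(k, _)| *k == kind)
            .map(|(_, mb)| *mb)
            .sum();
        if freed > 0 {
            total -= freed;
            evicted.push(kind);
        }
    }
    evicted
}
```

Because every evicted artifact can be rebuilt from the canonical SQLite database, the pass only needs to decide what to delete, not how to preserve it.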
Structs§
- CliSemanticOverrides: CLI-level overrides; None means “inherit from lower layer”.
- EffectiveSetting: A single setting with its resolved value and provenance.
- EffectiveSettings: Complete effective-settings report for cass status --json.
- SemanticAssetManifest: Metadata stored alongside semantic assets to detect invalidation.
- SemanticCapabilityReport: JSON-serializable capability snapshot for cass status --json.
- SemanticPolicy: Resolved semantic policy after layering defaults → config → env → CLI.
Enums§
- BudgetDecision: Result of a disk-budget check.
- InvalidationAction: What happened and what to do about existing semantic assets.
- ModelDownloadPolicy: Whether model downloads are automatic, opt-in, or budget-gated.
- SemanticArtifactKind: Categories of semantic artifacts for eviction / budget accounting.
- SemanticCapability: What semantic quality level is achievable on this machine right now.
- SemanticMode: How aggressively cass pursues semantic search.
- SettingSource: Where a configuration value came from.
Constants§
- CHUNKING_STRATEGY_VERSION: Changing the chunking strategy (e.g., max tokens per chunk, overlap) invalidates all existing vectors even if the model is unchanged.
- DEFAULT_CHUNK_TIMEOUT_SECONDS: Maximum wall-clock seconds for a single background work chunk. The scheduler yields after this to re-check budgets and user activity.
- DEFAULT_FAST_DIMENSION: Fast-tier embedding dimension (hash embedder).
- DEFAULT_FAST_TIER_EMBEDDER: Default fast-tier embedder name (always available, no model files).
- DEFAULT_IDLE_DELAY_SECONDS: How long (seconds) the scheduler waits after last user activity before starting background work. This prevents contention during interactive search or indexing.
- DEFAULT_MAX_BACKFILL_RSS_MB: Maximum RSS the backfill worker should target (MB). This is advisory; the embedder ONNX runtime is the main consumer.
- DEFAULT_MAX_BACKFILL_THREADS: Maximum CPU cores the background backfill worker may saturate. On a typical 4-core dev laptop this is ~25 %.
- DEFAULT_MAX_REFINEMENT_DOCS: Maximum documents to refine via quality tier per query.
- DEFAULT_QUALITY_DIMENSION: Quality-tier embedding dimension (MiniLM).
- DEFAULT_QUALITY_TIER_EMBEDDER: Default quality-tier embedder name (requires ML model files).
- DEFAULT_QUALITY_WEIGHT: Quality-tier score weight when blending (0.0-1.0).
- DEFAULT_RERANKER: Default reranker name (requires cross-encoder model files).
- DEFAULT_SEMANTIC_BUDGET_MB: Default total semantic disk budget in megabytes.
- EVICTION_ORDER: Ordered list of semantic artifact categories, first-to-evict first.
- MAX_MODEL_SIZE_MB: Model files are the biggest single cost. Cap per-model.
- MIN_FREE_DISK_MB: Minimum free disk space (MB) that must remain after semantic writes.
- SEMANTIC_SCHEMA_VERSION: Semantic schema version. Bump when the vector document ID encoding, quantization format, or normalization changes. A version mismatch forces a full vector rebuild.
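The two version constants suggest an invalidation check along the following lines. This is a hedged sketch, not the module's implementation: `ManifestSketch`, `RebuildReason`, and `rebuild_reason` are hypothetical names, the `u32` version type is an assumption, and treating an embedder change as a rebuild trigger is inferred from the wording "even if the model is unchanged" rather than stated outright.

```rust
/// Hypothetical subset of the metadata stored alongside semantic assets.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ManifestSketch {
    pub schema_version: u32,
    pub chunking_version: u32,
    pub embedder: String,
}

/// Why existing vectors can no longer be trusted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RebuildReason {
    SchemaChanged,
    ChunkingChanged,
    /// Assumption: a different embedder also invalidates vectors.
    EmbedderChanged,
}

/// Compare the manifest found on disk with what the running build
/// expects. `None` means the stored vectors are still usable.
pub fn rebuild_reason(
    stored: &ManifestSketch,
    current: &ManifestSketch,
) -> Option<RebuildReason> {
    if stored.schema_version != current.schema_version {
        // Schema mismatch forces a full vector rebuild.
        Some(RebuildReason::SchemaChanged)
    } else if stored.chunking_version != current.chunking_version {
        // Chunking changes invalidate vectors even with the same model.
        Some(RebuildReason::ChunkingChanged)
    } else if stored.embedder != current.embedder {
        Some(RebuildReason::EmbedderChanged)
    } else {
        None
    }
}
```

Since a rebuild never touches the lexical index, a check like this can run in the background without affecting ordinary search.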