Runtime-toggleable performance optimizations.
Per coding_agent_session_search-yvv7r (rollback env vars for ifr7/lxn5)
and coding_agent_session_search-waijq (CASS_F16_PRECONVERT for mng4).
§Design contract
Each runtime optimization is gated by a CASS_<FEATURE> env var read once
at startup and cached in a OnceLock<bool>. Operators flipping a toggle
must restart cass; per-query toggling is intentionally not supported (the
contract is “operator flips, restarts, measures”).
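The read-once contract can be sketched as follows. This is a minimal illustration, not the module's actual code: the static `PRECONVERT_F16`, the helper `cached_toggle`, and the accessor `preconvert_f16_enabled_sketch` are hypothetical names, the environment read is injected as a `lookup` closure (the real module would route it through its configured loader), and the parsing is deliberately simplified so that only `0` disables.

```rust
use std::sync::OnceLock;

/// Hypothetical cache cell. CASS_F16_PRECONVERT is the one variable name
/// the module docs mention explicitly.
static PRECONVERT_F16: OnceLock<bool> = OnceLock::new();

/// Read a toggle through `cell`, consulting `lookup` only on the first call.
/// `lookup` stands in for the environment read; it is an assumption of this
/// sketch, not the module's real signature.
pub fn cached_toggle(
    cell: &OnceLock<bool>,
    name: &str,
    lookup: impl FnOnce(&str) -> Option<String>,
) -> bool {
    *cell.get_or_init(|| {
        // Simplified parsing for this sketch: only "0" disables.
        lookup(name).as_deref() != Some("0")
    })
}

/// Hypothetical accessor built on the shared static cell.
pub fn preconvert_f16_enabled_sketch(
    lookup: impl FnOnce(&str) -> Option<String>,
) -> bool {
    cached_toggle(&PRECONVERT_F16, "CASS_F16_PRECONVERT", lookup)
}
```

Once the cell is populated, a later call with a different `lookup` result still returns the first value, which is why a changed variable only takes effect after a restart.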
§Truthful values
Bool semantics (case-insensitive):
- Unset, 1, true, on, yes → optimization ENABLED (default).
- 0, false, off, no → optimization DISABLED (rollback path).
- Any other value → log a tracing::warn! and treat as ENABLED.
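The rules above could be expressed roughly like this. The type `ToggleDecision` and the function `interpret_toggle` are hypothetical names for illustration. To keep the sketch free of external crates, it returns an `unrecognized` flag instead of calling `tracing::warn!` itself. It also does not trim whitespace, since the docs do not say whether the real parser does.

```rust
/// Outcome of interpreting one raw toggle value (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct ToggleDecision {
    /// Whether the optimization should run.
    pub enabled: bool,
    /// True when a value was set but not recognized; the caller is expected
    /// to log a warning in that case.
    pub unrecognized: bool,
}

/// Apply the documented bool semantics to an optional raw value.
/// `None` models an unset variable.
pub fn interpret_toggle(raw: Option<&str>) -> ToggleDecision {
    match raw.map(str::to_ascii_lowercase).as_deref() {
        // Unset or an explicit "on" spelling: enabled (the default).
        None | Some("1" | "true" | "on" | "yes") => ToggleDecision {
            enabled: true,
            unrecognized: false,
        },
        // Explicit "off" spelling: disabled (the rollback path).
        Some("0" | "false" | "off" | "no") => ToggleDecision {
            enabled: false,
            unrecognized: false,
        },
        // Anything else: stay enabled, but flag it for a warning.
        Some(_) => ToggleDecision {
            enabled: true,
            unrecognized: true,
        },
    }
}
```

Failing open on unrecognized input means a typo in a rollback variable leaves the optimization on, so the warning is the operator's only signal that the rollback did not apply.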
§Health surface
cass health --json exposes runtime_optimizations: { simd_dot, parallel_search, preconvert_f16, config_source } so operators and monitoring can confirm the
flip took effect.
§Why not std::env::var
Per AGENTS.md, ALL configuration must load via dotenvy::var() — it
respects the project’s .env file. std::env::var would skip .env
entries and produce inconsistent behavior between dev and production.
Structs§
- RuntimeOptimizationsSnapshot - Snapshot of the cached toggle values. This is the canonical shape exposed in cass health --json under runtime_optimizations.
Enums§
- ConfigSource - Source of the runtime-optimization configuration as surfaced in cass health --json under runtime_optimizations.config_source.
Functions§
- init_from_env - Force-resolve all toggles. Called once at startup before any query path reads them. Subsequent calls are no-ops thanks to OnceLock::set.
- parallel_search_enabled - Whether parallel rayon-driven vector search is enabled.
- preconvert_f16_enabled - Whether f16→f32 preconversion at vector-load time is enabled.
- simd_dot_enabled - Whether SIMD dot product is enabled. Lazily initializes from env on first read if init_from_env() has not been called yet.
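The interplay between the eager initializer and the lazy accessors can be illustrated with two small helpers. Both names (`init_toggle`, `toggle_enabled`) are hypothetical, and the resolved value is passed in directly rather than read from the environment. Only the `OnceLock` behavior is the point of the sketch.

```rust
use std::sync::OnceLock;

/// Eager path: store an already-resolved value. `OnceLock::set` returns `Err`
/// when the cell is populated, so repeat calls leave the first value intact.
/// Returns true only for the call that actually filled the cell.
pub fn init_toggle(cell: &OnceLock<bool>, resolved: bool) -> bool {
    cell.set(resolved).is_ok()
}

/// Lazy path: return the cached value if present; otherwise run `resolve`
/// once and cache its result.
pub fn toggle_enabled(cell: &OnceLock<bool>, resolve: impl FnOnce() -> bool) -> bool {
    *cell.get_or_init(resolve)
}
```

Whichever path touches the cell first decides the value for the lifetime of the process; the other path then only observes it.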