Expand description
Composite corpus-quality metric.
Combines four sub-scores into one tuner objective, each in [0, 1]:
-
EVR — variance explained by the projection. Pulled from
SphereQLPipeline::explained_variance_ratio. Already in[0, 1]. -
Bridge coherence — delegates to
crate::quality_metric::BridgeCoherence, so the sub-score is bit-identical to the standalone metric, including its neutral-when-no-Genuine floor (BRIDGE_COHERENCE_NEUTRAL). The floor matters here: underBridgeConfig::min_evr_for_classification, low-EVR corpora have zeroGenuinebridges, and a rawgenuine/totalwould pin this 0.30-weighted term at 0 — freezing the self-tune objective on exactly the bulk corpora it exists for. -
Curvature health — corpus mean of
1 - clamp(|mean_excess_z|, 0, 1)across the per-category curvature signatures returned bycurvature_analysis. Categories whose centroids sit close to the corpus-wide spherical-excess regime score near 1; outliers drag the score toward 0. -
Category balance — Shannon entropy of category sizes, normalized to
[0, 1]againstlog2(n_categories). Tracks how evenly concepts are distributed across categories.
Default weights (sum = 1):
quality = 0.30 * EVR
+ 0.30 * bridge_coherence
+ 0.20 * curvature_health
+ 0.20 * category_balanceWeights are configurable via CorpusQualityWeights; the metric
normalizes by their sum, so they do not need to total 1. The metric
is deterministic for a given pipeline.
Structs§
- Corpus
Quality - Composite metric: a single tuner-friendly score that fuses EVR, bridge coherence, curvature health, and category balance.
- Corpus
Quality Breakdown - Per-axis sub-scores for one
CorpusQuality::scorecall. Returned viaCorpusQuality::last_breakdownso tuner reports and dashboards can attribute the composite to its components. - Corpus
Quality Weights - Weights for the four sub-scores. Must be finite, non-negative, and
not all zero. They do NOT need to sum to 1 —
CorpusQualitynormalizes by their sum at score time.