inputx-scoring — probability-native candidate scoring primitive.
§The schema
Every candidate carries two log-space scalars and a typed match classification:
score(W | i) = log_prior(W) + log_likelihood(i | W)
This is the Bayesian decomposition P(W|i) ∝ P(i|W) · P(W) rendered
in log space — addition replaces multiplication, comparison stays
monotone, and the two factors can be assigned independently by
producers (dictionaries, n-grams, fuzzy matchers, …) without
coordinating a single global formula.
Both terms are i32 Q4 fixed-point. One log unit = Q4 (= 16)
integer steps. The Q4 choice trades resolution for headroom: scores
fit comfortably in i32 even for very rare or very common words
while staying precise enough that ranking-relevant gaps (~0.0625 in
log space) survive quantization.
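As a minimal sketch of the Q4 encoding described above, the following shows how a floating-point log value could be quantized to Q4 and how the additive sort key falls out. Only `Q4 = 16` and the sum `log_prior + log_likelihood` come from this page; the helpers `to_q4` and `from_q4`, and the exact signature of `score`, are assumptions made for illustration and are not claimed to be part of the crate.

```rust
/// Fixed-point scale: one log unit is 16 integer steps (stated on this page).
pub const Q4: i32 = 16;

/// Hypothetical helper: quantize a natural-log value to Q4,
/// rounding to the nearest integer step.
pub fn to_q4(log_value: f64) -> i32 {
    (log_value * Q4 as f64).round() as i32
}

/// Hypothetical helper: convert a Q4 value back to floating-point log units.
pub fn from_q4(q: i32) -> f64 {
    q as f64 / Q4 as f64
}

/// Additive sort key. The signature is an assumption; the page only states
/// that the key is `log_prior + log_likelihood`.
pub fn score(log_prior: i32, log_likelihood: i32) -> i32 {
    log_prior + log_likelihood
}
```

Under these assumptions, a prior of 2.0 log units and a likelihood of -0.5 log units quantize to 32 and -8, giving a score of 24 (1.5 log units). A gap of 1/16 of a log unit still maps to one integer step, so it remains visible to integer comparison.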
§Public surface
- Source: which engine produced the candidate
- MatchType: how the typed input maps to the candidate
- Candidate: a (word, log_prior, log_likelihood, match_type, source) tuple
- score: log_prior + log_likelihood, the sort key
- Q4: log-to-integer scale (= 16)
Scoring policy lives in the consumer (IME engine cement); this crate provides only the schema and the additive sort key.
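To make the shape of the public surface concrete, here is a rough sketch of how a consumer might rank candidates by the additive key. The field names follow the tuple listed above, but the field types, the use of `&'a str` for the word, the enum variants, the discriminant values, and the `rank` helper are all hypothetical; the real definitions in this crate may differ.

```rust
/// Hypothetical variants; the page only says Source crosses FFI as `u8`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Source {
    Dictionary = 0,
    NGram = 1,
    Fuzzy = 2,
}

/// Hypothetical variants for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MatchType {
    Exact,
    Prefix,
    Fuzzy,
}

/// Assumed layout: a borrowed word plus two Q4 log-space scalars.
#[derive(Debug, Clone, Copy)]
pub struct Candidate<'a> {
    pub word: &'a str,
    pub log_prior: i32,
    pub log_likelihood: i32,
    pub match_type: MatchType,
    pub source: Source,
}

/// Additive sort key: higher is better.
pub fn score(c: &Candidate<'_>) -> i32 {
    c.log_prior + c.log_likelihood
}

/// Hypothetical consumer-side helper: sort best-first by score.
/// The sort is stable, so ties keep their producer order.
pub fn rank<'a>(mut candidates: Vec<Candidate<'a>>) -> Vec<Candidate<'a>> {
    candidates.sort_by(|a, b| score(b).cmp(&score(a)));
    candidates
}
```

In this sketch an exact match with a slightly lower prior can outrank a fuzzy match with a higher prior, because the likelihood penalty is added to the same key. Any policy beyond that ordering would live in the consumer, as noted above.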
Structs§
- Candidate
- One scored candidate. Word lifetime is 'a so consumers can pass borrowed references through the merge pipeline; the merger clones only the survivors.
Enums§
- MatchType
- Classification of how the typed input i maps to a candidate W. Carried alongside the (log_prior, log_likelihood) pair so the downstream merger / probe / UI can render context without re-deriving the match shape.
- Source
- Engine that produced a candidate. Numeric representation is stable across versions so it can cross the FFI boundary as u8 without translation. Mirrors inputx_core::composite::merge::Source.
Constants§
- Q4
- Fixed-point scale for log-space scalars. Q4 = 16 means every integer step is 1/16 of a log unit (= 0.0625). At this resolution, i32 covers a dynamic range of ~ ±67 million log units, far more than any realistic candidate score needs.
Functions§
- derive_log_likelihood
- Q4 log-likelihood derived from a match-type classification.
- log_prior_from_freq
- Q4 natural-log prior derived from a corpus frequency. Returns Q4 · ln(1 + freq) rounded to the nearest integer. Zero-freq candidates map to 0; freq-1 to ~11; freq-1000 to ~110.
- score
- Bayesian sort key: log_prior + log_likelihood. The single source of truth for ranking. Higher = better.
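The formula given for log_prior_from_freq, Q4 · ln(1 + freq) rounded to the nearest integer, can be sketched as follows. The parameter type `u32` and the use of `f64` arithmetic are assumptions; the crate's actual signature and implementation are not shown on this page.

```rust
/// Fixed-point scale: one log unit is 16 integer steps (stated on this page).
pub const Q4: i32 = 16;

/// Sketch of the documented formula. `freq: u32` is an assumed type.
pub fn log_prior_from_freq(freq: u32) -> i32 {
    // `ln_1p(x)` computes ln(1 + x), staying accurate for small x.
    let log_units = (freq as f64).ln_1p();
    // Scale to Q4 and round to the nearest integer step.
    (Q4 as f64 * log_units).round() as i32
}
```

With this sketch a frequency of 0 maps to 0 and a frequency of 1 maps to 11 (16 · ln 2 ≈ 11.09). A frequency of 1000 gives 16 · ln 1001 ≈ 110.54 before rounding, in line with the approximate figure quoted above.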