Function classify_log_ratio_batch 

pub fn classify_log_ratio_batch(
    numerator: &ShardedInvertedIndex,
    denominator: &ShardedInvertedIndex,
    records: &[QueryRecord<'_>],
    skip_threshold: Option<f64>,
) -> Result<Vec<LogRatioResult>>

Classify a batch of reads using log-ratio (numerator vs denominator).

This is the core log-ratio pipeline:

  1. Validate indices (single-bucket, compatible k/w/salt)
  2. Extract minimizers from all reads
  3. Classify all against numerator (threshold=0.0)
  4. Partition into fast-path (NumHigh) and needs-denom
  5. Classify needs-denom subset against denominator
  6. Compute log10(num_score / denom_score) for each read
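Steps 4 and 6 can be sketched as below. This is a minimal illustration, not the crate's implementation: `Route`, `route`, and `log_ratio` are hypothetical names, and the rule that a read takes the fast path when its numerator score meets `skip_threshold` is an assumption about how the threshold is applied.

```rust
/// Hypothetical outcome of the partition step (step 4); names are illustrative.
#[derive(Debug, PartialEq)]
enum Route {
    /// Numerator score is high enough to skip the denominator pass.
    FastPath,
    /// Read must also be scored against the denominator.
    NeedsDenom,
}

/// Assumed rule: a read skips the denominator only when a threshold is set
/// and its numerator score meets or exceeds it.
fn route(num_score: f64, skip_threshold: Option<f64>) -> Route {
    match skip_threshold {
        Some(t) if num_score >= t => Route::FastPath,
        _ => Route::NeedsDenom,
    }
}

/// Step 6: log10(num_score / denom_score) with plain IEEE-754 semantics.
/// A zero denominator with a positive numerator yields +infinity here;
/// how the real function treats that case is not stated in the source.
fn log_ratio(num_score: f64, denom_score: f64) -> f64 {
    (num_score / denom_score).log10()
}
```

Under these assumptions, passing `None` for `skip_threshold` sends every read through the denominator pass.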

Returns one LogRatioResult per input read, sorted by the original query IDs from the input records.

Note: The partition step uses sequential query IDs (0..N) internally, and the results are then mapped back to the original IDs. This avoids panics when caller-provided query IDs are non-sequential (e.g., [100, 200, 300]).
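The remapping described in the note might look roughly like the following. It is a sketch under stated assumptions: `SketchResult` and `map_back` are hypothetical, the `i64` query ID type is a guess, and the real `LogRatioResult` may carry different fields.

```rust
/// Hypothetical stand-in for a per-read result; not the crate's LogRatioResult.
#[derive(Debug, PartialEq)]
struct SketchResult {
    query_id: i64,
    log_ratio: f64,
}

/// `internal` holds (sequential index, log ratio) pairs produced with IDs 0..N.
/// Each index is translated back to the caller's ID, then the output is
/// sorted by that original ID.
fn map_back(internal: &[(usize, f64)], original_ids: &[i64]) -> Vec<SketchResult> {
    let mut out: Vec<SketchResult> = internal
        .iter()
        .map(|&(idx, log_ratio)| SketchResult {
            // Indexing is safe because idx is always in 0..N.
            query_id: original_ids[idx],
            log_ratio,
        })
        .collect();
    out.sort_by_key(|r| r.query_id);
    out
}
```

Because the internal indices always fall in 0..N, any per-read buffer of length N can be indexed directly, which is not true of caller IDs such as 100, 200, 300.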