Module cost_model

§Cost Model and Per-Query Budgets (Task 1)

This module provides an explicit cost model with enforceable per-query budgets to stabilize p99 latency under load while preserving recall targets.

§Architecture

The cost model is “bytes-moved first” and enforces runtime limits on:

  • RAM bytes scanned for candidate generation
  • SSD random reads allowed in hot path (ideally 0)
  • SSD sequential bytes allowed for rerank batching
  • CPU cycles spent in routing/scan
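As a rough illustration of how the four limits above might be checked at runtime, here is a minimal sketch. The types `LimitsSketch`, `UsageSketch`, and `ExhaustedSketch` are hypothetical stand-ins invented for this example; they are not the crate's `QueryBudget`, `CostTracker`, or `BudgetExhaustionReason`, whose fields and methods may differ.

```rust
/// Hypothetical per-query limits, one field per resource listed above.
#[derive(Debug, Clone, Copy)]
pub struct LimitsSketch {
    pub ram_bytes: u64,
    pub ssd_random_reads: u64,
    pub ssd_sequential_bytes: u64,
    pub cpu_cycles: u64,
}

/// Hypothetical reason codes for the first limit that was exceeded.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExhaustedSketch {
    RamBytes,
    SsdRandomReads,
    SsdSequentialBytes,
    CpuCycles,
}

/// Hypothetical running totals accumulated while a query executes.
#[derive(Debug, Default, Clone, Copy)]
pub struct UsageSketch {
    pub ram_bytes: u64,
    pub ssd_random_reads: u64,
    pub ssd_sequential_bytes: u64,
    pub cpu_cycles: u64,
}

impl UsageSketch {
    /// Returns the first limit that usage strictly exceeds, checked in the
    /// order RAM, random reads, sequential bytes, CPU. `None` means the
    /// query is still within budget.
    pub fn first_exhausted(&self, limits: &LimitsSketch) -> Option<ExhaustedSketch> {
        if self.ram_bytes > limits.ram_bytes {
            Some(ExhaustedSketch::RamBytes)
        } else if self.ssd_random_reads > limits.ssd_random_reads {
            Some(ExhaustedSketch::SsdRandomReads)
        } else if self.ssd_sequential_bytes > limits.ssd_sequential_bytes {
            Some(ExhaustedSketch::SsdSequentialBytes)
        } else if self.cpu_cycles > limits.cpu_cycles {
            Some(ExhaustedSketch::CpuCycles)
        } else {
            None
        }
    }
}
```

In this sketch a random-read limit of 0 means the very first random read exhausts the budget, which matches the "ideally 0" goal for the hot path.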

§Math/Algorithm

Constrained optimization: minimize E[bytes scanned] subject to:

  • P(recall@k ≥ ρ) ≥ 1−δ
  • p99 ≤ T

Convert the latency SLA T into budgets, where BW_eff is the effective bandwidth and L_io is the latency of a single random I/O:

  • Bytes ≤ BW_eff · T
  • RandomIO ≤ ⌊T / L_io⌋
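The two bounds above can be worked through in code. The following is a sketch only: `HwSketch`, `DerivedBudget`, and `derive_budget` are hypothetical names made up for this example and do not describe the crate's `HardwareProfile` or its actual conversion routine. It also assumes each bound gets the full latency target T to itself; a real budget would have to split T across resources.

```rust
/// Hypothetical hardware numbers used for the conversion.
#[derive(Debug, Clone, Copy)]
pub struct HwSketch {
    /// Effective scan bandwidth in bytes per second (BW_eff).
    pub bw_eff_bytes_per_sec: u64,
    /// Latency of one random SSD read in nanoseconds (L_io).
    pub io_latency_ns: u64,
}

/// Hypothetical result of converting a latency target into limits.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DerivedBudget {
    pub max_bytes: u64,
    pub max_random_reads: u64,
}

/// Applies Bytes <= BW_eff * T and RandomIO <= floor(T / L_io),
/// with T given in nanoseconds.
pub fn derive_budget(hw: HwSketch, latency_target_ns: u64) -> DerivedBudget {
    // Multiply in u128 so bandwidth * nanoseconds cannot overflow,
    // then divide by 1e9 to convert nanoseconds to seconds.
    let bytes = hw.bw_eff_bytes_per_sec as u128 * latency_target_ns as u128 / 1_000_000_000;
    let max_bytes = u64::try_from(bytes).unwrap_or(u64::MAX);

    // Unsigned integer division already floors. An unknown (zero) I/O
    // latency is treated conservatively as "no random reads allowed".
    let max_random_reads = if hw.io_latency_ns == 0 {
        0
    } else {
        latency_target_ns / hw.io_latency_ns
    };

    DerivedBudget { max_bytes, max_random_reads }
}
```

For example, with an assumed 10 GB/s of effective bandwidth and 100 µs per random read, a 5 ms target yields at most 50,000,000 bytes scanned and at most 50 random reads.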

§Usage

use sochdb_vector::cost_model::{QueryBudget, CostTracker, AdmissionController};

// Define budget for query class
let budget = QueryBudget::new("high_recall")
    .ram_bytes(16 * 1024 * 1024)  // 16 MB RAM scan
    .ssd_random_reads(0)           // No random reads in hot path
    .ssd_sequential_bytes(4 * 1024 * 1024)  // 4 MB sequential for rerank
    .cpu_cycles(1_000_000_000);    // ~1B cycles

// Track costs during query execution
let mut tracker = CostTracker::new(budget);
tracker.add_ram_bytes(1024);
if tracker.is_exhausted() {
    // Return best-known results under budget
}

Structs§

AdmissionController
Admission controller for backpressure under concurrency
AdmissionMetrics
Metrics from admission controller
AdmissionTicket
Handle returned when a query is admitted
CostSummary
Summary of cost consumption
CostTracker
Tracks resource consumption during query execution
CostUtilization
Resource utilization ratios
HardwareProfile
Hardware characteristics for SLA-to-budget conversion
QueryBudget
Per-query budget limits derived from SLA targets
QueryClassRegistry
Registry of query classes with their budgets

Enums§

BudgetExhaustionReason
Reason why budget was exhausted