Module search_plan


Quantization-aware Search Plan

This module provides a formal runtime plan for vector search that separates policy (what to optimize for) from mechanism (how to execute).

Architecture

SearchRequest + SLA → Planner → SearchPlan → Executor → Results
                         ↑
                   Cost Model + Statistics
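The flow in the diagram can be illustrated with a minimal sketch. Everything here is hypothetical: the `Toy*` types, their fields, and the `plan`/`execute` functions are illustrative stand-ins, not this module's API, and the SLA constraints (a recall floor and a latency budget) are assumptions chosen for the example. The planner picks the candidate with the lowest expected latency that satisfies the SLA; the executor then runs the stages of the chosen plan without consulting the policy again.

```rust
/// Policy: what the caller wants (hypothetical fields, assumed for this sketch).
#[derive(Debug, Clone, Copy)]
pub struct ToySla {
    pub min_recall: f64,
    pub max_latency_us: f64,
}

/// Mechanism: one step of the pipeline (hypothetical stages).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToyStage {
    /// Score all remaining vectors with a cheap, coarsely rounded distance.
    CoarseScan { keep: usize },
    /// Re-score the survivors with the exact squared L2 distance.
    ExactRerank { keep: usize },
}

/// A candidate plan with cost-model estimates attached.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyPlan {
    pub stages: Vec<ToyStage>,
    pub expected_latency_us: f64,
    pub expected_recall: f64,
}

/// Planner: the lowest-latency candidate that meets the SLA, if any.
pub fn plan(sla: ToySla, candidates: &[ToyPlan]) -> Option<ToyPlan> {
    candidates
        .iter()
        .filter(|p| p.expected_recall >= sla.min_recall)
        .filter(|p| p.expected_latency_us <= sla.max_latency_us)
        .min_by(|a, b| a.expected_latency_us.total_cmp(&b.expected_latency_us))
        .cloned()
}

/// Squared L2 distance; the inexact variant rounds coordinates first,
/// standing in for a quantized representation.
fn dist(q: &[f32], v: &[f32], exact: bool) -> f32 {
    q.iter()
        .zip(v)
        .map(|(a, b)| {
            let (a, b) = if exact { (*a, *b) } else { (a.round(), b.round()) };
            (a - b) * (a - b)
        })
        .sum()
}

/// Executor: runs each stage in order and returns the surviving row ids.
pub fn execute(plan: &ToyPlan, query: &[f32], data: &[Vec<f32>]) -> Vec<usize> {
    let mut ids: Vec<usize> = (0..data.len()).collect();
    for stage in &plan.stages {
        let (keep, exact) = match *stage {
            ToyStage::CoarseScan { keep } => (keep, false),
            ToyStage::ExactRerank { keep } => (keep, true),
        };
        // Stable sort: ties keep their previous order.
        ids.sort_by(|&a, &b| {
            dist(query, &data[a], exact).total_cmp(&dist(query, &data[b], exact))
        });
        ids.truncate(keep);
    }
    ids
}
```

Because the SLA is only consulted in `plan`, tightening or relaxing the policy changes which plan is selected but never how a given plan is executed.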

Policy (what to optimize):

Mechanism (how to execute):

The planner uses measured per-stage costs:

Minimize expected latency subject to:

The planner adapts its choices with a bandit-like strategy driven by recent query statistics.
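One way to picture bandit-like adaptation is an epsilon-greedy selector over candidate plans. The source does not say which bandit strategy the planner uses, so this is only a sketch under assumptions: `ToyBandit`, its fields, and the `epsilon` and `alpha` parameters are all hypothetical. Each arm is a candidate plan, the observed cost is query latency, and an exponential moving average weights recent queries more heavily than old ones. The caller supplies the random sample `u`, which keeps the sketch free of external crates.

```rust
/// Epsilon-greedy selector over candidate plans (hypothetical sketch,
/// not this module's implementation).
pub struct ToyBandit {
    /// Probability of exploring a non-greedy arm.
    pub epsilon: f64,
    /// Smoothing factor for the moving average; higher favors recent queries.
    pub alpha: f64,
    /// Number of observations per arm.
    pub pulls: Vec<u64>,
    /// Exponential moving average of observed latency per arm, in microseconds.
    pub ema_latency_us: Vec<f64>,
}

impl ToyBandit {
    pub fn new(arms: usize, epsilon: f64, alpha: f64) -> Self {
        assert!(arms > 0, "need at least one candidate plan");
        Self {
            epsilon,
            alpha,
            pulls: vec![0; arms],
            ema_latency_us: vec![0.0; arms],
        }
    }

    /// Pick an arm. `u` is a uniform sample in [0, 1) supplied by the caller.
    pub fn select(&self, u: f64) -> usize {
        // Try every arm once before trusting the averages.
        if let Some(i) = self.pulls.iter().position(|&n| n == 0) {
            return i;
        }
        if u < self.epsilon {
            // Explore: map u in [0, epsilon) onto an arm index.
            let i = (u / self.epsilon * self.pulls.len() as f64) as usize;
            return i.min(self.pulls.len() - 1);
        }
        // Exploit: the arm with the lowest recent latency.
        (0..self.pulls.len())
            .min_by(|&a, &b| self.ema_latency_us[a].total_cmp(&self.ema_latency_us[b]))
            .expect("at least one arm")
    }

    /// Fold one observed latency into the arm's moving average.
    pub fn record(&mut self, arm: usize, latency_us: f64) {
        if self.pulls[arm] == 0 {
            self.ema_latency_us[arm] = latency_us;
        } else {
            self.ema_latency_us[arm] += self.alpha * (latency_us - self.ema_latency_us[arm]);
        }
        self.pulls[arm] += 1;
    }
}
```

In use, the caller would call `select` before each query, run the corresponding plan, and pass the measured latency to `record`. A plan that slows down (for example after the data distribution shifts) loses its lead within a few queries because the moving average discounts older measurements.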

CostModel: Cost model parameters (calibrated per hardware).
DatasetStats: Statistics about the dataset for planning.
PipelineStage: A single stage in the search pipeline.
PlanExecutor: Plan executor that runs a search plan.
SearchPlan: The search plan: a complete specification for executing a search.
SearchPlanner: Search planner that generates optimal plans.
SearchSLA: Service Level Agreement for search.
StageCosts: Per-stage cost measurements.