Plan replay engine. See docs/03-plan-replay-design.md.
Public surface:
- replay::replay: the deterministic entry point.
- Types in types: PlanInput, PlanResult, etc.
- error::PlanError: every error this crate can return.
Hard invariants enforced by tests/replay.rs and tests/bootstrap.rs:
- Determinism: same (historical_rows, proposed_config, seed) → bit-identical JSON output.
- Bootstrap CI coverage: synthetic universes with known truth land inside the 95% CI ~95% of the time.
- Conservative cost math: missing pricing → request counted as unchanged (never fabricate savings).
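The conservative cost rule can be illustrated with a minimal sketch. It assumes a hypothetical Price struct and a plain HashMap lookup; the crate's real PricingTable and ModelPricing types are not shown on this page, so every name below is illustrative only.

```rust
use std::collections::HashMap;

/// Hypothetical per-model price in USD per one million tokens.
/// (Illustrative only; not the crate's ModelPricing type.)
#[derive(Clone, Copy, Debug, PartialEq)]
struct Price {
    input_per_mtok: f64,
    output_per_mtok: f64,
}

/// Cost of a single request under a given price.
fn request_cost(price: Price, input_tokens: u64, output_tokens: u64) -> f64 {
    (input_tokens as f64 * price.input_per_mtok
        + output_tokens as f64 * price.output_per_mtok)
        / 1_000_000.0
}

/// Projected savings from moving one request from `current` to `proposed`.
/// If either model has no price, the request is counted as unchanged and
/// contributes exactly 0.0, so missing data never shows up as savings.
fn projected_savings(
    prices: &HashMap<&str, Price>,
    current: &str,
    proposed: &str,
    input_tokens: u64,
    output_tokens: u64,
) -> f64 {
    match (prices.get(current), prices.get(proposed)) {
        (Some(&cur), Some(&new)) => {
            request_cost(cur, input_tokens, output_tokens)
                - request_cost(new, input_tokens, output_tokens)
        }
        // Missing pricing on either side: treat the request as unchanged.
        _ => 0.0,
    }
}
```

The fallback arm is the point of the sketch: an unknown model yields zero savings instead of a guessed price.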
Re-exports§
pub use apply::apply_plan;
pub use apply::ApplyError;
pub use apply::InMemoryPlanStore;
pub use apply::PlanStore;
pub use error::PlanError;
pub use pricing::catalog_pricing_table;
pub use pricing::catalog_pricing_table_at;
pub use quality::score_quality;
pub use quality::stratified_sample;
pub use quality::JudgeProvider;
pub use quality::JudgeVerdict;
pub use quality::MockJudge;
pub use quality::QualityConfig;
pub use quality::QualityError;
pub use quality::QualityResult;
pub use quality::RiskBand;
pub use quality::SampleScore;
pub use replay::replay;
pub use replay::replay_with_quality;
pub use types::Aggregates;
pub use types::CacheProjection;
pub use types::ConfidenceIntervals;
pub use types::L2Projection;
pub use types::L2SweepResult;
pub use types::ModelPricing;
pub use types::PerRouteBreakdown;
pub use types::PlanConfig;
pub use types::PlanInput;
pub use types::PlanResult;
pub use types::PricingTable;
pub use types::ProposedRoute;
pub use types::RequestLog;
pub use types::RouteAction;
pub use types::RouteConditions;
Modules§
- apply
  Plan apply path: mark a Plan as applied + emit a plan.applied audit row.
- bootstrap
  Deterministic bootstrap confidence intervals. Uses ChaCha8 (seeded) so the same (samples, seed, iterations) triple always produces the same (lo, hi). ChaCha8 is the same RNG family rand itself uses for StdRng on most platforms; the explicit choice here is so we don’t inherit a future rand default-change as a determinism break.
- cache_projection
  L1 cache hit projection. v1 ships exact-match keyed on (provider, model, input_tokens, tag), a coarse stand-in for the “normalized request shape” hash described in docs/03-plan-replay-design.md §6.1.
- cost
  Cost projection for a single replayed request. The math is deliberately minimal so the determinism contract is easy to audit: same inputs in, same f64 out.
- error
  Errors surfaced by the Plan replay engine.
- l2_projection
  L2 (semantic) cache projection. Mirrors the contract of the production tt_cache::l2::L2Cache, a cosine-similarity search over not-yet-expired entries that share the same (provider, model) key, but operates on a historical request stream so the Plan can answer: “what hit rate would the proposed (threshold, ttl) config have yielded?”
- pricing
  Build a replay PricingTable from the shared versioned catalog (tt_shared::pricing).
- quality
  Tier 3 LLM-judge quality scoring for Plan projections.
- replay
  The deterministic replay loop. No async, no I/O, no clock reads: a pure function of the input. Determinism is the contract enforced by tests/replay.rs.
- routing
  Route matching. Given a request and a slice of proposed routes (sorted by priority descending), return the first enabled route whose conditions all match. Pure function: no state, no side effects.
- types
  Plan input + output types. All Serde-serializable so the determinism contract (same input -> bit-identical JSON output) can be asserted in tests and snapshotted via insta.
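The seeded bootstrap described under the bootstrap module can be sketched roughly as follows. To stay std-only, this sketch swaps in a small SplitMix64 generator where the crate is described as using ChaCha8. The percentile method, the choice of the mean as the statistic, and the names SplitMix64 and bootstrap_ci are all assumptions made for illustration.

```rust
/// SplitMix64: a tiny seeded PRNG, used here only to keep the sketch
/// std-only. The crate itself is described as using ChaCha8.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }
}

/// Percentile bootstrap 95% CI for the mean of `samples` (hypothetical
/// helper). The same (samples, seed, iterations) always yields the same
/// (lo, hi) because the only source of randomness is the seeded generator.
fn bootstrap_ci(samples: &[f64], seed: u64, iterations: usize) -> Option<(f64, f64)> {
    if samples.is_empty() || iterations == 0 {
        return None;
    }
    let n = samples.len();
    let mut rng = SplitMix64(seed);
    let mut means = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        // Resample n values with replacement and record their mean.
        // (Modulo reduction has a slight bias; acceptable for a sketch.)
        let mut sum = 0.0;
        for _ in 0..n {
            let i = (rng.next_u64() % n as u64) as usize;
            sum += samples[i];
        }
        means.push(sum / n as f64);
    }
    // total_cmp gives a total order on f64, so the sort is deterministic.
    means.sort_by(|a, b| a.total_cmp(b));
    let at = |q: f64| means[(q * (iterations - 1) as f64).round() as usize];
    Some((at(0.025), at(0.975)))
}
```

Pinning the generator explicitly, as the module description argues, means a change in a library default cannot silently alter the interval.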
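The route-matching rule described under the routing module (first enabled route, in descending priority order, whose conditions all match) might look something like the sketch below. The Request and Route structs are simplified, hypothetical stand-ins; the crate's RequestLog, ProposedRoute and RouteConditions types are not shown on this page, and the two conditions used here are assumed for illustration.

```rust
/// Hypothetical, simplified request shape (not the crate's RequestLog).
struct Request<'a> {
    provider: &'a str,
    model: &'a str,
}

/// Hypothetical, simplified route (not the crate's ProposedRoute).
struct Route<'a> {
    name: &'a str,
    priority: i32,
    enabled: bool,
    // None means "no constraint on this field".
    provider: Option<&'a str>,
    model: Option<&'a str>,
}

impl Route<'_> {
    /// True when every condition that is set matches the request.
    fn matches(&self, req: &Request) -> bool {
        self.provider.map_or(true, |p| p == req.provider)
            && self.model.map_or(true, |m| m == req.model)
    }
}

/// Returns the first enabled route whose conditions all match.
/// `routes` must already be sorted by priority, highest first.
/// Pure function: no state, no side effects.
fn match_route<'r, 'a>(req: &Request, routes: &'r [Route<'a>]) -> Option<&'r Route<'a>> {
    debug_assert!(routes.windows(2).all(|w| w[0].priority >= w[1].priority));
    routes.iter().find(|r| r.enabled && r.matches(req))
}
```

Because the slice is pre-sorted, a single linear scan is enough, and a disabled high-priority route simply falls through to the next candidate.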