Crate tt_plan_core

Plan replay engine. See docs/03-plan-replay-design.md.

Public surface:

Hard invariants enforced by tests/replay.rs and tests/bootstrap.rs:

  1. Determinism: same (historical_rows, proposed_config, seed) → bit-identical JSON output.
  2. Bootstrap CI coverage: on synthetic universes with a known true value, that value lands inside the 95% CI about 95% of the time.
  3. Conservative cost math: missing pricing → request counted as unchanged (never fabricate savings).
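
Invariant 3 can be illustrated with a minimal sketch. The names below (`Price`, `projected_savings`, the per-million-token rate) are hypothetical stand-ins chosen for illustration, not the crate's actual `ModelPricing` or cost API. The sketch only demonstrates the rule itself: when pricing is missing for either model, the request contributes zero savings.

```rust
use std::collections::HashMap;

/// Hypothetical per-model price in USD per one million input tokens.
/// Illustrative only; this is not the crate's `ModelPricing`.
#[derive(Clone, Copy)]
struct Price {
    usd_per_million_input: f64,
}

/// Projected savings for one request moved from `from_model` to `to_model`.
/// If either price is missing, the request is counted as unchanged (0.0),
/// so the projection can understate savings but never invent them.
fn projected_savings(
    table: &HashMap<&str, Price>,
    from_model: &str,
    to_model: &str,
    input_tokens: u64,
) -> f64 {
    match (table.get(from_model), table.get(to_model)) {
        (Some(from), Some(to)) => {
            let millions = input_tokens as f64 / 1_000_000.0;
            (from.usd_per_million_input - to.usd_per_million_input) * millions
        }
        // Missing pricing on either side: treat the request as unchanged.
        _ => 0.0,
    }
}

fn main() {
    let mut table = HashMap::new();
    table.insert("big", Price { usd_per_million_input: 10.0 });
    table.insert("small", Price { usd_per_million_input: 2.0 });

    let known = projected_savings(&table, "big", "small", 500_000);
    let unknown = projected_savings(&table, "big", "unpriced", 500_000);
    println!("known = {known}, unknown = {unknown}");
}
```

Returning 0.0 rather than an error or an estimate keeps the aggregate a lower bound on savings, which is the conservative direction for a planning tool.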

Re-exports

pub use apply::apply_plan;
pub use apply::ApplyError;
pub use apply::InMemoryPlanStore;
pub use apply::PlanStore;
pub use error::PlanError;
pub use pricing::catalog_pricing_table;
pub use pricing::catalog_pricing_table_at;
pub use quality::score_quality;
pub use quality::stratified_sample;
pub use quality::JudgeProvider;
pub use quality::JudgeVerdict;
pub use quality::MockJudge;
pub use quality::QualityConfig;
pub use quality::QualityError;
pub use quality::QualityResult;
pub use quality::RiskBand;
pub use quality::SampleScore;
pub use replay::replay;
pub use replay::replay_with_quality;
pub use types::Aggregates;
pub use types::CacheProjection;
pub use types::ConfidenceIntervals;
pub use types::L2Projection;
pub use types::L2SweepResult;
pub use types::ModelPricing;
pub use types::PerRouteBreakdown;
pub use types::PlanConfig;
pub use types::PlanInput;
pub use types::PlanResult;
pub use types::PricingTable;
pub use types::ProposedRoute;
pub use types::RequestLog;
pub use types::RouteAction;
pub use types::RouteConditions;

Modules

apply
Plan apply path — mark a Plan as applied + emit a plan.applied audit row.
bootstrap
Deterministic bootstrap confidence intervals. Uses a seeded ChaCha8 generator so the same (samples, seed, iterations) triple always produces the same (lo, hi). ChaCha8 belongs to the same RNG family that rand itself uses for StdRng on most platforms; it is pinned explicitly here so that a future change to rand’s default generator cannot silently break determinism.
cache_projection
L1 cache hit projection. v1 ships exact-match lookup keyed on (provider, model, input_tokens, tag), a coarse stand-in for the “normalized request shape” hash described in docs/03-plan-replay-design.md §6.1.
cost
Cost projection for a single replayed request. The math is deliberately minimal so the determinism contract is easy to audit: same inputs in, same f64 out.
error
Errors surfaced by the Plan replay engine.
l2_projection
L2 (semantic) cache projection. Mirrors the contract of the production tt_cache::l2::L2Cache: a cosine-similarity search over not-yet-expired entries that share the same (provider, model) key. Unlike the production cache, it operates on a historical request stream, so the Plan can answer: “what hit rate would the proposed (threshold, ttl) config have yielded?”
pricing
Build a replay PricingTable from the shared versioned catalog (tt_shared::pricing).
quality
Tier 3 LLM-judge quality scoring for Plan projections.
replay
The deterministic replay loop. No async, no I/O, no clock reads — pure function of the input. Determinism is the contract enforced by tests/replay.rs.
routing
Route matching. Given a request and a slice of proposed routes (sorted by priority descending), return the first enabled route whose conditions all match. Pure function — no state, no side effects.
types
Plan input + output types. All Serde-serializable so the determinism contract (same input -> bit-identical JSON output) can be asserted in tests and snapshotted via insta.
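
The rule stated for the routing module (first enabled route, in descending priority order, whose conditions all match) can be sketched as follows. The `Route` and `Request` structs and their fields are hypothetical simplifications, not the crate's `ProposedRoute`, `RouteConditions`, or `RequestLog`; the two conditions shown (model equality and a minimum input-token count) are assumed purely for illustration.

```rust
/// Hypothetical, simplified route. Not the crate's `ProposedRoute`.
struct Route {
    name: &'static str,
    priority: i32,
    enabled: bool,
    /// `None` means "any model" (an assumed condition shape).
    model: Option<&'static str>,
    /// `None` means no lower bound on input tokens (also assumed).
    min_input_tokens: Option<u64>,
}

/// Hypothetical, simplified request record.
struct Request {
    model: &'static str,
    input_tokens: u64,
}

/// Every present condition must hold; an absent condition always holds.
fn conditions_match(route: &Route, req: &Request) -> bool {
    route.model.map_or(true, |m| m == req.model)
        && route.min_input_tokens.map_or(true, |min| req.input_tokens >= min)
}

/// Returns the first enabled route whose conditions all match.
/// `routes` must already be sorted by priority, highest first.
fn match_route<'a>(routes: &'a [Route], req: &Request) -> Option<&'a Route> {
    routes.iter().find(|r| r.enabled && conditions_match(r, req))
}

fn main() {
    let mut routes = vec![
        Route { name: "catch-all", priority: 10, enabled: true, model: None, min_input_tokens: None },
        Route { name: "disabled-top", priority: 30, enabled: false, model: None, min_input_tokens: None },
        Route { name: "long-gpt", priority: 20, enabled: true, model: Some("gpt"), min_input_tokens: Some(1000) },
    ];
    // Sort once, highest priority first, before any matching.
    routes.sort_by(|a, b| b.priority.cmp(&a.priority));

    let req = Request { model: "gpt", input_tokens: 5000 };
    match match_route(&routes, &req) {
        Some(route) => println!("matched {}", route.name),
        None => println!("no route matched"),
    }
}
```

Because the function takes only a slice and a request and returns a borrowed route, it holds no state and has no side effects, which keeps it compatible with the determinism contract described for the replay loop.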