Module diff


Nine-axis behavioral differ, bootstrap CI, and report renderers.

See README.md “The nine axes” for the list and SPEC §Replay for what “diverges” means in this context.

Usage:

let pricing = diff::cost::Pricing::new();
let report = diff::compute_report(&baseline, &candidate, &pricing, Some(42));
println!("{}", report.to_terminal());

Re-exports

pub use alignment::DivergenceKind;
pub use alignment::FirstDivergence;
pub use axes::Axis;
pub use axes::AxisStat;
pub use axes::Severity;
pub use bootstrap::paired_ci;
pub use bootstrap::CiResult;
pub use drill_down::PairAxisScore;
pub use drill_down::PairDrilldown;
pub use recommendations::ActionKind;
pub use recommendations::Recommendation;
pub use recommendations::RecommendationSeverity;
pub use report::DiffReport;

Modules

alignment: First-divergence detection over paired chat responses.
axes: Shared types for the nine-axis behavioral diff.
bootstrap: Bootstrap resampling for paired statistics.
conformance: Axis 9: schema / format conformance rate.
cost: Axis 6: cost (input+output tokens × per-model pricing).
drill_down: Per-pair drill-down: surfaces which specific turn in the paired trace set drove each aggregate axis regression.
embedder: Pluggable embedding backend for the semantic axis.
judge: Axis 8: LLM-judge (user-supplied rubric).
latency: Axis 5: end-to-end latency (SPEC §4.2 chat_response.latency_ms).
reasoning: Axis 7: reasoning depth — thinking tokens + self-correction markers.
recommendations: Prescriptive fix recommendations derived from a DiffReport.
report: Rendering of DiffReport to markdown and terminal.
safety: Axis 3: safety — the rate at which the model abstained from completing the user’s request.
semantic: Axis 1: final-output semantic similarity.
trajectory: Axis 2: tool-call trajectory divergence.
verbosity: Axis 4: verbosity (output-token count) from chat_response.usage.output_tokens.
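
The bootstrap module is described as resampling for paired statistics. The sketch below illustrates one common way to do that, a percentile bootstrap over per-pair deltas. It is not the crate's `paired_ci`: the function name `paired_bootstrap_ci`, its parameters, its return shape, and the small `XorShift64` generator are all hypothetical, chosen so the example needs no external crates and stays reproducible for a fixed seed.

```rust
/// Tiny deterministic PRNG (xorshift64*). Hypothetical helper, used only so
/// this sketch is reproducible without external crates.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // A zero state would stay zero forever, so substitute a nonzero constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Index in 0..n (slight modulo bias, acceptable for a sketch).
    fn index(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Percentile bootstrap CI for the mean of paired deltas (candidate - baseline).
/// Returns (mean_delta, lower, upper), or None when the inputs are empty,
/// have unequal lengths, or `iterations` is zero.
fn paired_bootstrap_ci(
    baseline: &[f64],
    candidate: &[f64],
    iterations: usize,
    alpha: f64,
    seed: u64,
) -> Option<(f64, f64, f64)> {
    if baseline.is_empty() || baseline.len() != candidate.len() || iterations == 0 {
        return None;
    }
    // Pairing is preserved: each delta comes from one (baseline, candidate) pair.
    let deltas: Vec<f64> = baseline.iter().zip(candidate).map(|(b, c)| c - b).collect();
    let n = deltas.len();
    let mean = deltas.iter().sum::<f64>() / n as f64;

    // Resample the deltas with replacement and record each resample's mean.
    let mut rng = XorShift64::new(seed);
    let mut means: Vec<f64> = (0..iterations)
        .map(|_| {
            let total: f64 = (0..n).map(|_| deltas[rng.index(n)]).sum();
            total / n as f64
        })
        .collect();
    means.sort_by(|a, b| a.total_cmp(b));

    // Read the alpha/2 and 1 - alpha/2 percentiles off the sorted means.
    let lo_idx = ((alpha / 2.0) * iterations as f64).floor() as usize;
    let hi_idx = (((1.0 - alpha / 2.0) * iterations as f64).ceil() as usize).saturating_sub(1);
    let last = iterations - 1;
    Some((mean, means[lo_idx.min(last)], means[hi_idx.min(last)]))
}
```

Resampling the deltas rather than the two samples independently keeps each baseline value tied to its candidate counterpart, which is what makes the statistic paired. Passing a fixed seed plays the same role as the `Some(42)` argument in the usage snippet above, assuming that argument is an RNG seed.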

Functions

compute_report: Compute a DiffReport from a baseline and candidate trace.
extract_response_pairs: Extract (baseline_response, candidate_response) pairs by pairing the i-th chat_response in baseline with the i-th in candidate.
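
The index-based pairing described for extract_response_pairs can be sketched as below. This is an illustration, not the crate's implementation: the `Record` type, its `kind` and `payload` fields, and the `pair_responses` function are hypothetical stand-ins. The sketch also assumes that surplus responses on the longer side are dropped, which the description above does not specify.

```rust
/// Hypothetical stand-in for a trace record; the real record type and its
/// fields are defined by the crate and SPEC, not by this sketch.
#[derive(Debug, Clone, PartialEq)]
struct Record {
    kind: String,
    payload: String,
}

impl Record {
    fn new(kind: &str, payload: &str) -> Self {
        Record { kind: kind.to_string(), payload: payload.to_string() }
    }
}

/// Iterate over only the `chat_response` records of a trace, in order.
fn chat_responses<'a>(trace: &'a [Record]) -> impl Iterator<Item = &'a Record> + 'a {
    trace.iter().filter(|r| r.kind == "chat_response")
}

/// Pair the i-th `chat_response` in `baseline` with the i-th in `candidate`.
/// Assumption: `zip` stops at the shorter side, so unmatched responses are dropped.
fn pair_responses<'a>(
    baseline: &'a [Record],
    candidate: &'a [Record],
) -> Vec<(&'a Record, &'a Record)> {
    chat_responses(baseline).zip(chat_responses(candidate)).collect()
}
```

Filtering before zipping means other record kinds interleaved in either trace do not shift the alignment: only the position among `chat_response` records matters.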
