Module diff


Nine-axis behavioral differ, bootstrap CI, and report renderers.

See README.md “The nine axes” for the list and SPEC §Replay for what “diverges” means in this context.

Usage:

// `baseline` and `candidate` are traces obtained earlier; their construction is not shown here.
let pricing = diff::cost::Pricing::new();
let report = diff::compute_report(&baseline, &candidate, &pricing, Some(42));
println!("{}", report.to_terminal());

Re-exports

pub use alignment::DivergenceKind;
pub use alignment::FirstDivergence;
pub use axes::Axis;
pub use axes::AxisStat;
pub use axes::Severity;
pub use bootstrap::paired_ci;
pub use bootstrap::CiResult;
pub use drill_down::PairAxisScore;
pub use drill_down::PairDrilldown;
pub use recommendations::ActionKind;
pub use recommendations::Recommendation;
pub use recommendations::RecommendationSeverity;
pub use report::DiffReport;

Modules

alignment
    First-divergence detection over paired chat responses.
axes
    Shared types for the nine-axis behavioral diff.
bootstrap
    Bootstrap resampling for paired statistics.
conformance
    Axis 9: schema / format conformance rate.
cost
    Axis 6: cost (input+output tokens × per-model pricing).
drill_down
    Per-pair drill-down: surfaces which specific turn in the paired trace set drove each aggregate axis regression.
embedder
    Pluggable embedding backend for the semantic axis.
judge
    Axis 8: LLM-judge (user-supplied rubric).
latency
    Axis 5: end-to-end latency (SPEC §4.2 chat_response.latency_ms).
reasoning
    Axis 7: reasoning depth (thinking tokens + self-correction markers).
recommendations
    Prescriptive fix recommendations derived from a DiffReport.
report
    Rendering of DiffReport to markdown and terminal.
safety
    Axis 3: safety, measured as the rate at which the model abstained from completing the user’s request.
semantic
    Axis 1: final-output semantic similarity.
trajectory
    Axis 2: tool-call trajectory divergence.
verbosity
    Axis 4: verbosity (output-token count) from chat_response.usage.output_tokens.
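
The bootstrap module is described only as "bootstrap resampling for paired statistics", so the following is a minimal sketch of one common way to do that (a percentile bootstrap over per-pair deltas), not the crate's implementation. The names `paired_ci_sketch`, `CiSketch`, and `XorShift64` are hypothetical and chosen to avoid confusion with the re-exported `paired_ci` and `CiResult`, whose signatures this page does not show.

```rust
/// Tiny deterministic PRNG (xorshift64*), so the sketch needs no external crates.
/// Hypothetical helper; not part of the documented module.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift cannot leave the all-zero state, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.0 = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Index in 0..n (slight modulo bias is acceptable for a sketch).
    fn next_index(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

/// Hypothetical result type: mean of (candidate - baseline) and a 95% interval.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CiSketch {
    mean_delta: f64,
    low: f64,
    high: f64,
}

/// Percentile bootstrap over paired deltas. Returns None for empty or
/// mismatched inputs, or when no resamples are requested.
fn paired_ci_sketch(
    baseline: &[f64],
    candidate: &[f64],
    resamples: usize,
    seed: u64,
) -> Option<CiSketch> {
    let n = baseline.len();
    if n == 0 || n != candidate.len() || resamples == 0 {
        return None;
    }
    // Pairing is preserved by resampling the per-pair differences, not each side.
    let deltas: Vec<f64> = baseline.iter().zip(candidate).map(|(b, c)| c - b).collect();
    let mean_delta = deltas.iter().sum::<f64>() / n as f64;

    let mut rng = XorShift64::new(seed);
    let mut means = Vec::with_capacity(resamples);
    for _ in 0..resamples {
        // Draw n deltas with replacement and record their mean.
        let mut sum = 0.0;
        for _ in 0..n {
            sum += deltas[rng.next_index(n)];
        }
        means.push(sum / n as f64);
    }
    means.sort_by(|a, b| a.total_cmp(b));

    // Nearest-rank percentiles of the resampled means.
    let pick = |q: f64| means[((resamples - 1) as f64 * q).round() as usize];
    Some(CiSketch { mean_delta, low: pick(0.025), high: pick(0.975) })
}
```

Calling `paired_ci_sketch(&baseline_scores, &candidate_scores, 1000, 42)` twice with the same seed yields the same interval. That reproducibility may be the purpose of the `Some(42)` argument in the usage example above, though this page does not say so.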

Functions

compute_report
    Compute a DiffReport from a baseline and candidate trace.
extract_response_pairs
    Extract (baseline_response, candidate_response) pairs by pairing the i-th chat_response in baseline with the i-th in candidate.
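
The index-based pairing described for extract_response_pairs can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: the `Record` enum and `pair_responses` function are hypothetical stand-ins, since the real trace and record types are not shown on this page. The sketch drops unmatched trailing responses when the two traces contain different numbers of chat_response records; how the real function handles that case is not documented here.

```rust
/// Hypothetical, simplified trace record; the real record type is not shown here.
#[derive(Debug, Clone, PartialEq)]
enum Record {
    ChatRequest(String),
    ChatResponse(String),
}

/// Pair the i-th chat response in `baseline` with the i-th in `candidate`,
/// ignoring every other record kind.
fn pair_responses<'a>(
    baseline: &'a [Record],
    candidate: &'a [Record],
) -> Vec<(&'a str, &'a str)> {
    // Keep only the response payloads, in trace order.
    fn responses(trace: &[Record]) -> impl Iterator<Item = &str> {
        trace.iter().filter_map(|r| match r {
            Record::ChatResponse(text) => Some(text.as_str()),
            _ => None,
        })
    }
    // `zip` stops at the shorter side, so surplus responses are dropped (assumption).
    responses(baseline).zip(responses(candidate)).collect()
}
```

Pairing by position rather than by content keeps the comparison simple, but it assumes both traces replay the same sequence of requests; a response inserted or dropped on one side shifts every later pair.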