Crate ferrum_bench_core

ferrum-bench-core — canonical schema, metric aggregation, and variance reporting for ferrum’s bench and bench-serve commands.

Locked by docs/bench/PLAYBOOK.md § 7. Do not invent variants; producers and consumers (bench, bench-serve, compare-commits, visualizer, dashboards) all build against the types here.

Quick map

  • BenchReport — top-level: one bench cell, aggregated across n_repeats
  • Scenario — closed-loop / open-loop / shared-prefix / cli
  • MetricSet — p50/p75/p95/p99 of one latency metric
  • ScalarStats{mean, stddev, ci95_hw} (stats module)
  • Env + EnvHash — apples-to-apples cell identity (env module)
  • ProfileEvent — locked structured profile JSONL envelope (profile module)
  • compute_metrics — the one aggregator both bench CLIs call
  • arrivals module — Poisson inter-arrival times for open-loop
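
To make the open-loop arrival model concrete, here is a minimal sketch of Poisson arrival-time generation using only the standard library. It is not the arrivals module's actual API: the names SplitMix64 and poisson_arrival_times, the seed parameter, and the choice of PRNG are all assumptions made for illustration. The only idea taken from the text above is that gaps between arrivals in a Poisson process are exponentially distributed.

```rust
/// Tiny deterministic PRNG (SplitMix64) so the sketch needs no external crates.
/// Hypothetical helper, not part of ferrum_bench_core.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in [0, 1), built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Absolute arrival times in seconds for `n` requests at `rate_per_sec`.
/// Each gap is exponential with mean 1 / rate, sampled by inverse CDF.
/// Hypothetical signature; the real module may differ.
fn poisson_arrival_times(rate_per_sec: f64, n: usize, seed: u64) -> Vec<f64> {
    assert!(rate_per_sec > 0.0, "rate must be positive");
    let mut rng = SplitMix64(seed);
    let mut t = 0.0_f64;
    (0..n)
        .map(|_| {
            let u = rng.next_f64(); // u in [0, 1)
            // 1 - u lies in (0, 1], so the logarithm is finite and <= 0.
            t += -(1.0 - u).ln() / rate_per_sec;
            t
        })
        .collect()
}
```

A fixed seed yields the same schedule on every run, which is one way to keep open-loop cells reproducible across repeats.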

Determinism notes

  • JSON keys are emitted in struct field-declaration order; field order is part of the locked schema and should not change.
  • BTreeMap (not HashMap) for any dynamic key-value bag.
  • CI95 fields are suppressed when n_repeats < 3 (degenerate).
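
The CI95 rule above can be illustrated with a small sketch. The type ScalarStatsSketch and the functions t_975 and scalar_stats are hypothetical stand-ins, not the crate's ScalarStats or student_t_975; representing a suppressed field as Option::None is likewise an assumption. The critical values are the standard two-sided 95% Student-t table entries.

```rust
/// Hypothetical mirror of the `ScalarStats { mean, stddev, ci95_hw }` shape.
#[derive(Debug, PartialEq)]
struct ScalarStatsSketch {
    mean: f64,
    stddev: f64,
    /// Half-width of the 95% confidence interval; None when suppressed.
    ci95_hw: Option<f64>,
}

/// Student-t 0.975 quantiles for 1..=10 degrees of freedom.
const T_975: [f64; 10] = [
    12.706, 4.303, 3.182, 2.776, 2.571, 2.447, 2.365, 2.306, 2.262, 2.228,
];

fn t_975(df: usize) -> f64 {
    match df {
        0 => f64::NAN,
        1..=10 => T_975[df - 1],
        // Assumption: fall back to the normal quantile beyond the table.
        // This is slightly optimistic for moderate df.
        _ => 1.96,
    }
}

/// Aggregate one value per run into mean, sample stddev, and CI95 half-width.
fn scalar_stats(samples: &[f64]) -> Option<ScalarStatsSketch> {
    let n = samples.len();
    if n == 0 {
        return None;
    }
    let mean = samples.iter().sum::<f64>() / n as f64;
    // Sample standard deviation (n - 1 denominator); 0.0 for a single run.
    let stddev = if n > 1 {
        let ss: f64 = samples.iter().map(|x| (x - mean).powi(2)).sum();
        (ss / (n - 1) as f64).sqrt()
    } else {
        0.0
    };
    // Suppress the interval for fewer than 3 runs, as the note above states.
    let ci95_hw = if n >= 3 {
        Some(t_975(n - 1) * stddev / (n as f64).sqrt())
    } else {
        None
    };
    Some(ScalarStatsSketch { mean, stddev, ci95_hw })
}
```

With two runs there is only one degree of freedom and the critical value is 12.706, so the interval would be too wide to be informative; that is a plausible reason for the n_repeats < 3 cutoff, though the source does not state its rationale.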

Re-exports

pub use env::Env;
pub use env::EnvHash;
pub use profile::configure_global_profile;
pub use profile::flush_global_profile;
pub use profile::global_profile;
pub use profile::parse_profile_event_value;
pub use profile::parse_profile_jsonl_str;
pub use profile::profile_fields_from_json;
pub use profile::ProfileEvent;
pub use profile::ProfileJsonlWriter;
pub use profile::ProfileMetadata;
pub use profile::ProfileSinkConfig;
pub use stats::ci95_half_width;
pub use stats::percentile;
pub use stats::student_t_975;
pub use stats::PercentileStats;
pub use stats::ScalarStats;

Modules

arrivals
Poisson arrival-time generation for open-loop benchmarking.
env
Bench environment snapshot — hardware + software + config — and the SHA-256 env_hash used by compare-commits.sh and similar to filter “apples-to-apples” cells.
profile
Structured profile event schema shared by benchmark runners and consumers.
report
Markdown report generation for bench cells (PLAYBOOK § 2.4 + § 7).
stats
Statistical aggregates used in BenchReport: percentile (linear interpolation), ScalarStats (mean / stddev / CI95), and Student-t critical values for small-sample CI.
trace
Chrome Trace Event JSON emission — PLAYBOOK § Phase 1.5.
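
The stats entry above mentions percentiles computed by linear interpolation. A minimal sketch of one common convention follows: rank = p / 100 × (n − 1), interpolating between the two neighbouring sorted samples. The function name percentile_linear and its signature are hypothetical, and the crate's own percentile function may use a different rank convention.

```rust
/// Percentile `p` (0..=100) of `samples` by linear interpolation between
/// the two nearest order statistics. Returns None for empty input or an
/// out-of-range (or NaN) `p`. Hypothetical helper, not the crate's API.
fn percentile_linear(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    // total_cmp gives a total order on f64, so sorting cannot panic on NaN.
    sorted.sort_by(|a, b| a.total_cmp(b));

    // Fractional rank in [0, n - 1].
    let rank = p / 100.0 * (sorted.len() - 1) as f64;
    let lo = rank.floor() as usize;
    let hi = rank.ceil() as usize;
    let frac = rank - lo as f64;

    // When rank is an integer, lo == hi and the sample is returned as is.
    Some(sorted[lo] + (sorted[hi] - sorted[lo]) * frac)
}
```

For the samples 1, 2, 3, 4 this convention gives 2.5 at p50, and the minimum and maximum at p0 and p100.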

Structs

BenchReport
One bench cell — n_repeats independent runs aggregated.
MetricSet
Four percentile points for a single latency metric. Each point is a ScalarStats aggregate across n_repeats runs.
QualityIssueCounts
RequestRecord
One request’s measurements (input to compute_metrics).
RunRecord
One independent run of the bench workload.
Slo
SLO thresholds applied when computing goodput. All in milliseconds.
TokenLengthStats
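
As a rough illustration of how millisecond SLO thresholds can feed a goodput figure, the sketch below counts requests that succeeded and met every threshold, then divides by wall-clock time. Everything here is assumed: the struct and field names (SloSketch, RequestSketch, ttft_ms, e2e_ms, ok), the choice of thresholds, and the definition of goodput as good requests per second. The real Slo and RequestRecord fields are not listed on this page.

```rust
/// Hypothetical SLO thresholds, all in milliseconds.
struct SloSketch {
    ttft_ms: f64, // time to first token
    e2e_ms: f64,  // end-to-end latency
}

/// Hypothetical per-request measurements.
struct RequestSketch {
    ttft_ms: f64,
    e2e_ms: f64,
    ok: bool, // request completed without error
}

/// Goodput as "requests meeting every SLO threshold, per second".
/// This is one common definition, assumed here for illustration.
fn goodput_rps(requests: &[RequestSketch], slo: &SloSketch, wall_secs: f64) -> f64 {
    assert!(wall_secs > 0.0, "wall-clock duration must be positive");
    let good = requests
        .iter()
        .filter(|r| r.ok && r.ttft_ms <= slo.ttft_ms && r.e2e_ms <= slo.e2e_ms)
        .count();
    good as f64 / wall_secs
}
```

A request that fails or misses any single threshold contributes to throughput but not to goodput, which is why the two numbers diverge under overload.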

Enums

OutputTokenCountSource
Scenario
Locked enum of bench scenarios — see docs/bench/PLAYBOOK.md § 2.

Functions

compute_metrics
Aggregate n_repeats independent runs into one BenchReport.