pub struct BenchmarkScore {
    pub name: String,
    pub score: f64,
    pub harness: Option<String>,
    pub source_url: Option<String>,
    pub measured_at: Option<String>,
}
A score on a public benchmark from a published source (model card,
paper, leaderboard). The schema is deliberately permissive — no enum
of benchmark names — so the catalog can carry whichever benchmarks
the upstream provider chose to publish, and new ones can be added
without a code change. Scores are stored on a 0.0–1.0 scale (e.g.
73.5% accuracy → 0.735) so they compare cleanly across benchmarks
and so routing_ext::apply_benchmark_priors can consume them
directly when wired in later.
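As a minimal sketch of the 0.0–1.0 convention described above, the snippet below converts a published percentage into a stored score. The struct is a local mirror of the documented fields so the example compiles on its own, and `score_from_percent` is a hypothetical helper, not part of the crate.

```rust
/// Local mirror of the documented struct, so this sketch is self-contained.
#[derive(Clone, Debug)]
pub struct BenchmarkScore {
    pub name: String,
    pub score: f64,
    pub harness: Option<String>,
    pub source_url: Option<String>,
    pub measured_at: Option<String>,
}

/// Hypothetical helper (an assumption, not the crate's API): converts a
/// published percentage such as 73.5 into the 0.0–1.0 scale.
/// Returns `None` for non-finite input or values outside 0–100.
pub fn score_from_percent(name: &str, percent: f64) -> Option<BenchmarkScore> {
    if !percent.is_finite() || !(0.0..=100.0).contains(&percent) {
        return None;
    }
    Some(BenchmarkScore {
        name: name.to_string(),
        score: percent / 100.0,
        harness: None,
        source_url: None,
        measured_at: None,
    })
}
```

Rejecting out-of-range input at construction time is one possible design choice; the struct itself does not enforce the range, so callers that build it directly remain responsible for normalizing.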
Fields
name: String
Benchmark name as published (e.g., “MMLU-Pro”, “GPQA-Diamond”, “SWE-bench-Verified”, “HumanEval”, “MATH”).
score: f64
Score on a 0.0–1.0 scale.
harness: Option<String>
Evaluation harness or setup label (e.g., “5-shot”, “0-shot CoT”, “agentic”, “pass@1”). Optional but strongly recommended: the same benchmark name can mean different things under different harnesses.
source_url: Option<String>
Where the score came from (model card URL, paper, leaderboard snapshot). Empty when the source is the upstream provider’s announcement and a stable URL is not yet known.
measured_at: Option<String>
ISO 8601 date of the score snapshot (e.g., “2025-08-12”). Lets downstream code judge how stale a number is.
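The field notes above suggest two common downstream checks: telling apart scores that share a benchmark name but differ in harness, and judging staleness from `measured_at`. The sketch below illustrates both under stated assumptions. `find_score` and `is_stale` are hypothetical helpers, the struct is a local mirror of the documented fields, and the staleness check assumes every date is a full `YYYY-MM-DD` string, for which lexicographic order matches chronological order.

```rust
/// Local mirror of the documented struct, so this sketch is self-contained.
#[derive(Clone, Debug)]
pub struct BenchmarkScore {
    pub name: String,
    pub score: f64,
    pub harness: Option<String>,
    pub source_url: Option<String>,
    pub measured_at: Option<String>,
}

/// Hypothetical helper: returns the first score whose name and harness
/// both match. Entries without a harness label never match.
pub fn find_score<'a>(
    scores: &'a [BenchmarkScore],
    name: &str,
    harness: &str,
) -> Option<&'a BenchmarkScore> {
    scores
        .iter()
        .find(|s| s.name == name && s.harness.as_deref() == Some(harness))
}

/// Hypothetical helper: true when the score was measured before `cutoff`.
/// Assumes both dates use the `YYYY-MM-DD` form, so plain string
/// comparison orders them chronologically. A missing date counts as stale.
pub fn is_stale(score: &BenchmarkScore, cutoff: &str) -> bool {
    match score.measured_at.as_deref() {
        Some(date) => date < cutoff,
        None => true,
    }
}
```

Treating a missing `measured_at` as stale is a conservative assumption; code that prefers to keep undated scores could return `false` in the `None` arm instead.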
Trait Implementations
impl Clone for BenchmarkScore
fn clone(&self) -> BenchmarkScore
fn clone_from(&mut self, source: &Self)
impl Debug for BenchmarkScore
impl<'de> Deserialize<'de> for BenchmarkScore
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations
impl Freeze for BenchmarkScore
impl RefUnwindSafe for BenchmarkScore
impl Send for BenchmarkScore
impl Sync for BenchmarkScore
impl Unpin for BenchmarkScore
impl UnsafeUnpin for BenchmarkScore
impl UnwindSafe for BenchmarkScore
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.