agent_sdk_eval::report

Struct EvaluationReport

pub struct EvaluationReport {
    pub evaluation_id: EvaluationId,
    pub scope: EvaluationScope,
    pub comparison: ComparisonDesign,
    pub verdict: EvaluationVerdict,
    pub score: Option<String>,
    pub confidence: EvaluationConfidence,
    pub judgments: Vec<EvaluatorJudgment>,
    pub metric_deltas: Vec<EvaluationMetricDelta>,
    pub evidence_refs: Vec<EntityRef>,
    pub usage: EvaluationUsage,
    pub redacted_summary: String,
    pub limitations: Vec<String>,
}

Top-level report returned by an evaluator.

Fields§

§evaluation_id: EvaluationId

Stable evaluation id.

§scope: EvaluationScope

Scope this report evaluates.

§comparison: ComparisonDesign

Comparison design actually used.

§verdict: EvaluationVerdict

Top-level verdict.

§score: Option<String>

Optional top-level score.

§confidence: EvaluationConfidence

Top-level confidence.

§judgments: Vec<EvaluatorJudgment>

Per-subject or per-criterion judgments.

§metric_deltas: Vec<EvaluationMetricDelta>

Metric deltas for measured evaluations.

§evidence_refs: Vec<EntityRef>

Evidence refs used by this report.

§usage: EvaluationUsage

Usage captured during evaluation.

§redacted_summary: String

Bounded report summary.

§limitations: Vec<String>

Limitations or validation notes.

Implementations§

impl EvaluationReport

pub fn new(
    evaluation_id: EvaluationId,
    scope: EvaluationScope,
    comparison: ComparisonDesign,
    verdict: EvaluationVerdict,
    confidence: EvaluationConfidence,
    redacted_summary: impl Into<String>,
) -> Self

Creates a report with no metric deltas.

pub fn with_usage(self, usage: EvaluationUsage) -> Self

Returns this report with usage attached.

pub fn with_judgment(self, judgment: EvaluatorJudgment) -> Self

Returns this report with one judgment appended.

pub fn with_metric_delta(self, metric_delta: EvaluationMetricDelta) -> Self

Returns this report with one metric delta appended.
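
The `with_*` methods above take `self` by value and return `Self`, so they can be chained after `new`. The following is a minimal sketch of that pattern only. `Report`, `Usage`, `Judgment`, `MetricDelta`, and all of their fields are hypothetical stand-ins, since this page does not show the definitions of `EvaluationUsage`, `EvaluatorJudgment`, or `EvaluationMetricDelta`.

```rust
// Hypothetical stand-ins for the crate's types. The real definitions live in
// agent_sdk_eval and are not shown on this page.
#[derive(Clone, Debug, Default, PartialEq, Eq)]
struct Usage {
    evaluator_calls: u32, // assumed field, for illustration only
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct Judgment {
    criterion: String, // assumed field
    passed: bool,      // assumed field
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct MetricDelta {
    metric: String, // assumed field
    delta: String,  // text rather than f64 so the stand-in can derive Eq
}

#[derive(Clone, Debug, PartialEq, Eq)]
struct Report {
    usage: Usage,
    judgments: Vec<Judgment>,
    metric_deltas: Vec<MetricDelta>,
    redacted_summary: String,
}

impl Report {
    /// Starts with empty collections, so the new report has no metric deltas.
    fn new(redacted_summary: impl Into<String>) -> Self {
        Self {
            usage: Usage::default(),
            judgments: Vec::new(),
            metric_deltas: Vec::new(),
            redacted_summary: redacted_summary.into(),
        }
    }

    /// Consumes the report and returns it with `usage` replaced.
    fn with_usage(mut self, usage: Usage) -> Self {
        self.usage = usage;
        self
    }

    /// Consumes the report and returns it with one judgment appended.
    fn with_judgment(mut self, judgment: Judgment) -> Self {
        self.judgments.push(judgment);
        self
    }

    /// Consumes the report and returns it with one metric delta appended.
    fn with_metric_delta(mut self, metric_delta: MetricDelta) -> Self {
        self.metric_deltas.push(metric_delta);
        self
    }
}

/// Builds a stand-in report by chaining the `with_*` methods.
fn example_report() -> Report {
    Report::new("candidate beat baseline on latency")
        .with_usage(Usage { evaluator_calls: 3 })
        .with_judgment(Judgment {
            criterion: "latency".to_string(),
            passed: true,
        })
        .with_metric_delta(MetricDelta {
            metric: "p95_ms".to_string(),
            delta: "-12.5".to_string(),
        })
}
```

Because each method consumes `self`, the intermediate values are moved rather than cloned. Calling `with_judgment` or `with_metric_delta` repeatedly appends one entry per call.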

pub fn validate_confidence_contract(&self) -> Result<(), AgentError>

Validates that measured confidence is backed by comparison evidence and metric deltas.
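
One possible reading of this contract is sketched below, under stated assumptions: the check passes trivially unless confidence is "measured", and otherwise requires both comparison evidence and at least one metric delta. The `Confidence` variants, the `has_comparison_evidence` flag, `SketchReport`, and `ContractError` (standing in for `AgentError`) are all hypothetical. The crate's actual rules and error values are not documented on this page.

```rust
// Hypothetical stand-in for EvaluationConfidence. The variant names are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Confidence {
    Measured,
    Estimated,
}

// Hypothetical stand-in for AgentError.
#[derive(Debug, PartialEq, Eq)]
struct ContractError(String);

// Hypothetical, reduced report holding only what the check needs.
struct SketchReport {
    confidence: Confidence,
    has_comparison_evidence: bool, // assumed summary of the comparison evidence
    metric_deltas: Vec<String>,    // assumed: one entry per measured metric
}

impl SketchReport {
    /// Rejects measured confidence that lacks comparison evidence or metric deltas.
    fn validate_confidence_contract(&self) -> Result<(), ContractError> {
        // Non-measured confidence makes no quantitative claim, so nothing to check.
        if self.confidence != Confidence::Measured {
            return Ok(());
        }
        if !self.has_comparison_evidence {
            return Err(ContractError(
                "measured confidence requires comparison evidence".into(),
            ));
        }
        if self.metric_deltas.is_empty() {
            return Err(ContractError(
                "measured confidence requires at least one metric delta".into(),
            ));
        }
        Ok(())
    }
}
```

Returning `Result<(), _>` matches the documented signature and lets callers propagate a failed check with `?` before publishing a report.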

pub fn validate_confidence_contract_for_request(
    &self,
    request: &EvaluationRequest,
) -> Result<(), AgentError>

Validates measured confidence against request-owned metric deltas.

Trait Implementations§

impl Clone for EvaluationReport

fn clone(&self) -> EvaluationReport

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for EvaluationReport

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for EvaluationReport

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Eq for EvaluationReport

impl PartialEq for EvaluationReport

fn eq(&self, other: &EvaluationReport) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl Serialize for EvaluationReport

fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where S: Serializer,

Serialize this value into the given Serde serializer.

impl StructuralPartialEq for EvaluationReport

Auto Trait Implementations§

impl Freeze for EvaluationReport

impl RefUnwindSafe for EvaluationReport

impl Send for EvaluationReport

impl Sync for EvaluationReport

impl Unpin for EvaluationReport

impl UnsafeUnpin for EvaluationReport

impl UnwindSafe for EvaluationReport

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.