EvalRunner

swink_agent_eval

Struct EvalRunner

pub struct EvalRunner { /* private fields */ }

Orchestrates an evaluation: runs agents, captures their trajectories, and scores the results. Defaults: sequential execution, num_runs = 1, no cache, and no cancellation token.

Implementations

impl EvalRunner

pub fn new(registry: EvaluatorRegistry) -> Self

Create a runner with a custom evaluator registry.

pub fn with_defaults() -> Self

Create a runner pre-loaded with built-in evaluators.

pub fn with_parallelism(self, n: usize) -> Self

Set the maximum number of concurrent case executions (FR-036).

Panics

Panics if n == 0.

pub fn with_num_runs(self, n: u32) -> Self

Repeat judge-side scoring n times per case (FR-037 / Q2).

Panics

Panics if n == 0.
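
Because both with_parallelism and with_num_runs panic on zero, callers that read these values from user input or configuration may want to validate them first. The following is a minimal standard-library sketch of that idea. RunnerSettings and parse_positive are hypothetical names used only for illustration and are not part of swink_agent_eval; the default parallelism of 1 is an assumption inferred from the documented "sequential" default.

```rust
use std::num::NonZeroUsize;

/// Hypothetical stand-in for the two builder knobs; not the real EvalRunner.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RunnerSettings {
    parallelism: usize,
    num_runs: u32,
}

impl RunnerSettings {
    /// Mirrors the documented defaults: sequential (assumed to mean 1) and num_runs = 1.
    fn new() -> Self {
        Self { parallelism: 1, num_runs: 1 }
    }

    /// Panics if `n == 0`, like the documented builder method.
    fn with_parallelism(mut self, n: usize) -> Self {
        assert!(n > 0, "parallelism must be at least 1");
        self.parallelism = n;
        self
    }

    /// Panics if `n == 0`, like the documented builder method.
    fn with_num_runs(mut self, n: u32) -> Self {
        assert!(n > 0, "num_runs must be at least 1");
        self.num_runs = n;
        self
    }
}

/// Parses untrusted text into a non-zero count. Returns `None` for zero or
/// non-numeric input, so the caller can report an error instead of panicking.
fn parse_positive(input: &str) -> Option<usize> {
    input.trim().parse::<NonZeroUsize>().ok().map(NonZeroUsize::get)
}
```

With a helper like this, a zero or malformed value can be rejected at the configuration boundary, and the builder call only ever sees a value that satisfies its precondition.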

pub fn with_cache(self, store: Arc<dyn EvaluationDataStore>) -> Self

Attach a pluggable EvaluationDataStore for cached invocations (FR-038).

pub fn with_cancellation(self, token: CancellationToken) -> Self

Attach a CancellationToken honored at every await point (FR-040).
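
To illustrate what cooperative cancellation means for a caller, here is a rough standard-library sketch. It is deliberately coarser than the behavior described above: it checks a plain AtomicBool once before each case, whereas the runner is documented to honor its CancellationToken at every await point. RunOutcome and run_until_cancelled are hypothetical names, not part of swink_agent_eval.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical result type for the sketch.
#[derive(Debug, PartialEq)]
enum RunOutcome {
    Completed { cases_run: usize },
    Cancelled { cases_run: usize },
}

/// Runs `cases` in order, checking the cancellation flag before each one.
/// A case that has already started is allowed to finish in this sketch.
fn run_until_cancelled(
    cases: &[&str],
    cancel: &AtomicBool,
    mut run_case: impl FnMut(&str),
) -> RunOutcome {
    let mut cases_run = 0;
    for case in cases {
        if cancel.load(Ordering::SeqCst) {
            return RunOutcome::Cancelled { cases_run };
        }
        run_case(case);
        cases_run += 1;
    }
    RunOutcome::Completed { cases_run }
}
```

The point of the sketch is only that cancellation is observed by the running work itself rather than imposed from outside, so partial progress stays well defined.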

pub fn with_initial_session_file(self, path: PathBuf) -> Self

Load the given JSON file as an initial SessionState before each case (FR-039 / R-023). Missing or malformed files surface as EvalError::InvalidCase rather than causing a panic.
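
The error-instead-of-panic contract can be sketched with the standard library alone. Everything below is an illustrative assumption: SketchError stands in for the crate's EvalError, load_initial_session is a hypothetical helper, and the brace check is only a placeholder for real JSON parsing (which a production loader would do with a JSON library).

```rust
use std::fs;
use std::path::Path;

/// Hypothetical stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidCase(String),
}

/// Reads a session file and maps every failure to `InvalidCase`
/// instead of unwrapping and panicking.
fn load_initial_session(path: &Path) -> Result<String, SketchError> {
    // A missing or unreadable file becomes an error value.
    let text = fs::read_to_string(path).map_err(|e| {
        SketchError::InvalidCase(format!("cannot read {}: {}", path.display(), e))
    })?;

    // Placeholder shape check; a real loader would parse the JSON here.
    let trimmed = text.trim();
    if !(trimmed.starts_with('{') && trimmed.ends_with('}')) {
        return Err(SketchError::InvalidCase(format!(
            "{} does not look like a JSON object",
            path.display()
        )));
    }
    Ok(text)
}
```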

pub fn with_telemetry(self, telemetry: Arc<EvalsTelemetry>) -> Self

Attach an EvalsTelemetry (spec 043 US7 / FR-035). When present, Self::run_set emits the three-level span tree swink.eval.run_set → swink.eval.case → swink.eval.evaluator.

pub fn agent_invocation_count(&self) -> usize

Return the number of times an agent was actually invoked, that is, the cache-miss count.

pub fn reset_agent_invocation_count(&self)

Reset the agent-invocation counter to zero.
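
The relationship between a cache and the invocation counter can be illustrated with a small standard-library model. This is a sketch under stated assumptions, not the crate's implementation: CachingInvoker is a hypothetical type, the cache is keyed by a plain case id string, and the "agent" is just a closure.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for a runner with a response cache; not the real EvalRunner.
struct CachingInvoker {
    cache: Mutex<HashMap<String, String>>,
    invocations: AtomicUsize,
}

impl CachingInvoker {
    fn new() -> Self {
        Self {
            cache: Mutex::new(HashMap::new()),
            invocations: AtomicUsize::new(0),
        }
    }

    /// Returns the cached output for `case_id`; on a miss, calls `agent`,
    /// stores its output, and increments the invocation counter.
    fn invoke(&self, case_id: &str, agent: impl FnOnce(&str) -> String) -> String {
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(case_id) {
            return hit.clone();
        }
        self.invocations.fetch_add(1, Ordering::SeqCst);
        let output = agent(case_id);
        cache.insert(case_id.to_string(), output.clone());
        output
    }

    /// Counts cache misses only, since hits never reach the agent.
    fn agent_invocation_count(&self) -> usize {
        self.invocations.load(Ordering::SeqCst)
    }

    /// Takes `&self`, like the documented method, thanks to the atomic counter.
    fn reset_agent_invocation_count(&self) {
        self.invocations.store(0, Ordering::SeqCst);
    }
}
```

In a test, a counter of this kind makes it possible to assert that a second run over the same cases was served entirely from the cache: reset the counter, run again, and check that it is still zero.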

pub async fn run_case( &self, case: &EvalCase, factory: &dyn AgentFactory, ) -> Result<EvalCaseResult, EvalError>

Run a single eval case and return the scored result.

pub async fn run_set( &self, eval_set: &EvalSet, factory: &dyn AgentFactory, ) -> Result<EvalSetResult, EvalError>

Run an entire eval set and return aggregated results.

Auto Trait Implementations

impl Freeze for EvalRunner

impl !RefUnwindSafe for EvalRunner

impl Send for EvalRunner

impl Sync for EvalRunner

impl Unpin for EvalRunner

impl UnsafeUnpin for EvalRunner

impl !UnwindSafe for EvalRunner

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> FutureExt for T

fn with_context(self, otel_cx: Context) -> WithContext<Self>

Attaches the provided Context to this type, returning a WithContext wrapper.

fn with_current_context(self) -> WithContext<Self>

Attaches the current Context to this type, returning a WithContext wrapper.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.