Struct OneharnessProvider

pub struct OneharnessProvider { /* private fields */ }

The default Provider: runs each prompt on a harness through the oneharness CLI.

Wires in four oneharness features that ship in v0.2.0:

  • --system <skill instructions> — the skill becomes a real system prompt on the underlying harness (e.g. --append-system-prompt for claude-code), instead of being inlined into the user message.
  • --resume <session> — multi-turn respond calls thread the previous session_id so the harness sees a continuing conversation (and keeps its tool state, files, etc.) instead of being re-prompted with a stringified transcript. Used only for harnesses that report supports_resume in the registry (claude-code, opencode, cursor today); other harnesses fall back to the inline-transcript path.
  • Normalized usage (input_tokens, output_tokens, cost_usd) — surfaced on every turn so cross-model cost reporting is portable.
  • Normalized failure_kind (auth, rate_limit, model_not_found, …) — classified provider errors so the CLI can distinguish a broken environment from a broken skill.
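
As a rough illustration of how the last two items might be consumed, the sketch below sums per-turn usage and separates environment failures from everything else. The `Usage` and `FailureKind` types are hypothetical stand-ins, not this crate's actual definitions. Only the field names and the three failure kinds listed above come from this page; treating all three as environment problems is an assumption made for the example.

```rust
/// Hypothetical stand-in for the normalized per-turn usage record.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
struct Usage {
    input_tokens: u64,
    output_tokens: u64,
    cost_usd: f64,
}

/// Hypothetical stand-in for the normalized failure classification.
#[derive(Debug, Clone, PartialEq)]
enum FailureKind {
    Auth,
    RateLimit,
    ModelNotFound,
    /// Any kind this sketch does not know about.
    Other(String),
}

impl FailureKind {
    /// Maps the string form (e.g. "rate_limit") onto the enum.
    fn parse(raw: &str) -> Self {
        match raw {
            "auth" => Self::Auth,
            "rate_limit" => Self::RateLimit,
            "model_not_found" => Self::ModelNotFound,
            other => Self::Other(other.to_string()),
        }
    }

    /// Assumption: these three kinds point at the environment, not the skill.
    fn is_environment_problem(&self) -> bool {
        matches!(self, Self::Auth | Self::RateLimit | Self::ModelNotFound)
    }
}

/// Adds up usage across all turns of a run.
fn total_usage(turns: &[Usage]) -> Usage {
    turns.iter().fold(Usage::default(), |acc, u| Usage {
        input_tokens: acc.input_tokens + u.input_tokens,
        output_tokens: acc.output_tokens + u.output_tokens,
        cost_usd: acc.cost_usd + u.cost_usd,
    })
}
```

Because the fields are the same for every harness, a report can call something like `total_usage` over turns from different models without per-provider conversion.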

Evals and the simulated user always run on the configured judge_harness, independent of the harness under test, so the evaluator does not drift with the matrix.

Implementations

impl OneharnessProvider

pub fn new(config: &OneharnessConfig) -> Self

Build a provider from its configuration.

Trait Implementations

impl Provider for OneharnessProvider

fn respond(&self, platform: &str, model: &str, skill: &SkillRef<'_>, messages: &[Message], session: Option<&str>) -> Result<AssistantTurn>

Run one assistant/skill turn given the conversation so far. session, when Some, is a handle returned by a previous respond call on this run that the provider may use to continue the same harness session (e.g. via oneharness run --resume); providers that don’t support continuation should ignore it.
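
The choice between the resume path and the inline-transcript fallback can be sketched as a small argument builder. This is an illustration under stated assumptions, not the provider's actual code: `build_run_args` is a hypothetical helper, and only `run`, `--system`, and `--resume` are taken from this page. Passing the prompt as a trailing positional argument and the `role: text` transcript format are guesses made for the example.

```rust
/// Hypothetical helper: builds an argument list for `oneharness run`.
/// Only `run`, `--system`, and `--resume` come from this page; the trailing
/// positional prompt and the transcript format are assumptions.
fn build_run_args(
    system: &str,
    transcript: &[(&str, &str)], // (role, text) pairs, oldest first
    session: Option<&str>,
    harness_supports_resume: bool,
) -> Vec<String> {
    let mut args = vec![
        "run".to_string(),
        "--system".to_string(),
        system.to_string(),
    ];

    let prompt = match session {
        // Resume path: the harness already holds the earlier turns,
        // so only the newest message is sent.
        Some(id) if harness_supports_resume => {
            args.push("--resume".to_string());
            args.push(id.to_string());
            transcript
                .last()
                .map(|(_, text)| text.to_string())
                .unwrap_or_default()
        }
        // Fallback path: inline the whole transcript as a single prompt.
        _ => transcript
            .iter()
            .map(|(role, text)| format!("{role}: {text}"))
            .collect::<Vec<_>>()
            .join("\n"),
    };

    args.push(prompt);
    args
}
```

In both branches the skill instructions travel through `--system`, so they never need to be inlined into the user message.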

fn simulate_user(&self, model: &str, persona: &str, messages: &[Message]) -> Result<UserTurn>

Produce one simulated-user turn.

fn judge(&self, model: &str, query: &JudgeQuery<'_>, messages: &[Message]) -> Result<JudgeVerdict>

Score a criterion against the conversation.

fn supports_resume(&self, platform: &str) -> bool

True iff respond on platform will faithfully continue a prior session when given its session_id. The default is false; providers that support resume override this so the runner knows to thread the session id through.
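
A minimal sketch of how a runner might use this flag follows. `MiniProvider`, `EchoProvider`, and `run_turns` are hypothetical and heavily simplified; the real `Provider` trait takes more arguments and returns `Result` types, as shown in the signatures above. The harness names in the override are the ones this page lists as supporting resume today.

```rust
/// Simplified, hypothetical stand-in for the crate's `Provider` trait.
trait MiniProvider {
    /// Returns (reply, session_id) for one turn.
    fn respond(
        &self,
        platform: &str,
        message: &str,
        session: Option<&str>,
    ) -> (String, Option<String>);

    /// Mirrors the documented default: no resume unless overridden.
    fn supports_resume(&self, _platform: &str) -> bool {
        false
    }
}

/// Toy provider that reports which session (if any) it was asked to continue.
struct EchoProvider;

impl MiniProvider for EchoProvider {
    fn respond(
        &self,
        _platform: &str,
        message: &str,
        session: Option<&str>,
    ) -> (String, Option<String>) {
        let reply = format!("{message} [session: {}]", session.unwrap_or("none"));
        (reply, Some("sess-1".to_string()))
    }

    fn supports_resume(&self, platform: &str) -> bool {
        matches!(platform, "claude-code" | "opencode" | "cursor")
    }
}

/// Runner loop: keeps the latest session id only when resume is supported,
/// so unsupported platforms always receive `None`.
fn run_turns<P: MiniProvider>(
    provider: &P,
    platform: &str,
    user_messages: &[&str],
) -> Vec<String> {
    let mut session: Option<String> = None;
    let mut replies = Vec::new();
    for msg in user_messages {
        let (reply, new_session) = provider.respond(platform, msg, session.as_deref());
        if provider.supports_resume(platform) {
            session = new_session;
        }
        replies.push(reply);
    }
    replies
}
```

Gating on the flag in the runner, rather than inside `respond`, keeps the decision in one place and lets providers that cannot resume simply ignore the argument.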

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.