pub trait Eval<Output>
where
    Output: for<'a> Deserialize<'a> + Serialize + Clone + Send + Sync,
    Self: Sized + Send + Sync + 'static,
{
    // Required method
    fn eval(
        &self,
        input: String,
    ) -> impl Future<Output = EvalOutcome<Output>> + Send;

    // Provided method
    fn eval_batch(
        &self,
        input: Vec<String>,
        concurrency_limit: usize,
    ) -> impl Future<Output = Vec<EvalOutcome<Output>>> + Send { ... }
}
Available on crate feature experimental only.
A trait for evaluators: types that test LLM outputs against criteria. Evaluators come in all shapes and sizes and may themselves use LLMs, although many useful heuristics do not. An evaluation can end in one of three states:
- Pass (the output passed all criteria)
- Fail (the output failed one or more criteria)
- Invalid (the output could not be retrieved because of an external failure, such as a failed API call)
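As a rough illustration of the three states, here is a minimal, synchronous sketch of a heuristic evaluator that does not call an LLM. Everything in it is hypothetical: `Outcome`, `KeywordReport`, and `keyword_eval` are stand-ins invented for this example, not the crate's `EvalOutcome` or any other crate API, and the `Result` input is just one assumed way to model an upstream failure.

```rust
/// Hypothetical stand-in for the three states listed above.
/// This is NOT the crate's `EvalOutcome`; its exact shape is not shown here.
#[derive(Debug, Clone, PartialEq)]
enum Outcome<T> {
    /// The output met every criterion.
    Pass(T),
    /// The output missed at least one criterion.
    Fail(T),
    /// The output could not be obtained (for example, a failed API call).
    Invalid(String),
}

/// Hypothetical report type: which required keywords were missing.
#[derive(Debug, Clone, PartialEq)]
struct KeywordReport {
    missing: Vec<String>,
}

/// A heuristic check with no LLM involved: the output passes only if it
/// contains every required keyword (case-insensitive).
/// `output` is `Err` when the model output could not be retrieved.
fn keyword_eval(output: Result<&str, String>, required: &[&str]) -> Outcome<KeywordReport> {
    let text = match output {
        Ok(text) => text.to_lowercase(),
        Err(reason) => return Outcome::Invalid(reason),
    };

    // Collect every keyword that does not appear in the output.
    let missing: Vec<String> = required
        .iter()
        .filter(|kw| !text.contains(&kw.to_lowercase()))
        .map(|kw| (*kw).to_string())
        .collect();

    let report = KeywordReport { missing };
    if report.missing.is_empty() {
        Outcome::Pass(report)
    } else {
        Outcome::Fail(report)
    }
}
```

Calling `keyword_eval(Ok("Paris"), &["paris", "france"])` in this sketch yields a `Fail` whose report lists `"france"` as missing, while an `Err` input maps straight to `Invalid`.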
Required Methods
fn eval(
    &self,
    input: String,
) -> impl Future<Output = EvalOutcome<Output>> + Send
Provided Methods
fn eval_batch(
    &self,
    input: Vec<String>,
    concurrency_limit: usize,
) -> impl Future<Output = Vec<EvalOutcome<Output>>> + Send
Evaluates a batch of inputs in a single call. Set concurrency_limit to stay within model provider API limits, since sending requests too quickly may result in throttling or temporary request refusal.
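To make the idea of a concurrency limit concrete, the sketch below bounds parallel work using only the standard library. It is a thread-based analogue and not the crate's implementation: the real `eval_batch` is async, and how it schedules work internally is not stated here. The helper name `eval_batch_bounded` and its closure parameter are hypothetical, and returning results in input order is a choice made for this example only.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Hypothetical helper: applies `eval_one` to every input while running at
/// most `concurrency_limit` evaluations at the same time.
/// Results are returned in input order (a choice made for this sketch).
fn eval_batch_bounded<T, F>(inputs: &[String], concurrency_limit: usize, eval_one: F) -> Vec<T>
where
    T: Send,
    F: Fn(&str) -> T + Sync,
{
    // Never spawn zero workers, and never more workers than inputs.
    let workers = concurrency_limit.max(1).min(inputs.len().max(1));
    // Shared cursor: each worker claims the next unprocessed index.
    let next = AtomicUsize::new(0);
    // One slot per input so results can be stored by index.
    let slots: Mutex<Vec<Option<T>>> = Mutex::new((0..inputs.len()).map(|_| None).collect());

    // Scoped threads may borrow `inputs`, `next`, `slots`, and `eval_one`;
    // the scope joins all workers before returning.
    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= inputs.len() {
                    break;
                }
                // Evaluate outside the lock so workers do not serialize.
                let result = eval_one(&inputs[i]);
                slots.lock().unwrap()[i] = Some(result);
            });
        }
    });

    slots
        .into_inner()
        .unwrap()
        .into_iter()
        .map(|slot| slot.expect("every index is claimed by exactly one worker"))
        .collect()
}
```

Because only `workers` threads exist, no more than `concurrency_limit` calls to `eval_one` are ever in flight, which is the property that helps with provider rate limits. An async version would typically express the same bound with a stream combinator or a semaphore instead of threads.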
Dyn Compatibility
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.