Crate elinor

Elinor (Evaluation library in information retrieval) is a library for evaluating information retrieval (IR) systems. It provides a comprehensive set of tools and metrics tailored for IR engineers, offering an intuitive and easy-to-use interface.

§Key features

  • IR-specific design: Elinor is tailored specifically for evaluating IR systems, with an intuitive interface designed for IR engineers. It offers a streamlined workflow that simplifies common IR evaluation tasks.
  • Comprehensive evaluation metrics: Elinor supports a wide range of key evaluation metrics, such as Precision, MAP, MRR, and nDCG. The supported metrics are listed in the Metric enum. The evaluation results are validated against trec_eval to ensure accuracy and reliability.
  • In-depth statistical testing: Elinor includes several statistical tests, such as Student’s t-test and the randomized Tukey HSD test, to verify the generalizability of results. In addition to p-values, it reports other statistics, such as effect sizes and confidence intervals, for thorough reporting. See the statistical_tests module for more details.
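
To make the statistical-testing item more concrete, here is a minimal, standard-library-only sketch of the quantities behind a paired Student's t-test over per-query scores of two systems. It is not Elinor's implementation: the function name paired_t_stats is hypothetical, the effect size is taken as the mean difference divided by the sample standard deviation of the differences (one common definition, assumed here), and the p-value is left out because it needs a t-distribution CDF.

```rust
/// Hypothetical helper (not part of Elinor): paired-sample statistics for
/// per-query scores of two systems, `a` and `b`.
///
/// Returns `(mean difference, t statistic, effect size)`, or `None` when the
/// inputs differ in length, have fewer than two samples, or have zero variance.
fn paired_t_stats(a: &[f64], b: &[f64]) -> Option<(f64, f64, f64)> {
    if a.len() != b.len() || a.len() < 2 {
        return None;
    }
    let n = a.len() as f64;

    // Per-query score differences between the two systems.
    let diffs: Vec<f64> = a.iter().zip(b).map(|(x, y)| x - y).collect();
    let mean = diffs.iter().sum::<f64>() / n;

    // Unbiased sample variance of the differences (divides by n - 1).
    let var = diffs.iter().map(|d| (d - mean).powi(2)).sum::<f64>() / (n - 1.0);
    let sd = var.sqrt();
    if sd == 0.0 {
        return None;
    }

    // t = mean / standard error; effect size = mean / sd (assumed definition).
    Some((mean, mean / (sd / n.sqrt()), mean / sd))
}
```

With illustrative (made-up) scores `[0.7, 0.5, 0.6, 0.8]` and `[0.6, 0.4, 0.6, 0.6]`, the differences are roughly 0.1, 0.1, 0.0, and 0.2, so the mean difference is about 0.1 and the t statistic is about 2.449 with 3 degrees of freedom.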

§Basic usage in evaluating several metrics

You first need to prepare gold relevance judgments and predicted relevance scores through GoldRelStore and PredRelStore, respectively. You can build these instances using GoldRelStoreBuilder and PredRelStoreBuilder.

Then, you can evaluate the predicted relevance scores using the evaluate function and the specified metric. The available metrics are defined in the Metric enum.

An example is shown below:

use approx::assert_abs_diff_eq;
use elinor::{GoldRelStoreBuilder, PredRelStoreBuilder, Metric};

// Prepare gold relevance scores.
// For binary-relevance metrics, a score of 0 means non-relevant and any other score means relevant.
let mut b = GoldRelStoreBuilder::new();
b.add_score("q_1", "d_1", 1)?;
b.add_score("q_1", "d_2", 0)?;
b.add_score("q_1", "d_3", 2)?;
b.add_score("q_2", "d_2", 2)?;
b.add_score("q_2", "d_4", 1)?;
let gold_rels = b.build();

// Prepare predicted relevance scores.
let mut b = PredRelStoreBuilder::new();
b.add_score("q_1", "d_1", 0.5.into())?;
b.add_score("q_1", "d_2", 0.4.into())?;
b.add_score("q_1", "d_3", 0.3.into())?;
b.add_score("q_2", "d_4", 0.1.into())?;
b.add_score("q_2", "d_1", 0.2.into())?;
b.add_score("q_2", "d_3", 0.3.into())?;
let pred_rels = b.build();

// Evaluate Precision@3.
let evaluated = elinor::evaluate(&gold_rels, &pred_rels, Metric::Precision { k: 3 })?;
assert_abs_diff_eq!(evaluated.mean_score(), 0.5000, epsilon = 1e-4);

// Evaluate MAP, where all documents are considered via k=0.
let evaluated = elinor::evaluate(&gold_rels, &pred_rels, Metric::AP { k: 0 })?;
assert_abs_diff_eq!(evaluated.mean_score(), 0.5000, epsilon = 1e-4);

// Evaluate MRR, where the metric is specified via a string representation.
let evaluated = elinor::evaluate(&gold_rels, &pred_rels, "rr".parse()?)?;
assert_abs_diff_eq!(evaluated.mean_score(), 0.6667, epsilon = 1e-4);

// Evaluate nDCG@3, where the metric is specified via a string representation.
let evaluated = elinor::evaluate(&gold_rels, &pred_rels, "ndcg@3".parse()?)?;
assert_abs_diff_eq!(evaluated.mean_score(), 0.4751, epsilon = 1e-4);
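
To show where the four expected values in the example come from, the sketch below recomputes them from scratch with the standard library only. It is not Elinor's code: all function names are hypothetical, documents missing from the gold judgments are treated as non-relevant, and nDCG is assumed to use linear gain with a log2 rank discount, a choice that is consistent with the 0.4751 figure above.

```rust
use std::collections::HashMap;

/// Graded relevance of `doc`; documents without a judgment count as 0.
fn gain(gold: &HashMap<&str, u32>, doc: &str) -> u32 {
    gold.get(doc).copied().unwrap_or(0)
}

/// Precision@k: fraction of the top-k ranked documents that are relevant.
fn precision_at_k(gold: &HashMap<&str, u32>, ranking: &[&str], k: usize) -> f64 {
    let hits = ranking.iter().take(k).filter(|&&d| gain(gold, d) > 0).count();
    hits as f64 / k as f64
}

/// Average precision over the full ranking, normalized by the number of
/// relevant documents in the gold judgments.
fn average_precision(gold: &HashMap<&str, u32>, ranking: &[&str]) -> f64 {
    let n_rel = gold.values().filter(|&&g| g > 0).count();
    if n_rel == 0 {
        return 0.0;
    }
    let (mut hits, mut sum) = (0usize, 0.0);
    for (i, &doc) in ranking.iter().enumerate() {
        if gain(gold, doc) > 0 {
            hits += 1;
            sum += hits as f64 / (i + 1) as f64;
        }
    }
    sum / n_rel as f64
}

/// Reciprocal rank of the first relevant document (0 if there is none).
fn reciprocal_rank(gold: &HashMap<&str, u32>, ranking: &[&str]) -> f64 {
    ranking
        .iter()
        .position(|&d| gain(gold, d) > 0)
        .map_or(0.0, |i| 1.0 / (i + 1) as f64)
}

/// DCG over the first k gains: linear gain, log2(rank + 1) discount (assumed).
fn dcg(gains: impl Iterator<Item = u32>, k: usize) -> f64 {
    gains
        .take(k)
        .enumerate()
        .map(|(i, g)| g as f64 / ((i + 2) as f64).log2())
        .sum()
}

/// nDCG@k: DCG of the ranking divided by the DCG of the ideal ordering.
fn ndcg_at_k(gold: &HashMap<&str, u32>, ranking: &[&str], k: usize) -> f64 {
    let mut ideal: Vec<u32> = gold.values().copied().collect();
    ideal.sort_unstable_by(|a, b| b.cmp(a));
    let idcg = dcg(ideal.into_iter(), k);
    if idcg == 0.0 {
        return 0.0;
    }
    dcg(ranking.iter().map(|&d| gain(gold, d)), k) / idcg
}

/// Mean Precision@3, AP, RR, and nDCG@3 over the two queries of the example,
/// with each ranking sorted by descending predicted score.
fn example_means() -> [f64; 4] {
    let queries: [(HashMap<&str, u32>, [&str; 3]); 2] = [
        (HashMap::from([("d_1", 1), ("d_2", 0), ("d_3", 2)]), ["d_1", "d_2", "d_3"]),
        (HashMap::from([("d_2", 2), ("d_4", 1)]), ["d_3", "d_1", "d_4"]),
    ];
    let n = queries.len() as f64;
    let mut means = [0.0; 4];
    for (gold, ranking) in &queries {
        means[0] += precision_at_k(gold, ranking, 3) / n;
        means[1] += average_precision(gold, ranking) / n;
        means[2] += reciprocal_rank(gold, ranking) / n;
        means[3] += ndcg_at_k(gold, ranking, 3) / n;
    }
    means
}
```

For q_1 the ranking d_1, d_2, d_3 has two relevant documents in the top three (Precision@3 = 2/3), while for q_2 only d_4 at rank 3 is relevant (1/3), giving the mean of 0.5. The same per-query averaging yields 0.5 for AP, about 0.6667 for RR, and about 0.4751 for nDCG@3.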

§Relevance stores from HashMap

GoldRelStore and PredRelStore can also be instantiated from HashMaps. The following mapping structure is expected:

query_id => { doc_id => score }

This allows you to prepare data in JSON or other formats via Serde. If you use Serde, enable the serde feature in your Cargo.toml:

[dependencies]
elinor = { version = "*", features = ["serde"] }

An example of instantiating relevance stores from JSON is shown below:

use std::collections::HashMap;
use elinor::{GoldRelStore, GoldScore, PredRelStore, PredScore};

let gold_rels_data = r#"
{
    "q_1": {
        "d_1": 1,
        "d_2": 0,
        "d_3": 2
    },
    "q_2": {
        "d_2": 2,
        "d_4": 1
    }
}"#;

let pred_rels_data = r#"
{
    "q_1": {
        "d_1": 0.5,
        "d_2": 0.4,
        "d_3": 0.3
    },
    "q_2": {
        "d_3": 0.3,
        "d_1": 0.2,
        "d_4": 0.1
    }
}"#;

let gold_rels_map: HashMap<String, HashMap<String, GoldScore>> =
    serde_json::from_str(gold_rels_data)?;
let pred_rels_map: HashMap<String, HashMap<String, PredScore>> =
    serde_json::from_str(pred_rels_data)?;

let gold_rels = GoldRelStore::from_map(gold_rels_map);
let pred_rels = PredRelStore::from_map(pred_rels_map);

assert_eq!(gold_rels.n_queries(), 2);
assert_eq!(gold_rels.n_docs(), 5);
assert_eq!(pred_rels.n_queries(), 2);
assert_eq!(pred_rels.n_docs(), 6);
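
If the data does not come from JSON, the same nested mapping can be built by hand. The following standard-library-only sketch builds the gold judgments from the example and counts entries. The names RelMap, insert_score, and count_pairs are hypothetical, u32 is only a stand-in for GoldScore, and the sketch assumes that n_docs counts query-document pairs, which is consistent with the assertions above (3 + 2 = 5 for the gold judgments).

```rust
use std::collections::HashMap;

/// Hypothetical alias for the expected shape: query_id => { doc_id => score }.
type RelMap<T> = HashMap<String, HashMap<String, T>>;

/// Inserts one (query, document, score) triple, creating the inner map on demand.
fn insert_score<T>(map: &mut RelMap<T>, query_id: &str, doc_id: &str, score: T) {
    map.entry(query_id.to_string())
        .or_default()
        .insert(doc_id.to_string(), score);
}

/// Total number of query-document pairs across all queries.
fn count_pairs<T>(map: &RelMap<T>) -> usize {
    map.values().map(|docs| docs.len()).sum()
}

/// Gold judgments from the example, with `u32` standing in for `GoldScore`.
fn example_gold() -> RelMap<u32> {
    let mut gold = RelMap::new();
    insert_score(&mut gold, "q_1", "d_1", 1);
    insert_score(&mut gold, "q_1", "d_2", 0);
    insert_score(&mut gold, "q_1", "d_3", 2);
    insert_score(&mut gold, "q_2", "d_2", 2);
    insert_score(&mut gold, "q_2", "d_4", 1);
    gold
}
```

A map built this way has the same shape as the one deserialized from JSON, so it could be handed to from_map in the same way once the score type matches.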

§Crate features