Module ab_testing

A/B Testing Framework for Embedding Models (v0.3.0)

A production-ready framework for comparing embedding model variants. It provides:

  • ModelVariant: encapsulates a model with metadata and metrics collection
  • ABTestConfig: configures traffic splits, metric targets, and test duration
  • ABTestRunner: routes inference requests between variants and records outcomes
  • ABTestAnalyzer: tests results for statistical significance (Welch’s t-test, Mann-Whitney U)
  • ABTestReport: generates detailed comparison reports
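
To make the Welch’s t-test mentioned above concrete, here is a minimal, self-contained sketch of the statistic and its Welch-Satterthwaite degrees of freedom. The function names `mean_and_variance` and `welch_t` are hypothetical and chosen for illustration; this is not the crate's ABTestAnalyzer implementation, and it stops short of computing a p-value.

```rust
/// Sample mean and unbiased sample variance (n - 1 denominator).
/// Assumes `xs` has at least two elements.
fn mean_and_variance(xs: &[f64]) -> (f64, f64) {
    let n = xs.len() as f64;
    let mean = xs.iter().sum::<f64>() / n;
    let var = xs.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / (n - 1.0);
    (mean, var)
}

/// Welch's t statistic and Welch-Satterthwaite degrees of freedom.
/// Returns `None` if either sample has fewer than two observations or if
/// both sample variances are exactly zero (the statistic is undefined).
fn welch_t(a: &[f64], b: &[f64]) -> Option<(f64, f64)> {
    if a.len() < 2 || b.len() < 2 {
        return None;
    }
    let (n_a, n_b) = (a.len() as f64, b.len() as f64);
    let (mean_a, var_a) = mean_and_variance(a);
    let (mean_b, var_b) = mean_and_variance(b);

    // Squared standard error contributed by each sample.
    let se2_a = var_a / n_a;
    let se2_b = var_b / n_b;
    let se2 = se2_a + se2_b;
    if se2 == 0.0 {
        return None;
    }

    let t = (mean_a - mean_b) / se2.sqrt();
    let df = se2.powi(2) / (se2_a.powi(2) / (n_a - 1.0) + se2_b.powi(2) / (n_b - 1.0));
    Some((t, df))
}
```

Unlike Student's t-test, this form does not assume the two variants have equal variance, which is usually the safer default when control and treatment metrics may be spread differently.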

§Design

The framework is model-agnostic: any function from an input key to a Vec<f64> embedding qualifies as a “model variant”. This keeps the A/B framework decoupled from specific GNN or KGE implementations.
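
The model-agnostic idea can be sketched with a boxed closure. The type `SketchVariant` and the helper `byte_sum_model` below are hypothetical stand-ins written for this illustration; they are not the crate's ModelVariant, whose internals are not shown on this page.

```rust
/// Hypothetical stand-in for a model variant: a name plus any closure
/// that maps an input key to an embedding. Not the crate's actual type.
struct SketchVariant {
    name: String,
    embed: Box<dyn Fn(&str) -> Vec<f64>>,
}

impl SketchVariant {
    fn new(name: &str, embed: impl Fn(&str) -> Vec<f64> + 'static) -> Self {
        Self {
            name: name.to_string(),
            embed: Box::new(embed),
        }
    }

    /// Runs the wrapped closure on `key`.
    fn embed(&self, key: &str) -> Vec<f64> {
        (self.embed)(key)
    }
}

/// A toy "model": a deterministic embedding derived from the key's bytes.
/// Element `i` is the sum of the key's byte values plus `i`.
fn byte_sum_model(dim: usize) -> impl Fn(&str) -> Vec<f64> {
    move |key: &str| {
        let seed = key.bytes().map(f64::from).sum::<f64>();
        (0..dim).map(|i| seed + i as f64).collect()
    }
}
```

Because the variant only stores a closure, a constant vector, a lookup table, or a call into a trained model can all be wrapped the same way, for example `SketchVariant::new("toy", byte_sum_model(64))`.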

§Example

use oxirs_embed::ab_testing::{ABTestConfig, ABTestRunner, ModelVariant};

let control = ModelVariant::new("transe-v1", |_key: &str| vec![0.0f64; 64]);
let treatment = ModelVariant::new("transe-v2", |_key: &str| vec![0.1f64; 64]);

let config = ABTestConfig::default();
let mut runner = ABTestRunner::new(config, control, treatment)?;

// Simulate requests
for i in 0..200 {
    let key = format!("entity:{i}");
    let (_embedding, variant_name) = runner.route(&key)?;
    // Record a business metric (e.g., link prediction hit@10);
    // 0.85 is a placeholder value for this simulation.
    runner.record_metric(&variant_name, 0.85)?;
}

let report = runner.analyze()?;
println!("{}", report.summary());
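
This page does not show how `route` chooses a variant for a given key. One common approach, sketched here purely as an assumption, is to hash the key into a bucket in [0, 1) and compare it with the configured treatment share. The names `Arm`, `assign_arm`, and `treatment_fraction` are hypothetical and are not part of the crate's API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical label for the two sides of a split.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Arm {
    Control,
    Treatment,
}

/// Maps `key` to a point in [0, 1) by hashing, then assigns the treatment
/// arm when that point falls below `treatment_fraction`.
fn assign_arm(key: &str, treatment_fraction: f64) -> Arm {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    // Keep the top 53 bits so the value converts to f64 without rounding.
    let bucket = (hasher.finish() >> 11) as f64 / (1u64 << 53) as f64;
    if bucket < treatment_fraction {
        Arm::Treatment
    } else {
        Arm::Control
    }
}
```

With this scheme the same key always lands on the same arm within one build, so repeated requests for an entity see a consistent model. The standard library does not guarantee that `DefaultHasher` output stays the same across Rust releases, so a long-running experiment would likely want a hasher with a fixed, documented algorithm.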

Structs§

ABTestAnalyzer
Statistical significance testing for A/B experiment results.
ABTestConfig
Configuration for an A/B test.
ABTestReport
Full A/B test analysis report.
ABTestRunner
Routes inference requests between two variants and records metrics.
MannWhitneyResult
Result of Mann-Whitney U test.
ModelVariant
A named embedding model variant for A/B testing.
Observation
A single metric observation from a model variant.
TTestResult
Result of Welch’s two-sample t-test.
VariantStats
Summary statistics for one variant’s metric observations.
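
As a rough illustration of what a Mann-Whitney U test computes, the sketch below derives the two U statistics from pooled ranks, giving tied values their average rank. The function name `mann_whitney_u` is hypothetical; this is not the crate's implementation, it assumes the inputs contain no NaN values, and it omits the p-value step.

```rust
/// Mann-Whitney U statistics `(u_a, u_b)` computed from midranks.
/// `u_a` counts the pairs (x in `a`, y in `b`) with x > y, where each
/// tied pair counts as one half; `u_a + u_b == a.len() * b.len()`.
fn mann_whitney_u(a: &[f64], b: &[f64]) -> (f64, f64) {
    // Pool the observations, remembering which sample each came from.
    let mut pooled: Vec<(f64, bool)> = a
        .iter()
        .map(|&x| (x, true))
        .chain(b.iter().map(|&x| (x, false)))
        .collect();
    pooled.sort_by(|l, r| l.0.total_cmp(&r.0));

    // Sum the ranks of sample `a`, giving tied values their average rank.
    let mut rank_sum_a = 0.0;
    let mut i = 0;
    while i < pooled.len() {
        // Extend j to the end of the run of values equal to pooled[i].
        let mut j = i + 1;
        while j < pooled.len() && pooled[j].0 == pooled[i].0 {
            j += 1;
        }
        // 0-based positions i..j hold ranks i+1..=j; their mean is the midrank.
        let midrank = (i + 1 + j) as f64 / 2.0;
        let count_a = pooled[i..j].iter().filter(|p| p.1).count();
        rank_sum_a += midrank * count_a as f64;
        i = j;
    }

    let (n_a, n_b) = (a.len() as f64, b.len() as f64);
    let u_a = rank_sum_a - n_a * (n_a + 1.0) / 2.0;
    (u_a, n_a * n_b - u_a)
}
```

Because it relies only on ranks, this test makes no normality assumption, which suits skewed or bounded metrics such as hit rates.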

Enums§

OptimizeMetric
Metric to optimize / compare between variants.
Winner
Which variant won the A/B test.