
Module evaluation


Evaluation pipeline for content contributors and live peers.

Per CP-015 section 5: Validators download content from Arweave, index it locally in an isolated substrate, run test queries, and measure retrieval quality using precision@10, NDCG@10, and MRR.

Per CP-015 section 13: Validators test live peers by connecting over Tor, sending test queries, and measuring search quality and latency.
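The latency side of the peer test can be sketched with `std::time::Instant`. This is illustrative only: `timed_query` and its closure argument are hypothetical stand-ins for the real Tor-transported search call, which is not shown here.

```rust
use std::time::{Duration, Instant};

/// Run a test query and measure how long it took. `run_query` stands in
/// for the actual search request sent to a live peer over Tor.
fn timed_query<F, T>(run_query: F) -> (T, Duration)
where
    F: FnOnce() -> T,
{
    let start = Instant::now();
    let result = run_query();
    (result, start.elapsed())
}
```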

Structs§

EvaluationResult
Result of evaluating a single contributor’s content.
PeerQualityMetrics
Quality metrics for a live peer, including latency and availability.
PeerTestResult
Result of testing a live peer.
QualityMetrics
Quality metrics for a contributor’s content, measured against ground truth.

Functions§

composite_score
Composite quality score for a contributor evaluation result.
evaluate_contributor
Evaluate a contributor by downloading their content from Arweave, indexing it in an isolated substrate, and running test queries.
evaluate_local_graph
Evaluate a local graph store directly by running test queries against it.
evaluate_peer
Evaluate a live peer by connecting over Tor, sending test queries, and measuring search quality and latency.
mrr
Mean Reciprocal Rank: the reciprocal of the rank of the first relevant result.
ndcg_at_k
Normalized Discounted Cumulative Gain at K.
peer_composite_score
Composite quality score for a peer test result.
precision_at_k
Precision at K: fraction of the top-K results that are relevant.
update_contributor_ratings
Update contributor ratings based on evaluation results using pairwise OpenSkill comparisons.
update_peer_ratings
Update peer ratings based on peer test results using pairwise OpenSkill comparisons.
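The three retrieval metrics listed above have standard textbook definitions. The sketch below shows those definitions with binary relevance judgments; it is illustrative and not necessarily the module's exact implementation of `precision_at_k`, `ndcg_at_k`, and `mrr` (which may, for example, average over multiple test queries or use graded relevance).

```rust
/// Precision@K: fraction of the top-K results that are relevant.
fn precision_at_k(relevant: &[bool], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let hits = relevant.iter().take(k).filter(|&&r| r).count();
    hits as f64 / k as f64
}

/// Reciprocal rank for a single query: 1 / rank of the first relevant
/// result, or 0.0 if no result is relevant. MRR is the mean of this
/// value over a set of test queries.
fn reciprocal_rank(relevant: &[bool]) -> f64 {
    relevant
        .iter()
        .position(|&r| r)
        .map_or(0.0, |i| 1.0 / (i as f64 + 1.0))
}

/// NDCG@K with binary relevance: DCG of the observed ranking divided by
/// the DCG of an ideal ranking that places all relevant results first.
fn ndcg_at_k(relevant: &[bool], k: usize) -> f64 {
    let dcg = |rels: &[bool]| -> f64 {
        rels.iter()
            .take(k)
            .enumerate()
            // Gain 1 for a relevant result, discounted by log2(rank + 1).
            .map(|(i, &r)| if r { 1.0 / (i as f64 + 2.0).log2() } else { 0.0 })
            .sum()
    };
    let total_relevant = relevant.iter().filter(|&&r| r).count();
    let mut ideal = vec![true; total_relevant];
    ideal.resize(relevant.len(), false);
    let idcg = dcg(&ideal);
    if idcg == 0.0 { 0.0 } else { dcg(relevant) / idcg }
}
```

For example, the ranking `[true, false, true]` scores 0.5 at K=2 for precision, and a first relevant result at rank 3 yields a reciprocal rank of 1/3.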