
Module evaluation


Evaluation pipeline for content contributors and live peers.

Per CP-015 section 5: Validators download content from Arweave, index it locally in an isolated substrate, run test queries, and measure retrieval quality using precision@10, NDCG@10, and MRR.

Per CP-015 section 13: Validators test live peers by connecting over Tor, sending test queries, and measuring search quality and latency.
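The latency side of the peer test can be sketched with `std::time::Instant`. This is illustrative only: `timed_query` and its closure argument are hypothetical stand-ins for the real Tor-transported search call, which is not shown here.

```rust
use std::time::{Duration, Instant};

/// Run a test query and measure how long it took. `run_query` stands in
/// for the actual search request sent to a live peer over Tor.
fn timed_query<F, T>(run_query: F) -> (T, Duration)
where
    F: FnOnce() -> T,
{
    let start = Instant::now();
    let result = run_query();
    (result, start.elapsed())
}
```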

Structs§

EvaluationResult
Result of evaluating a single contributor’s content.
PeerQualityMetrics
Quality metrics for a live peer, including latency and availability.
PeerTestResult
Result of testing a live peer.
QualityMetrics
Quality metrics for a contributor’s content, measured against ground truth.

Functions§

composite_score
Composite quality score for a contributor evaluation result.
evaluate_contributor
Evaluate a contributor by downloading their content from Arweave, indexing it in an isolated substrate, and running test queries.
evaluate_local_graph
Evaluate a local graph store directly by running test queries against it.
evaluate_peer
Evaluate a live peer by connecting over Tor, sending test queries, and measuring search quality and latency.
mrr
Mean Reciprocal Rank: the reciprocal of the rank of the first relevant result.
ndcg_at_k
Normalized Discounted Cumulative Gain at K.
peer_composite_score
Composite quality score for a peer test result.
precision_at_k
Precision at K: fraction of the top-K results that are relevant.
update_contributor_ratings
Update contributor ratings based on evaluation results using pairwise OpenSkill comparisons.
update_peer_ratings
Update peer ratings based on peer test results using pairwise OpenSkill comparisons.
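The three retrieval metrics listed above have standard textbook definitions. The sketch below shows those definitions with binary relevance judgments; it is illustrative and not necessarily the module's exact implementation of `precision_at_k`, `ndcg_at_k`, and `mrr` (which may, for example, average over multiple test queries or use graded relevance).

```rust
/// Precision@K: fraction of the top-K results that are relevant.
fn precision_at_k(relevant: &[bool], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let hits = relevant.iter().take(k).filter(|&&r| r).count();
    hits as f64 / k as f64
}

/// Reciprocal rank for a single query: 1 / rank of the first relevant
/// result, or 0.0 if no result is relevant. MRR is the mean of this
/// value over a set of test queries.
fn reciprocal_rank(relevant: &[bool]) -> f64 {
    relevant
        .iter()
        .position(|&r| r)
        .map_or(0.0, |i| 1.0 / (i as f64 + 1.0))
}

/// NDCG@K with binary relevance: DCG of the observed ranking divided by
/// the DCG of an ideal ranking that places all relevant results first.
fn ndcg_at_k(relevant: &[bool], k: usize) -> f64 {
    let dcg = |rels: &[bool]| -> f64 {
        rels.iter()
            .take(k)
            .enumerate()
            // Gain 1 for a relevant result, discounted by log2(rank + 1).
            .map(|(i, &r)| if r { 1.0 / (i as f64 + 2.0).log2() } else { 0.0 })
            .sum()
    };
    let total_relevant = relevant.iter().filter(|&&r| r).count();
    let mut ideal = vec![true; total_relevant];
    ideal.resize(relevant.len(), false);
    let idcg = dcg(&ideal);
    if idcg == 0.0 { 0.0 } else { dcg(relevant) / idcg }
}
```

For example, the ranking `[true, false, true]` scores 0.5 at K=2 for precision, and a first relevant result at rank 3 yields a reciprocal rank of 1/3.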