Benchmarking system for GraphRAG quality improvements
This module provides comprehensive benchmarking tools to measure:
- Accuracy improvements from new features
- Token usage and cost reduction
- Latency and throughput
- Quality metrics (F1, Exact Match, BLEU)
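As an illustration of two of the quality metrics listed above, the following is a minimal sketch of Exact Match and token-level F1 between a predicted answer and a ground-truth answer. It is not taken from this module: the function names `normalize`, `exact_match`, and `token_f1` are hypothetical, and the normalization rule (lowercasing and dropping non-alphanumeric characters) is an assumption that may differ from what the crate actually does.

```rust
use std::collections::HashMap;

/// Hypothetical helper: lowercase the text, split on whitespace,
/// and drop non-alphanumeric characters from each token.
fn normalize(text: &str) -> Vec<String> {
    text.to_lowercase()
        .split_whitespace()
        .map(|w| w.chars().filter(|c| c.is_alphanumeric()).collect::<String>())
        .filter(|w| !w.is_empty())
        .collect()
}

/// Exact Match: true when the normalized token sequences are identical.
fn exact_match(prediction: &str, truth: &str) -> bool {
    normalize(prediction) == normalize(truth)
}

/// Token-level F1: harmonic mean of precision and recall over the
/// multiset of tokens shared by the prediction and the ground truth.
fn token_f1(prediction: &str, truth: &str) -> f64 {
    let pred = normalize(prediction);
    let gold = normalize(truth);
    if pred.is_empty() || gold.is_empty() {
        // Assumption: two empty answers count as a perfect match.
        return if pred.is_empty() && gold.is_empty() { 1.0 } else { 0.0 };
    }

    // Count how often each token appears in the ground truth.
    let mut gold_counts: HashMap<&str, usize> = HashMap::new();
    for token in &gold {
        *gold_counts.entry(token.as_str()).or_insert(0) += 1;
    }

    // Each predicted token may consume at most one matching gold token.
    let mut overlap = 0usize;
    for token in &pred {
        if let Some(count) = gold_counts.get_mut(token.as_str()) {
            if *count > 0 {
                *count -= 1;
                overlap += 1;
            }
        }
    }
    if overlap == 0 {
        return 0.0;
    }

    let precision = overlap as f64 / pred.len() as f64;
    let recall = overlap as f64 / gold.len() as f64;
    2.0 * precision * recall / (precision + recall)
}
```

For example, under these assumptions `token_f1("the cat sat", "the cat")` has precision 2/3 and recall 1, giving an F1 of 0.8, while `exact_match("Paris.", "paris")` is true because punctuation and case are normalized away.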
Structs
- BenchmarkConfig - Configuration for benchmark runs
- BenchmarkDataset - Dataset for benchmarking
- BenchmarkQuery - A single query with ground truth for evaluation
- BenchmarkRunner - Main benchmarking coordinator
- BenchmarkSummary - Aggregate benchmark results across multiple queries
- LatencyMetrics - Latency breakdown by pipeline stage
- QualityMetrics - Quality metrics for answer evaluation
- QueryBenchmark - Benchmark results for a single query
- TokenMetrics - Token usage tracking