# pondus
Opinionated AI model benchmark aggregator.
[](https://crates.io/crates/pondus)
[](https://opensource.org/licenses/MIT)
## What it does
Aggregates AI model benchmark data from 8 trusted sources into a unified JSON schema. Designed for AI agents (Claude Code, etc.) to consume programmatically. Caches results for 24h to avoid rate limiting.
## Sources
| Artificial Analysis | REST API | Speed, quality, pricing metrics |
| LM Arena (LMSYS) | Community JSON | ELO ratings from human preferences |
| SWE-bench | GitHub JSON | Code generation resolve rates |
| SWE-rebench | agent-browser scrape | Code generation resolve rates (rebench variant) |
| Aider | GitHub YAML | Polyglot coding benchmark pass rates |
| LiveBench | HuggingFace | Multi-domain benchmark scores |
| Terminal-Bench | HuggingFace | Terminal/CLI task completion |
| SEAL | agent-browser scrape | Scale AI evaluation scores |
> **Note:** Sources marked "agent-browser scrape" require the `agent-browser` CLI installed separately. All other sources work out of the box.
## Installation
```
cargo install pondus
```
## Usage
```bash
pondus rank # rank all models (default command)
pondus # same as `pondus rank`
pondus rank --top 10 # top 10 only
pondus check claude-opus-4.6 # check one model across all sources
pondus compare gpt-5.2 claude-opus-4.6 # head-to-head comparison
pondus sources # show source status
pondus refresh # clear cache and re-fetch
```
### Global Flags
| `--format json|table|markdown` | Output format (default: json) |
| `--refresh` | Bypass cache for this run |
## Configuration
Config location: `~/.config/pondus/config.toml`
```toml
[cache]
ttl_hours = 24
[alias]
path = "models.toml" # relative to config dir, or absolute path
[sources.artificial_analysis]
api_key = "your-key" # optional, for AA source
[sources.agent_browser]
path = "agent-browser" # path to agent-browser CLI
```
## Model Aliases
Different benchmarks use different naming conventions. `models.toml` maps canonical model names to source-specific variants:
```toml
[claude-opus-4_6]
canonical = "claude-opus-4.6"
aliases = [
"Claude Opus 4.6",
"claude-opus-4-6",
"anthropic/claude-opus-4.6",
"Opus 4.6",
]
```
When you run `pondus check opus-4.6`, pondus resolves the alias to the canonical name and matches it across all sources regardless of how each source names the model. PRs welcome to add new models.
## Output Format
Default JSON output:
```json
{
"timestamp": "2026-02-27T10:30:00Z",
"query": { "query_type": "rank" },
"sources": [
{
"source": "arena",
"status": "ok",
"scores": [
{ "model": "gpt-5.2", "rank": 1, "metrics": { "elo": 1350 } }
]
}
]
}
```
## Contributing
- **Add a model:** Add an entry to `models.toml` with canonical name and known aliases
- **Add a source:** Implement the `Source` trait in `src/sources/`
PRs welcome.
## License
MIT
## Sister Tools
Part of a family of AI-augmented CLI tools:
- [lustro](https://github.com/terry-li-hm/lustro) — AI news aggregator
- [consilium](https://github.com/terry-li-hm/consilium) — Multi-model deliberation
- [hexis](https://github.com/hexis-framework/hexis) — Meta-cognitive framework