yoyo-agent 0.1.7

# Cost Tracking

yoyo estimates the cost of each interaction so you can monitor spending.

## Per-turn costs

After each response, you'll see a compact token summary:

```
  ↳ 3.2s · 1523→842 tokens · $0.0234
```

With `--verbose` (or `-v`), you get the full breakdown:

```
  tokens: 1523 in / 842 out  [cache: 1000 read, 500 write]  (session: 4200 in / 2100 out)  cost: $0.0234  total: $0.0567  ⏱ 3.2s
```

- **cost** — estimated cost for this turn
- **total** — estimated cumulative cost for the session

## Quick cost check

Use `/cost` for a quick overview with a breakdown by cost category:

```
  Session cost: $0.0567
    4.2k in / 2.1k out
    cache: 1.0k read / 500 write

    Breakdown:
      input:       $0.0126
      output:      $0.0315
      cache write: $0.0031
      cache read:  $0.0005
```

## Detailed breakdown

Use `/tokens` to see a full breakdown including cache usage:

```
  Session totals:
    input:       120.5k tokens
    output:      45.2k tokens
    cache read:  30.0k tokens
    cache write: 15.0k tokens
    est. cost:   $0.892
```

## Supported models

Costs are estimated based on published pricing for all major providers:

### Anthropic

| Model | Input | Cache Write | Cache Read | Output |
|-------|-------|-------------|------------|--------|
| Opus 4.5/4.6 | $5/MTok | $6.25/MTok | $0.50/MTok | $25/MTok |
| Opus 4/4.1 | $15/MTok | $18.75/MTok | $1.50/MTok | $75/MTok |
| Sonnet | $3/MTok | $3.75/MTok | $0.30/MTok | $15/MTok |
| Haiku 4.5 | $1/MTok | $1.25/MTok | $0.10/MTok | $5/MTok |
| Haiku 3.5 | $0.80/MTok | $1/MTok | $0.08/MTok | $4/MTok |

### OpenAI

| Model | Input | Output |
|-------|-------|--------|
| GPT-4.1 | $2/MTok | $8/MTok |
| GPT-4.1 Mini | $0.40/MTok | $1.60/MTok |
| GPT-4.1 Nano | $0.10/MTok | $0.40/MTok |
| GPT-4o | $2.50/MTok | $10/MTok |
| GPT-4o Mini | $0.15/MTok | $0.60/MTok |
| o3 | $2/MTok | $8/MTok |
| o3-mini | $1.10/MTok | $4.40/MTok |
| o4-mini | $1.10/MTok | $4.40/MTok |

### Google

| Model | Input | Output |
|-------|-------|--------|
| Gemini 2.5 Pro | $1.25/MTok | $10/MTok |
| Gemini 2.5 Flash | $0.15/MTok | $0.60/MTok |
| Gemini 2.0 Flash | $0.10/MTok | $0.40/MTok |

### DeepSeek

| Model | Input | Output |
|-------|-------|--------|
| DeepSeek Chat/V3 | $0.27/MTok | $1.10/MTok |
| DeepSeek Reasoner/R1 | $0.55/MTok | $2.19/MTok |

### Mistral

| Model | Input | Output |
|-------|-------|--------|
| Mistral Large | $2/MTok | $6/MTok |
| Mistral Small | $0.10/MTok | $0.30/MTok |
| Codestral | $0.30/MTok | $0.90/MTok |

### xAI (Grok)

| Model | Input | Output |
|-------|-------|--------|
| Grok 3 | $3/MTok | $15/MTok |
| Grok 3 Mini | $0.30/MTok | $0.50/MTok |
| Grok 2 | $2/MTok | $10/MTok |

### Groq (hosted models)

| Model | Input | Output |
|-------|-------|--------|
| Llama 3.3 70B | $0.59/MTok | $0.79/MTok |
| Llama 3.1 8B | $0.05/MTok | $0.08/MTok |
| Mixtral 8x7B | $0.24/MTok | $0.24/MTok |
| Gemma2 9B | $0.20/MTok | $0.20/MTok |

MTok = million tokens.

### OpenRouter

Models accessed through OpenRouter (e.g., `anthropic/claude-sonnet-4-20250514`) are automatically recognized — the provider prefix is stripped before matching.

## Limitations

- Cost estimates are approximate — actual billing may differ slightly
- For unrecognized models, no cost estimate is shown
- Cache read/write costs only apply to Anthropic models; other providers show zero cache costs
- Pricing may change — check your provider's pricing page for the latest rates

## Keeping costs down

- Use smaller models (Haiku, Sonnet, GPT-4.1 Mini, Gemini Flash) for simple tasks
- Use `/compact` to reduce context size (fewer input tokens per turn)
- Use single-prompt mode (`-p`) for quick questions to avoid accumulating context
- Turn off extended thinking for routine tasks