# ai-summary

Web search & summarization CLI for AI coding agents. Reduces token consumption by compressing web content through local LLMs or Gemini before feeding it to Claude Code (or any LLM-powered tool).
## How It Works

```
┌──────────────┐     ┌──────────┐     ┌─────────────┐     ┌──────────────┐
│  Web Search  │────▶│  Fetch   │────▶│ Readability │────▶│ LLM Summary  │
│ Gemini/DDG/  │     │  Pages   │     │   Extract   │     │ Local/Remote │
│    Brave     │     │          │     │             │     │              │
└──────────────┘     └──────────┘     └─────────────┘     └──────────────┘
                                                                 │
                                                                 ▼
                                                      ┌───────────────────┐
                                                      │ Compressed output │
                                                      │ (60-98% smaller)  │
                                                      └───────────────────┘
```
Instead of sending raw 50K+ page content to Claude, ai-summary returns a focused 1-4K summary — saving tokens and money.
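As a back-of-envelope illustration (treating both figures as character counts and using the rough ~4-characters-per-token heuristic; the exact ratio depends on the tokenizer):

```sh
# Rough token-savings estimate; 4 chars/token is a heuristic, not exact
raw_chars=50000                           # a typical large page
summary_chars=4000                        # upper end of the summary range
raw_tokens=$(( raw_chars / 4 ))           # ~12500 tokens
summary_tokens=$(( summary_chars / 4 ))   # ~1000 tokens
pct_saved=$(( (raw_tokens - summary_tokens) * 100 / raw_tokens ))
echo "saved ~${pct_saved}% of input tokens"
```

That lands at the upper end of the 60-98% range quoted above.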
## Features

- **Search + Summarize** — Gemini (Google Search grounding), DuckDuckGo, or Brave Search
- **Fetch + Summarize** — fetch any URL, extract article content via readability, summarize with an LLM
- **Stdin Summarize** — pipe any text through for compression
- **Fast Compress** — no-LLM text extraction for instant compression (used by hooks)
- **JS-heavy Pages** — agent-browser and Cloudflare Browser Rendering support
- **Pipe-friendly** — `cat urls.txt | ai-summary fetch`, `--json` output, standard exit codes
- **Claude Code Hooks** — PostToolUse hooks auto-compress WebFetch, WebSearch, and test output
- **Rich Statistics** — time-period breakdown, ROI tracking, per-mode analysis
- **Multiple LLM Backends** — opencode (free), oMLX (local), OpenAI, Groq, DeepSeek, or any OpenAI-compatible API
## Installation

```sh
# From crates.io
cargo install ai-summary

# Or build from source
cargo install --path .
```

Pre-built binaries for macOS (Apple Silicon / Intel) and Linux are available on GitHub Releases.

Requirements: Rust 1.70+ and a summarization backend (the opencode CLI is recommended — free).
## Quick Start

```sh
# Generate config file
ai-summary config

# Search (uses Gemini CLI > Gemini API > DDG > Brave)
ai-summary "rust async traits stabilization"

# Fetch URLs and summarize
ai-summary fetch https://example.com -p "summarize the key points"

# Fetch from stdin
cat urls.txt | ai-summary fetch -p "summarize the key points"

# Compress piped text (no LLM, instant)
cat build-output.txt | ai-summary compress

# JSON output (for scripting)
ai-summary "rust async traits stabilization" --json | jq .

# Check token savings (with time periods and ROI)
ai-summary stats
```
## Configuration

Config file: `~/.ai-summary/config.toml`, auto-created with `ai-summary config` (run that command to see the exact keys; the limit key names below are inferred):

```toml
# LLM backend — local oMLX (recommended for Apple Silicon)
api_url = "http://127.0.0.1:8000"
api_key = ""                      # Leave empty for oMLX auto-detection
model = "Qwen3.5-9B-MLX-4bit"

# Search provider — Gemini + Google Search (recommended)
gemini_api_key = ""               # Free: https://aistudio.google.com/apikey
gemini_model = "gemini-2.0-flash"

# Brave Search fallback (free: https://brave.com/search/api/)
brave_api_key = ""

max_pages = 3                     # pages fetched per search (--deep raises this to 5)
max_summary_chars = 4000
max_tokens = 1024
```

Search priority: Gemini CLI > Gemini API > DuckDuckGo > Brave.

Environment variables: `GEMINI_API_KEY`, `BRAVE_API_KEY`, `AI_SUMMARY_API_URL`, `AI_SUMMARY_API_KEY`, `AI_SUMMARY_MODEL`.
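Conceptually, the search-provider fallback works like this (a sketch, not the actual implementation; the `gemini` binary name is an assumption):

```sh
# Illustrative sketch of the provider priority order
pick_search_backend() {
  if command -v gemini >/dev/null 2>&1; then
    echo "gemini-cli"               # Gemini CLI installed: highest priority
  elif [ -n "${GEMINI_API_KEY:-}" ]; then
    echo "gemini-api"               # no CLI, but an API key is configured
  else
    echo "duckduckgo"               # keyless DDG; Brave is tried only if DDG fails
  fi
}
```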
## Claude Code Integration

### PostToolUse Hooks

Three hooks auto-compress Claude Code tool responses:

| Hook | Matcher | Behavior |
|---|---|---|
| `postwebfetch.sh` | WebFetch | First fetch: compress (skips if <10% savings). Second fetch of the same URL: pass through |
| `postwebsearch.sh` | WebSearch | Compress long search results (skips if <10% savings) |
| `postbash.sh` | Bash | Summarize passing test output with structured totals (`cargo test`, `npm test`, `pytest`, etc.) |
Add to ~/.claude/settings.json:
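For example (the hook script locations are assumptions — point the commands at wherever the scripts are installed on your system):

```json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "WebFetch",
        "hooks": [{ "type": "command", "command": "~/.ai-summary/hooks/postwebfetch.sh" }]
      },
      {
        "matcher": "WebSearch",
        "hooks": [{ "type": "command", "command": "~/.ai-summary/hooks/postwebsearch.sh" }]
      },
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "~/.ai-summary/hooks/postbash.sh" }]
      }
    ]
  }
}
```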
Requires jq and ai-summary in PATH.
Note: PostToolUse hooks only run for successful commands (exit 0). Failed test output passes through directly to Claude.
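Conceptually, each hook is a small filter over the tool's output — roughly like this simplified sketch (not the shipped script; the 2000-character threshold is illustrative, and the real hooks parse Claude Code's JSON envelope with `jq` first):

```sh
# Simplified hook logic: pass short output through, compress long output
compress_if_long() {
  local text
  text=$(cat)                          # hook receives the tool output on stdin
  if command -v ai-summary >/dev/null 2>&1 && [ "${#text}" -gt 2000 ]; then
    printf '%s\n' "$text" | ai-summary compress -m 2000
  else
    printf '%s\n' "$text"              # short (or ai-summary missing): pass through
  fi
}
```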
## Subcommands

| Command | Description |
|---|---|
| `ai-summary <query>` | Search the web and summarize results |
| `ai-summary fetch <urls> -p <prompt>` | Fetch URLs and summarize |
| `ai-summary sum <prompt>` | Summarize stdin text via LLM |
| `ai-summary compress -m <chars>` | Fast text compression (no LLM) |
| `ai-summary crawl <url> -p <prompt>` | Crawl a website via Cloudflare Browser Rendering |
| `ai-summary stats` | Show token savings statistics |
| `ai-summary reset-stats` | Reset statistics |
| `ai-summary config` | Show or create the config file |
## Flags

| Flag | Description |
|---|---|
| `--deep` | Fetch more pages (5 instead of 3) |
| `--raw` | Skip summarization, return raw content |
| `--json` | Structured JSON output (for scripting/piping) |
| `--browser` | Use agent-browser for JS-heavy pages |
| `--cf` | Use Cloudflare Browser Rendering |
| `--api-url` | Override the API endpoint |
| `--api-key` | Override the API key |
| `--model` | Override the model name |
## Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | User error (bad args, no input) |
| 2 | API/network error (no results, fetch failed) |
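In scripts, the codes let you distinguish retryable failures from usage errors — for example (a sketch; the retry-once policy is just an illustration):

```sh
# Retry only on API/network errors (exit 2); never on user errors (exit 1)
search_with_retry() {
  ai-summary "$1" --json && return 0
  local status=$?
  if [ "$status" -eq 2 ]; then
    ai-summary "$1" --json           # one retry for transient failures
  else
    return "$status"                 # bad args won't improve on retry
  fi
}
```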
## Statistics

Sample `ai-summary stats` output:

```
ai-summary Token Savings
════════════════════════════════════════════════════════════
Metric          Today     7 days    30 days   All Time
────────────────────────────────────────────────────────────
Queries         8         17        17        21
Pages fetched   8         17        17        17
Tokens saved    14.4K     29.2K     29.2K     31.3K
Cost saved      $0.04     $0.09     $0.09     $0.09
Compression     84%       84%       84%       76%
────────────────────────────────────────────────────────────
ROI: $0.011 LLM cost -> $0.09 Claude cost saved (9x return)

By Mode (hooks: 0, manual: 17)
────────────────────────────────────────────────────────────────────
 #  Mode         Count   Saved   Avg%    Time    Impact
────────────────────────────────────────────────────────────────────
 1. gemini-cli   7       21.0K   85.6%   65.1s   ██████████
 2. fetch        6       3.6K    70.9%   16.0s   █░░░░░░░░░
 3. gemini       1       3.3K    93.3%   4.1s    █░░░░░░░░░
 4. stdin        3       1.3K    77.3%   12.0s   ░░░░░░░░░░
────────────────────────────────────────────────────────────────────
```
## License

MIT