ai-summary 1.2.0

Web search with local LLM summarization — token saver for Claude Code

ai-summary

Web search & summarization CLI for AI coding agents. Reduces token consumption by compressing web content through local LLMs or Gemini before feeding it to Claude Code (or any LLM-powered tool).

How It Works

┌──────────────┐     ┌──────────┐     ┌─────────────┐     ┌──────────────┐
│  Web Search  │────▶│  Fetch   │────▶│ Readability │────▶│  LLM Summary │
│ Gemini/DDG/  │     │  Pages   │     │   Extract   │     │ Local/Remote │
│   Brave      │     │          │     │             │     │              │
└──────────────┘     └──────────┘     └─────────────┘     └──────────────┘
                                                                 │
                                                                 ▼
                                                      ┌───────────────────┐
                                                      │ Compressed output │
                                                      │  (60-98% smaller) │
                                                      └───────────────────┘

Instead of sending 50K+ tokens of raw page content to Claude, ai-summary returns a focused 1-4K-token summary — saving tokens and money.

Features

  • Search + Summarize — Gemini (Google Search grounding), DuckDuckGo, or Brave Search
  • Fetch + Summarize — Fetch any URL, extract article content via readability, summarize with LLM
  • Stdin Summarize — Pipe any text through for compression
  • Fast Compress — No-LLM text extraction for instant compression (used by hooks)
  • JS-heavy Pages — agent-browser and Cloudflare Browser Rendering support
  • Pipe-friendly — cat urls.txt | ai-summary fetch, --json output, standard exit codes
  • Claude Code Hooks — PostToolUse hooks auto-compress WebFetch, WebSearch, and test output
  • Rich Statistics — Time-period breakdown, ROI tracking, per-mode analysis
  • Multiple LLM Backends — opencode (free), oMLX (local), OpenAI, Groq, DeepSeek, or any OpenAI-compatible API

Installation

# From crates.io
cargo install ai-summary

# Or build from source
git clone https://github.com/sunoj/ai-summary.git
cd ai-summary
cargo build --release
cp target/release/ai-summary ~/.local/bin/

Pre-built binaries for macOS (Apple Silicon / Intel) and Linux are available on GitHub Releases.

Requirements: Rust 1.70+, a summarization backend (opencode CLI recommended — free).

Quick Start

# Generate config file
ai-summary config

# Search (uses Gemini CLI > Gemini API > DDG > Brave)
ai-summary "what is the latest Rust version"

# Fetch URLs and summarize
ai-summary fetch https://example.com/article -p "what are the key points"

# Fetch from stdin
cat urls.txt | ai-summary fetch -p "summarize each"

# Compress piped text (no LLM, instant)
echo "large text..." | ai-summary compress -m 4000

# JSON output (for scripting)
ai-summary --json "query" | jq '.summary'

# Check token savings (with time periods and ROI)
ai-summary stats

Configuration

Config file: ~/.ai-summary/config.toml (auto-created with ai-summary config)

# LLM backend — local oMLX (recommended for Apple Silicon)
api_url = "http://127.0.0.1:8000"
api_key = ""  # Leave empty for oMLX auto-detection
model = "Qwen3.5-9B-MLX-4bit"

# Search provider — Gemini + Google Search (recommended)
gemini_api_key = ""  # Free: https://aistudio.google.com/apikey
gemini_model = "gemini-2.0-flash"

# Brave Search fallback (free: https://brave.com/search/api/)
brave_api_key = ""

max_pages = 3
max_page_chars = 4000
max_summary_tokens = 1024

Search priority: Gemini CLI > Gemini API > DuckDuckGo > Brave

Environment variables: GEMINI_API_KEY, BRAVE_API_KEY, AI_SUMMARY_API_URL, AI_SUMMARY_API_KEY, AI_SUMMARY_MODEL.
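For a one-off session, the same settings can be supplied through the environment instead of config.toml — a minimal sketch using the values from the sample config above (the key is a placeholder):

```shell
# Point ai-summary at a local OpenAI-compatible server for this shell session.
export AI_SUMMARY_API_URL="http://127.0.0.1:8000"
export AI_SUMMARY_MODEL="Qwen3.5-9B-MLX-4bit"
export GEMINI_API_KEY="your-gemini-key"   # placeholder
```

Environment variables take effect for child processes only, so they are handy for testing a backend without touching the config file.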

Claude Code Integration

PostToolUse Hooks

Three hooks auto-compress Claude Code tool responses:

Hook              Matcher    Behavior
postwebfetch.sh   WebFetch   First fetch: compress (skips if <10% savings). Second fetch of same URL: pass through
postwebsearch.sh  WebSearch  Compress long search results (skips if <10% savings)
postbash.sh       Bash       Summarize passing test output with structured totals (cargo test, npm test, pytest, etc.)

Add to ~/.claude/settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "WebFetch",
        "hooks": [{ "type": "command", "command": "/path/to/hooks/postwebfetch.sh" }]
      },
      {
        "matcher": "WebSearch",
        "hooks": [{ "type": "command", "command": "/path/to/hooks/postwebsearch.sh" }]
      },
      {
        "matcher": "Bash",
        "hooks": [{ "type": "command", "command": "/path/to/hooks/postbash.sh" }]
      }
    ]
  }
}

Requires jq and ai-summary in PATH.

Note: PostToolUse hooks only run for successful commands (exit 0). Failed test output passes through directly to Claude.
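Claude Code delivers the tool event to a hook as JSON on stdin. A sketch of the extraction step such a script might perform, using a canned event in place of a live one — the field names are assumptions, and the shipped hook scripts are the reference:

```shell
# Illustrative PostToolUse event; in a real hook this arrives on stdin.
event='{"tool_name":"WebFetch","tool_response":{"result":"long page text here"}}'

# A hook like postwebfetch.sh would pull the tool output with jq before
# piping it through `ai-summary compress`.
text=$(printf '%s' "$event" | jq -r '.tool_response.result')
echo "$text"
```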

Subcommands

Command                              Description
ai-summary <query>                   Search the web and summarize results
ai-summary fetch <urls> -p <prompt>  Fetch URLs and summarize
ai-summary sum <prompt>              Summarize stdin text via LLM
ai-summary compress -m <chars>       Fast text compression (no LLM)
ai-summary crawl <url> -p <prompt>   Crawl a website via Cloudflare Browser Rendering
ai-summary stats                     Show token savings statistics
ai-summary reset-stats               Reset statistics
ai-summary config                    Show or create the config file

Flags

Flag       Description
--deep     Fetch more pages (5 instead of 3)
--raw      Skip summarization, return raw content
--json     Structured JSON output (for scripting/piping)
--browser  Use agent-browser for JS-heavy pages
--cf       Use Cloudflare Browser Rendering
--api-url  Override the API endpoint
--api-key  Override the API key
--model    Override the model name

Exit Codes

Code  Meaning
0     Success
1     User error (bad args, no input)
2     API/network error (no results, fetch failed)
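Scripts can branch on these codes. A sketch of the dispatch, using a stand-in command so the snippet is self-contained — swap in a real ai-summary invocation in practice:

```shell
# Stand-in that fails the way an API/network error would (exit code 2).
run() { sh -c 'exit 2'; }

if run; then
  status="ok"
else
  case $? in
    1) status="user error" ;;
    2) status="api/network error" ;;
    *) status="unknown" ;;
  esac
fi
echo "$status"
```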

Statistics

ai-summary Token Savings
════════════════════════════════════════════════════════════

Metric               Today   7 days  30 days   All Time
────────────────────────────────────────────────────────────
Queries                  8       17       17         21
Pages fetched            8       17       17         17
Tokens saved         14.4K    29.2K    29.2K      31.3K
Cost saved           $0.04    $0.09    $0.09      $0.09
Compression            84%      84%      84%        76%
────────────────────────────────────────────────────────────

ROI: $0.011 LLM cost -> $0.09 Claude cost saved (9x return)

By Mode (hooks: 0, manual: 17)
────────────────────────────────────────────────────────────────────
  #  Mode            Count    Saved   Avg%     Time  Impact
────────────────────────────────────────────────────────────────────
  1.  gemini-cli        7    21.0K   85.6%    65.1s  ██████████
  2.  fetch             6     3.6K   70.9%    16.0s  █░░░░░░░░░
  3.  gemini            1     3.3K   93.3%     4.1s  █░░░░░░░░░
  4.  stdin             3     1.3K   77.3%    12.0s  ░░░░░░░░░░
────────────────────────────────────────────────────────────────────
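The ROI line compares what the local LLM cost against what the equivalent Claude input tokens would have cost. A back-of-envelope version of that arithmetic, using the 7-day figures from the table above and an assumed Claude input price — not the tool's internal formula:

```shell
tokens_saved=29200        # "Tokens saved" over 7 days, from the table above
price_per_mtok=3.0        # assumed Claude input price, $ per 1M tokens
llm_cost=0.011            # local LLM spend, as reported in the ROI line

saved=$(awk -v t="$tokens_saved" -v p="$price_per_mtok" 'BEGIN { printf "%.2f", t * p / 1e6 }')
roi=$(awk -v s="$saved" -v c="$llm_cost" 'BEGIN { printf "%.0f", s / c }')
echo "saved=\$$saved roi=${roi}x"
```

Working from the rounded table values gives a slightly different multiple than the 9x shown above; the tool presumably computes from unrounded figures.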

License

MIT