nab
Token-optimized web fetcher + multilingual ASR + URL watcher. MCP 2025-11-25 compliant. Rust. macOS arm64 first, cross-platform.

nab is a single Rust binary that does three things very well: it fetches any URL as clean markdown (with your real browser cookies and anti-bot evasion), it analyzes any audio or video file with on-device multilingual ASR and speaker diarization, and it watches any URL for changes and pushes notifications when content moves. Everything runs locally. There are no API keys to set up by default. The output is shaped for LLM context windows.
Quick start
Features
| Command | What it does |
|---|---|
| `nab fetch <url>` | Fetch any URL as clean markdown. HTTP/3, browser cookie injection (Brave / Chrome / Firefox / Safari / Edge / Dia), 1Password auto-login, fingerprint spoofing, 11 site providers, query-focused extraction, token budget. |
| `nab analyze <video\|audio>` | Transcribe and diarize. FluidAudio (Parakeet TDT v3) on Apple Neural Engine, 131x realtime on a 2-hour clip, word-level timestamps, 25 EU languages, optional Qwen3-ASR for zh/ja/ko/vi, optional active reading via MCP sampling. |
| `nab watch add <url>` | Monitor a URL and push notifications via subscribable MCP resources. RSS for the entire web. Conditional GETs, semantic diff, adaptive backoff. |
| `nab models fetch <name>` | Persistent install of inference model binaries. Currently fluidaudio. Whisper and sherpa-onnx land in Phase 3. |
| `nab-mcp` | MCP 2025-11-25 server. stdio + Streamable HTTP. 11 tools, 3 prompts, 2+N resources, structured logging, sampling, roots, elicitation. |
| `nab::content::ocr` | Apple Vision OCR engine. 15 languages. Apple Neural Engine accelerated. ~10-50 ms per image. macOS only. |
Installation
Homebrew (macOS, recommended)
From crates.io
Requires Rust 1.93 or newer.
Pre-built binary
Or download directly from GitHub Releases:
| Platform | Binary |
|---|---|
| macOS Apple Silicon | nab-aarch64-apple-darwin |
| macOS Intel | nab-x86_64-apple-darwin |
| Linux x86_64 | nab-x86_64-unknown-linux-gnu |
| Linux ARM64 | nab-aarch64-unknown-linux-gnu |
| Windows x64 | nab-x86_64-pc-windows-msvc.exe |
From source
Usage
Fetch
# Basic fetch — auto-detects browser, returns markdown
# Use cookies from a specific browser
# 1Password auto-login (TOTP/MFA supported)
# Google Workspace (Docs, Sheets, Slides) with comments
# Query-focused extraction — only sections relevant to "authentication"
# Output JSON with confidence scores
# Batch fetch with parallelism
Common flags for fetch:
| Flag | Description |
|---|---|
| `--cookies <browser>` | auto, brave, chrome, firefox, safari, edge, none |
| `--1password` / `--op` | 1Password credential lookup + auto-login |
| `--proxy <url>` | HTTP or SOCKS5 proxy |
| `--format <fmt>` | full (default), compact, json |
| `--focus <query>` | BM25-lite query-focused extraction |
| `--max-tokens <n>` | Structure-aware token budget |
| `--raw-html` | Skip markdown conversion |
| `--diff` | Show what changed since the last fetch |
| `--session <name>` | Persistent named session with cookie store |
| `-X <method> -d <data>` | HTTP method + body |
| `-o <path>` | Write body to file |
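
To give a feel for what query-focused extraction with `--focus` involves, here is a minimal sketch of textbook BM25 scoring over page sections. It is not nab's implementation: the "lite" variant nab uses may differ, and the function names `tokenize` and `bm25_scores` as well as the constants `k1 = 1.2` and `b = 0.75` are assumptions chosen for illustration.

```rust
/// Lowercase a string and split it into alphanumeric terms.
fn tokenize(text: &str) -> Vec<String> {
    text.to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(str::to_string)
        .collect()
}

/// Score each section against the query with the standard BM25 formula.
/// Returns one score per section, in input order (hypothetical helper).
fn bm25_scores(sections: &[&str], query: &str) -> Vec<f64> {
    let (k1, b) = (1.2_f64, 0.75_f64); // common defaults, assumed here
    let docs: Vec<Vec<String>> = sections.iter().map(|s| tokenize(s)).collect();
    let n = docs.len() as f64;
    let total: usize = docs.iter().map(|d| d.len()).sum();
    // Guard against division by zero when every section is empty.
    let avg_len = if total == 0 { 1.0 } else { total as f64 / n };

    // Inverse document frequency for each query term.
    let idf: Vec<(String, f64)> = tokenize(query)
        .into_iter()
        .map(|term| {
            let n_t = docs.iter().filter(|d| d.contains(&term)).count() as f64;
            let weight = ((n - n_t + 0.5) / (n_t + 0.5) + 1.0).ln();
            (term, weight)
        })
        .collect();

    docs.iter()
        .map(|doc| {
            let len = doc.len() as f64;
            idf.iter()
                .map(|(term, weight)| {
                    // Term frequency of this query term in the section.
                    let tf = doc.iter().filter(|t| *t == term).count() as f64;
                    weight * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * len / avg_len))
                })
                .sum::<f64>()
        })
        .collect()
}
```

A caller would split the markdown into sections (for example, at headings), score them against the query, and keep the highest-scoring sections until the token budget is used up.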
Analyze
nab analyze transcribes audio and video files locally. The default backend on macOS arm64 is FluidAudio, which runs Parakeet TDT v3 on the Apple Neural Engine.
# Download the ASR model (~600 MB, one-time)
# Transcribe a video
# Add speaker diarization (PyAnnote community-1)
# Force a language hint (BCP-47)
# Word-level timestamps
# Active reading: nab uses MCP sampling to look up references mentioned in the audio
# Expose speaker embeddings for matching against hebb's voiceprint database
# Output JSON
Real numbers from a 2 h 09 m English audio file (Karen Hao interview, MacBook Pro M-series):
| Metric | Value |
|---|---|
| Wall time | 59.6 s |
| Realtime factor | 131x |
| FluidAudio mean confidence | 97.18 % |
| Audio extraction (ffmpeg) | ~650x realtime |
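
The realtime factor in the table is audio duration divided by wall time. A small sketch of that arithmetic follows; the function name `realtime_factor` is made up for illustration. Using exactly 2 h 09 m (7740 s) and 59.6 s gives roughly 130x, which is in line with the reported 131x if the clip runs slightly past the rounded 2 h 09 m.

```rust
/// Realtime factor: seconds of audio processed per second of wall time.
fn realtime_factor(audio_secs: f64, wall_secs: f64) -> f64 {
    audio_secs / wall_secs
}

/// Convert hours and minutes to seconds.
fn hm_to_secs(hours: u32, minutes: u32) -> f64 {
    f64::from(hours * 3600 + minutes * 60)
}
```

For the run above, `realtime_factor(hm_to_secs(2, 9), 59.6)` evaluates to a little under 130.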
| Backend | Platform | Languages | Diarization |
|---|---|---|---|
| `fluidaudio` (default on macOS arm64) | macOS arm64 | 25 EU languages, +zh/ja/ko/vi via Qwen3-ASR (opt-in) | PyAnnote community-1 |
| `sherpa-onnx` (Phase 3) | Linux/x86, macOS, Windows | Parakeet ONNX, 25+ langs | sherpa-onnx pyannote-seg-3.0 |
| `whisper-rs` (Phase 3) | Universal fallback | whisper-large-v3-turbo, 99 langs | none |
Watch
nab watch turns any URL into a subscribable resource. MCP clients receive notifications/resources/updated when the content changes.
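
For orientation, this is roughly what the notification looks like on the wire. The sketch builds the JSON-RPC message by hand using only the standard library. The method name and the `uri` parameter come from the MCP specification; the function name `resource_updated` and the `nab://watch/example` URI used in the usage note are hypothetical, since the source does not state which URI scheme nab assigns to watch resources.

```rust
/// Build a `notifications/resources/updated` JSON-RPC notification.
/// Notifications carry no `id`, because no response is expected.
fn resource_updated(uri: &str) -> String {
    // Minimal escaping so the URI stays a valid JSON string.
    let escaped = uri.replace('\\', "\\\\").replace('"', "\\\"");
    format!(
        r#"{{"jsonrpc":"2.0","method":"notifications/resources/updated","params":{{"uri":"{escaped}"}}}}"#
    )
}
```

Calling `resource_updated("nab://watch/example")` yields a single-line JSON object. A client that subscribed to that URI would typically respond by re-reading the resource with `resources/read`.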
Per-watch options:
| Flag | Default | Description |
|---|---|---|
| `--interval <duration>` | 1h | Polling interval (5m, 1h, 24h) |
| `--selector <css>` | none | CSS selector to scope diff to one element |
| `--notify-on <kind>` | `any` | any, regression, semantic |
| `--diff <kind>` | `semantic` | text, semantic, dom |
The poller uses conditional GETs (If-None-Match, If-Modified-Since), so 304 responses cost effectively nothing. Watches with five consecutive failures auto-mute. Adaptive backoff applies on 429 and 503.
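
The polling behaviour described above can be sketched as a small state machine. This is an illustration under stated assumptions, not nab's code: the struct and field names are invented, the backoff is assumed to double the interval per 429/503 (the source only says "adaptive"), and a 429/503 is assumed to count toward the five-failure mute threshold.

```rust
use std::time::Duration;

/// Per-watch polling state (names are illustrative, not nab's).
#[derive(Debug, Default)]
struct WatchState {
    etag: Option<String>,
    last_modified: Option<String>,
    consecutive_failures: u32,
    backoff_steps: u32,
    muted: bool,
}

/// Mute threshold taken from the text: five consecutive failures.
const MAX_FAILURES: u32 = 5;

impl WatchState {
    /// Validators to send so an unchanged page can answer 304 Not Modified.
    fn conditional_headers(&self) -> Vec<(&'static str, String)> {
        let mut headers = Vec::new();
        if let Some(etag) = &self.etag {
            headers.push(("If-None-Match", etag.clone()));
        }
        if let Some(date) = &self.last_modified {
            headers.push(("If-Modified-Since", date.clone()));
        }
        headers
    }

    /// Update state from an HTTP status and return the delay before the next poll.
    fn on_status(&mut self, status: u16, interval: Duration) -> Duration {
        match status {
            200 | 304 => {
                self.consecutive_failures = 0;
                self.backoff_steps = 0;
            }
            429 | 503 => {
                // Assumed policy: double the delay per throttled response, capped at 64x.
                self.backoff_steps = (self.backoff_steps + 1).min(6);
                self.consecutive_failures += 1;
            }
            _ => self.consecutive_failures += 1,
        }
        if self.consecutive_failures >= MAX_FAILURES {
            self.muted = true;
        }
        interval * 2u32.pow(self.backoff_steps)
    }
}
```

In this sketch a muted watch stays muted until something clears the flag explicitly; whether nab un-mutes automatically is not stated in the source.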
Models
Phase 3 will add whisper and sherpa-onnx subcommands.
MCP integration
nab-mcp is a native Rust MCP server. It runs over stdio (default) or Streamable HTTP. It is fully compliant with MCP protocol version 2025-11-25.
Claude Code / Claude Desktop
Add to your MCP client configuration (~/.config/claude/mcp.json or equivalent):
Continue / Zed / Cursor / Windsurf
Same shape — point command at the nab-mcp binary.
HTTP transport
The HTTP transport binds to localhost by default. Origin checks and MCP-Protocol-Version header validation are enforced per spec.
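
A rough sketch of what those two checks amount to is shown below. It is not nab's code: the names `Check` and `check_request` are invented, only `localhost` and `127.0.0.1` origins are accepted (IPv6 loopback is left out for brevity), and a request without the version header is accepted here, which is a simplification. The 403 and 400 status codes reflect my reading of the Streamable HTTP transport section of the spec.

```rust
/// Outcome of checking an incoming Streamable HTTP request.
#[derive(Debug, PartialEq)]
enum Check {
    Ok,
    Forbidden,  // would map to 403: Origin not allowed
    BadRequest, // would map to 400: unsupported protocol version
}

const SUPPORTED_VERSION: &str = "2025-11-25";

/// `origin` and `version` are the raw `Origin` and
/// `MCP-Protocol-Version` header values, if present.
fn check_request(origin: Option<&str>, version: Option<&str>) -> Check {
    if let Some(origin) = origin {
        // Strip the scheme, then the port, to get the bare host.
        let host = origin
            .strip_prefix("http://")
            .or_else(|| origin.strip_prefix("https://"))
            .unwrap_or(origin);
        let host = host.split(':').next().unwrap_or(host);
        if !matches!(host, "localhost" | "127.0.0.1") {
            return Check::Forbidden;
        }
    }
    match version {
        Some(v) if v != SUPPORTED_VERSION => Check::BadRequest,
        _ => Check::Ok,
    }
}
```

Rejecting foreign origins is what protects a localhost-bound server from DNS rebinding attacks launched from a web page in the user's browser.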
MCP capabilities
| Capability | Status |
|---|---|
| Tools | 11 tools with structured output schemas, annotations, validation errors |
| Prompts | 3 prompts (fetch-and-extract, multi-page-research, authenticated-fetch, match-speakers-with-hebb) |
| Resources | 2 static + N dynamic watch resources, all subscribable |
| Logging | notifications/message with RFC 5424 levels |
| Sampling | nab calls back to the host LLM for active reading, focus extraction, form auto-fill |
| Roots | roots/list queried for workspace-scoped saves |
| Elicitation | Form mode + URL mode for OAuth/SSO |
| Argument completion | completion/complete for tool args |
| Server icons | Light + dark SVG |
| Transports | stdio + Streamable HTTP (resumable, session-scoped) |
The 11 MCP tools:
| Tool | Description |
|---|---|
| `fetch` | Fetch URL → markdown, with cookies, focus, token budget, session |
| `fetch_batch` | Parallel multi-URL fetch with task-augmented async execution |
| `submit` | Submit a form with CSRF + smart field extraction |
| `login` | 1Password auto-login with TOTP support |
| `auth_lookup` | Look up 1Password credentials for a URL |
| `fingerprint` | Generate browser fingerprint profiles |
| `validate` | Run the validation test suite |
| `benchmark` | Time URL fetches with stats |
| `analyze` | Transcribe and diarize audio/video |
| `watch_create` | Create a URL watch and subscribe |
| `watch_list` / `watch_remove` | Manage watches |
Site providers
nab detects URLs for 11 platforms and uses their APIs or structured data instead of scraping HTML.
| Provider | URL pattern | Method |
|---|---|---|
| Twitter / X | `x.com/*/status/*` | FxTwitter API |
| Reddit | `reddit.com/r/*/comments/*` | JSON API |
| Hacker News | `news.ycombinator.com/item?id=*` | Firebase API |
| GitHub | `github.com/*/*/issues/*`, `*/pull/*` | REST API |
| Google Workspace | Docs, Sheets, Slides | Export API + OOXML |
| YouTube | `youtube.com/watch?v=*`, `youtu.be/*` | oEmbed |
| Wikipedia | `*.wikipedia.org/wiki/*` | REST API |
| StackOverflow | `stackoverflow.com/questions/*` | API |
| Mastodon | `*/users/*/statuses/*` | ActivityPub |
| LinkedIn | `linkedin.com/posts/*` | oEmbed |
| Instagram | `instagram.com/p/*`, `*/reel/*` | oEmbed |
If no provider matches, nab falls back to standard HTML fetch + markdown conversion.
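
The dispatch-with-fallback idea can be sketched in a few lines. The example below covers only three of the patterns from the table and uses plain string matching; the `Provider` enum and `detect_provider` function are hypothetical names, and nab's real matching logic may work quite differently.

```rust
/// A few of the providers from the table, plus the generic fallback.
#[derive(Debug, PartialEq)]
enum Provider {
    HackerNews,
    Reddit,
    YouTube,
    GenericHtml,
}

/// Pick a provider from the URL's host and path (illustrative only).
fn detect_provider(url: &str) -> Provider {
    // Drop the scheme and a leading "www." before splitting host from path.
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))
        .unwrap_or(url);
    let rest = rest.strip_prefix("www.").unwrap_or(rest);
    let (host, path) = match rest.find('/') {
        Some(i) => rest.split_at(i),
        None => (rest, "/"),
    };
    match host {
        "news.ycombinator.com" if path.starts_with("/item?id=") => Provider::HackerNews,
        "reddit.com" if path.starts_with("/r/") && path.contains("/comments/") => Provider::Reddit,
        "youtube.com" if path.starts_with("/watch?v=") => Provider::YouTube,
        "youtu.be" if path.len() > 1 => Provider::YouTube,
        // Anything else takes the standard HTML fetch + markdown path.
        _ => Provider::GenericHtml,
    }
}
```

Keeping the fallback as an explicit variant means every URL resolves to some fetch strategy, so callers never have to handle a "no provider" error.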
Architecture
nab is built around a small set of orthogonal subsystems: cmd/ (CLI), bin/mcp_server/ (MCP server), content/ (HTML / PDF / OCR pipeline), analyze/ (ASR + diarization + vision), watch/ (URL monitoring + subscriptions), auth/ (cookies + 1Password + WebAuthn), site/ (per-site providers), and the shared AcceleratedClient (HTTP/3 + connection pool + fingerprint store).
See:
- docs/ARCHITECTURE.md — full module map and data flow
- docs/sovereign-stack.md — how nab composes with hebb to form a local-first multimodal stack
- docs/getting-started.md — new user onboarding
Design notes
The docs/design/ directory tracks recent design proposals:
- analyze-v2.md — multilingual ASR + diarization + vision pipeline
- url-watch-resources.md — URL watch as MCP subscribable resources
- active-reading.md — active reading via MCP sampling
- mcp-spec-closure.md — closing the last MCP 2025-11-25 spec gaps
Companion tools
nab is half of a sovereign multimodal stack. The other half is hebb, a neuroscience-inspired memory MCP server. Composition examples:
- nab analyze --diarize --include-embeddings → hebb voice_match → speakers labeled with names
- nab fetch URL → hebb kv_set → personal sovereign web memory
- nab watch add URL → hebb kv_set (on update) → time-series of changes to any web page
See docs/sovereign-stack.md for the full composition story.
Configuration
nab requires no configuration files. It uses smart defaults: auto-detected browser cookies, randomized fingerprints, and markdown output.
Persistent state lives in ~/.nab/ and ~/.local/share/nab/:
| Path | Purpose |
|---|---|
| `~/.nab/snapshots/` | Content snapshots for --diff change detection |
| `~/.nab/sessions/` | Saved login sessions |
| `~/.nab/fingerprint_versions.json` | Cached browser versions (auto-updates every 14 days) |
| `~/.local/share/nab/watches/` | URL watch state |
| `~/.local/share/nab/models/` | Installed inference model binaries |
Optional plugin configuration at ~/.config/nab/plugins.toml. See docs/getting-started.md for plugin examples.
Environment variables
| Variable | Purpose |
|---|---|
| `HTTPS_PROXY` / `https_proxy` | HTTPS proxy URL |
| `HTTP_PROXY` / `http_proxy` | HTTP proxy URL |
| `ALL_PROXY` / `all_proxy` | Proxy for all protocols |
| `RUST_LOG` | Logging level (e.g., nab=debug) |
| `PUSHOVER_USER` / `PUSHOVER_TOKEN` | Pushover notifications for MFA |
| `TELEGRAM_BOT_TOKEN` / `TELEGRAM_CHAT_ID` | Telegram notifications for MFA |
Library usage
use AcceleratedClient;
async
Requirements
- Rust 1.93+ for building from source
- ffmpeg for the `analyze` and `stream` commands: `brew install ffmpeg`
- 1Password CLI (optional, for credential integration): see 1Password docs
Contributing
See CONTRIBUTING.md for development setup, code style guidelines, testing instructions, and pull request process.
Responsible use
This tool includes browser cookie extraction and fingerprint spoofing capabilities. They are intended for legitimate use cases — accessing your own authenticated content, automated testing, sites where you have authorization. Use responsibly.
License
MIT — see LICENSE.