servo-fetch-cli-0.7.0 is not a library.

servo-fetch-cli

A browser engine in a binary — fetch, render, and extract web content powered by Servo.

For programmatic use in Rust, see the servo-fetch library crate.

Install

Pre-built binaries (recommended)

curl -fsSL https://raw.githubusercontent.com/konippi/servo-fetch/main/install.sh | sh

Cargo

cargo binstall servo-fetch-cli   # prebuilt binary via cargo-binstall
cargo install servo-fetch-cli    # build from source (requires Rust 1.86.0+)

Usage

Extract content

servo-fetch "https://example.com"              # Readable Markdown (default)
servo-fetch "https://example.com" --json       # Structured JSON
servo-fetch "https://example.com" --raw html   # Raw rendered HTML
servo-fetch "https://example.com" --raw text   # Plain text (innerText)

Batch fetch

servo-fetch URL1 URL2 URL3                     # Parallel fetch, Markdown output
servo-fetch URL1 URL2 --json                   # Parallel fetch, NDJSON output

Screenshots

servo-fetch "https://example.com" --screenshot page.png
servo-fetch "https://example.com" --screenshot full.png --full-page

JavaScript execution

servo-fetch "https://example.com" --js "document.title"
servo-fetch "https://example.com" --js "document.querySelectorAll('h2').length"

CSS selector extraction

servo-fetch "https://example.com" --selector "article"
servo-fetch "https://example.com" --selector ".main-content" --json

Crawl a site

servo-fetch crawl "https://docs.example.com" --limit 20
servo-fetch crawl "https://docs.example.com" --include "/docs/**" --exclude "/docs/archive/**"
servo-fetch crawl "https://docs.example.com" --json --max-depth 5

SPA / dynamic content

servo-fetch "https://spa.example.com" --settle 3000       # Wait 3s after load for hydration
servo-fetch "https://spa.example.com" -t 60 --settle 5000 # 60s timeout + 5s settle

MCP server

servo-fetch mcp                # stdio transport (for AI agents)
servo-fetch mcp --port 8080    # Streamable HTTP transport

Options

Flag	Description
`--json`	Structured JSON output (NDJSON for multiple URLs)
`--screenshot <FILE>`	Save PNG screenshot
`--full-page`	Capture full scrollable page (requires `--screenshot`)
`--js <EXPR>`	Execute JavaScript and print result
`--selector <CSS>`	Extract specific section by CSS selector
`--raw html|text`	Raw HTML or plain text output
`-t, --timeout <SECS>`	Page load timeout in seconds (default: 30)
`--settle <MS>`	Extra wait after load event in ms (default: 0, max: 10000)
`--user-agent <UA>`	Override the User-Agent string
`-v, --verbose`	Increase log verbosity (`-v` info, `-vv` debug, `-vvv` trace)
`-q, --quiet`	Suppress all logs except errors
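
When servo-fetch is driven from a script, the constraints in the table above (`--full-page` needs `--screenshot`, `--settle` is capped at 10000 ms) can be checked before the process is spawned. The following is a minimal sketch, not part of servo-fetch: the helper name `build_args` and its keyword parameters are hypothetical, and only flags listed above are emitted. Whether the CLI itself rejects or clamps an out-of-range `--settle` value is not stated here, so the sketch simply rejects it.

```python
from typing import List, Optional

MAX_SETTLE_MS = 10_000  # documented upper bound for --settle


def build_args(
    url: str,
    *,
    json_output: bool = False,
    screenshot: Optional[str] = None,
    full_page: bool = False,
    timeout_s: Optional[int] = None,
    settle_ms: Optional[int] = None,
) -> List[str]:
    """Assemble an argv list for servo-fetch from the documented flags (hypothetical helper)."""
    if full_page and screenshot is None:
        raise ValueError("--full-page requires --screenshot")
    if settle_ms is not None and not 0 <= settle_ms <= MAX_SETTLE_MS:
        raise ValueError(f"--settle must be between 0 and {MAX_SETTLE_MS} ms")

    args = ["servo-fetch", url]
    if json_output:
        args.append("--json")
    if screenshot is not None:
        args += ["--screenshot", screenshot]
    if full_page:
        args.append("--full-page")
    if timeout_s is not None:
        args += ["--timeout", str(timeout_s)]
    if settle_ms is not None:
        args += ["--settle", str(settle_ms)]
    return args
```

The resulting list can be passed to `subprocess.run(args, capture_output=True, text=True)`; data arrives on stdout and diagnostics on stderr.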

JSON output

--json returns an object with these fields:

Field	Type	Description
`title`	string	Page title
`content`	string	Raw HTML extracted by Readability
`text_content`	string	Readable text (Markdown)
`byline`	string	Author or byline (omitted if not detected)
`excerpt`	string	Short excerpt or description (omitted if not detected)
`lang`	string	Document language (omitted if not detected)
`url`	string	Canonical URL (omitted if not detected)
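
A consumer of `--json` output has to tolerate the fields marked "omitted if not detected". The sketch below shows one way to do that in Python. It assumes each object is printed on a single line (as NDJSON is for multiple URLs) and that `title`, `content`, and `text_content` are always present; both are assumptions, and the `Page` class and `parse_pages` function are hypothetical names, not part of servo-fetch.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    title: str
    content: str                   # HTML extracted by Readability
    text_content: str              # readable text (Markdown)
    byline: Optional[str] = None   # the remaining fields may be omitted
    excerpt: Optional[str] = None
    lang: Optional[str] = None
    url: Optional[str] = None


def parse_pages(output: str) -> List[Page]:
    """Parse --json output, assuming one JSON object per non-empty line."""
    pages = []
    for line in output.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        pages.append(
            Page(
                title=record["title"],
                content=record["content"],
                text_content=record["text_content"],
                byline=record.get("byline"),
                excerpt=record.get("excerpt"),
                lang=record.get("lang"),
                url=record.get("url"),
            )
        )
    return pages
```

If a single-URL run turns out to print a pretty-printed object spanning several lines, calling `json.loads` on the whole output would be the simpler route.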

Crawl subcommand

servo-fetch crawl <URL> follows same-site links breadth-first (BFS). It respects robots.txt (RFC 9309) and waits at least 500 ms between requests.

Flag	Description
`--limit <N>`	Maximum pages to crawl (default: 50)
`--max-depth <N>`	Maximum link depth (default: 3)
`--include <GLOB>`	URL path patterns to include
`--exclude <GLOB>`	URL path patterns to exclude
`--json`	Output content as JSON per page
`--selector <CSS>`	Extract specific section per page
`--user-agent <UA>`	Override the User-Agent string
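
To make the interaction of `--limit`, `--max-depth`, `--include`, and `--exclude` concrete, here is a small breadth-first traversal over an in-memory link graph. It is an illustration only, not servo-fetch's implementation: `crawl_order` and `get_links` are hypothetical names, "same-site" is approximated as "same host", globs are matched with Python's `fnmatchcase` (whose `*` also matches `/`), the seed is always visited, and robots.txt handling and the request interval are left out.

```python
from collections import deque
from fnmatch import fnmatchcase
from typing import Callable, Iterable, List
from urllib.parse import urlsplit


def crawl_order(
    seed: str,
    get_links: Callable[[str], Iterable[str]],
    *,
    limit: int = 50,
    max_depth: int = 3,
    include: Iterable[str] = (),
    exclude: Iterable[str] = (),
) -> List[str]:
    """Return URLs in breadth-first visit order, honoring limit, depth, and globs."""
    include, exclude = list(include), list(exclude)
    seed_host = urlsplit(seed).netloc

    def allowed(url: str) -> bool:
        parts = urlsplit(url)
        if parts.netloc != seed_host:  # assumption: same-site means same host
            return False
        path = parts.path or "/"
        if include and not any(fnmatchcase(path, g) for g in include):
            return False
        return not any(fnmatchcase(path, g) for g in exclude)

    order: List[str] = []
    seen = {seed}
    queue = deque([(seed, 0)])
    while queue and len(order) < limit:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:  # pages at the depth limit are visited, not expanded
            continue
        for link in get_links(url):
            if link not in seen and allowed(link):
                seen.add(link)
                queue.append((link, depth + 1))
    return order
```

Because the queue is first-in, first-out, every page at depth 1 is visited before any page at depth 2, which is why a small `--limit` tends to favor pages close to the seed.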

Logging

Diagnostic messages go to stderr; stdout is reserved for data output so pipes stay clean.

servo-fetch -v "https://example.com"                       # info and above
servo-fetch -vv "https://example.com"                      # debug
servo-fetch -vvv "https://example.com"                     # trace
servo-fetch -q "https://example.com"                       # errors only
RUST_LOG="servo_fetch=debug" servo-fetch "https://..."     # fine-grained override
RUST_LOG="servo_fetch=trace,servo=debug" servo-fetch "..." # include Servo internals

RUST_LOG uses tracing-subscriber's directive syntax and always wins over CLI flags.

Environment Variables

Variable	Description
`SERVO_FETCH_USER_AGENT`	Default User-Agent string (overridden by `--user-agent`)
`RUST_LOG`	Fine-grained log filter (overrides `-v`/`-q`)
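
The precedence rules above can be summarized in a few lines of Python. This is a sketch of the documented ordering, not servo-fetch's own code: the function names are hypothetical, the `"warn"` level used when no flag is given is an assumption (the default level is not stated here), and so is letting `-q` win if it is combined with `-v`.

```python
from typing import Mapping, Optional


def effective_log_filter(verbose: int, quiet: bool, env: Mapping[str, str]) -> str:
    """Resolve the log filter: RUST_LOG wins, then -q, then the number of -v flags."""
    rust_log = env.get("RUST_LOG")
    if rust_log:
        return rust_log          # always wins over CLI flags
    if quiet:
        return "error"           # assumption: -q beats -v when both are given
    if verbose >= 3:
        return "trace"
    if verbose == 2:
        return "debug"
    if verbose == 1:
        return "info"
    return "warn"                # assumption: default level is not documented here


def effective_user_agent(cli_value: Optional[str], env: Mapping[str, str]) -> Optional[str]:
    """--user-agent overrides SERVO_FETCH_USER_AGENT; None means the built-in default."""
    if cli_value is not None:
        return cli_value
    return env.get("SERVO_FETCH_USER_AGENT")
```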

MCP Server

Built-in Model Context Protocol server over stdio or Streamable HTTP.

{
  "mcpServers": {
    "servo-fetch": {
      "command": "servo-fetch",
      "args": ["mcp"]
    }
  }
}

Streamable HTTP: servo-fetch mcp --port 8080

Parameter	Type	Description
`url`	string	URL to fetch (http/https only)
`format`	string?	`markdown` (default), `json`, `html`, `text`, or `accessibility_tree`
`max_length`	number?	Max characters to return (default 5000)
`start_index`	number?	Character offset for pagination (default 0)
`timeout`	number?	Page load timeout in seconds (default 30)
`settle_ms`	number?	Extra wait in ms after load event (default 0, max 10000)
`selector`	string?	CSS selector to extract a specific section
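
The `start_index` and `max_length` parameters in the table above allow a long page to be read in pieces. A client-side loop might look like the sketch below. `fetch_chunk` is a hypothetical callable standing in for whatever MCP client call is used; the sketch assumes the tool returns at most `max_length` characters starting at `start_index`, and that an empty or short chunk marks the end of the content. If the server counts characters differently from Python's `len`, or appends a truncation notice, the stop condition would need adjusting.

```python
from typing import Callable, Iterator


def iter_chunks(
    fetch_chunk: Callable[[int, int], str],
    max_length: int = 5000,
) -> Iterator[str]:
    """Yield successive chunks, advancing start_index until the content runs out."""
    start_index = 0
    while True:
        chunk = fetch_chunk(start_index, max_length)
        if not chunk:
            return
        yield chunk
        if len(chunk) < max_length:  # short chunk: nothing left to read
            return
        start_index += len(chunk)
```

For example, `"".join(iter_chunks(my_client_call))` would reassemble the full document, where `my_client_call(start_index, max_length)` wraps the actual tool invocation.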

Parameter	Type	Description
`urls`	string[]	URLs to fetch (http/https only, max 20)
`format`	string?	`markdown` (default) or `json`
`max_length`	number?	Max characters per URL result (default 5000)
`timeout`	number?	Page load timeout in seconds per URL (default 30)
`settle_ms`	number?	Extra wait in ms after load event (default 0, max 10000)
`selector`	string?	CSS selector to extract a specific section
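
Since the `urls` parameter above accepts at most 20 URLs, a longer list has to be split across several calls. A minimal sketch of that split follows; the helper name `batched_urls` is hypothetical and the code only slices the list, it does not talk to the server.

```python
from typing import Iterator, List, Sequence

MAX_URLS_PER_CALL = 20  # documented limit for the `urls` parameter


def batched_urls(urls: Sequence[str], size: int = MAX_URLS_PER_CALL) -> Iterator[List[str]]:
    """Split a URL list into consecutive batches of at most `size` entries."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield list(urls[start:start + size])
```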

Parameter	Type	Description
`url`	string	Starting URL (http/https only)
`limit`	number?	Maximum pages to crawl (default 20, max 500)
`max_depth`	number?	Maximum link depth from seed (default 3, max 10)
`format`	string?	`markdown` (default) or `json`
`include_glob`	string[]?	URL path patterns to include
`exclude_glob`	string[]?	URL path patterns to exclude
`max_length`	number?	Max characters per page result (default 5000)
`timeout`	number?	Page load timeout in seconds per page (default 30)
`settle_ms`	number?	Extra wait in ms after load event (default 0, max 10000)
`selector`	string?	CSS selector to extract a specific section per page

Parameter	Type	Description
`url`	string	URL to capture (http/https only)
`full_page`	boolean?	Capture the full scrollable page (default false)
`timeout`	number?	Page load timeout in seconds (default 30)
`settle_ms`	number?	Extra wait in ms after load event (default 0, max 10000)

Parameter	Type	Description
`url`	string	URL to load before executing JS
`expression`	string	JavaScript expression to evaluate
`timeout`	number?	Page load timeout in seconds (default 30)
`settle_ms`	number?	Extra wait in ms after load event (default 0, max 10000)
