servo-fetch-cli-0.7.0 is not a library.
servo-fetch-cli
A browser engine in a binary — fetch, render, and extract web content powered by Servo.
For programmatic use in Rust, see the servo-fetch library crate.
Install
Pre-built binaries (recommended)
|
Cargo
Usage
Extract content
Batch fetch
Screenshots
JavaScript execution
CSS selector extraction
Crawl a site
SPA / dynamic content
MCP server
Options
| Flag | Description |
|---|---|
--json |
Structured JSON output (NDJSON for multiple URLs) |
--screenshot <FILE> |
Save PNG screenshot |
--full-page |
Capture full scrollable page (requires --screenshot) |
--js <EXPR> |
Execute JavaScript and print result |
--selector <CSS> |
Extract specific section by CSS selector |
--raw html|text |
Raw HTML or plain text output |
-t, --timeout <SECS> |
Page load timeout in seconds (default: 30) |
--settle <MS> |
Extra wait after load event in ms (default: 0, max: 10000) |
--user-agent <UA> |
Override the User-Agent string |
-v, --verbose |
Increase log verbosity (-v info, -vv debug, -vvv trace) |
-q, --quiet |
Suppress all logs except errors |
JSON output
--json returns an object with these fields:
| Field | Type | Description |
|---|---|---|
title |
string | Page title |
content |
string | Raw HTML extracted by Readability |
text_content |
string | Readable text (Markdown) |
byline |
string | Author or byline (omitted if not detected) |
excerpt |
string | Short excerpt or description (omitted if not detected) |
lang |
string | Document language (omitted if not detected) |
url |
string | Canonical URL (omitted if not detected) |
Crawl subcommand
servo-fetch crawl <URL> follows same-site links using BFS. Respects robots.txt (RFC 9309) with a minimum 500ms interval.
| Flag | Description |
|---|---|
--limit <N> |
Maximum pages to crawl (default: 50) |
--max-depth <N> |
Maximum link depth (default: 3) |
--include <GLOB> |
URL path patterns to include |
--exclude <GLOB> |
URL path patterns to exclude |
--json |
Output content as JSON per page |
--selector <CSS> |
Extract specific section per page |
--user-agent <UA> |
Override the User-Agent string |
Logging
Diagnostic messages go to stderr; stdout is reserved for data output so pipes stay clean.
RUST_LOG="servo_fetch=debug" RUST_LOG="servo_fetch=trace,servo=debug"
RUST_LOG uses tracing-subscriber's directive syntax and always wins over CLI flags.
Environment Variables
| Variable | Description |
|---|---|
SERVO_FETCH_USER_AGENT |
Default User-Agent string (overridden by --user-agent) |
RUST_LOG |
Fine-grained log filter (overrides -v/-q) |
MCP Server
Built-in Model Context Protocol server over stdio or Streamable HTTP.
Streamable HTTP: servo-fetch mcp --port 8080
| Parameter | Type | Description |
|---|---|---|
url |
string | URL to fetch (http/https only) |
format |
string? | markdown (default), json, html, text, or accessibility_tree |
max_length |
number? | Max characters to return (default 5000) |
start_index |
number? | Character offset for pagination (default 0) |
timeout |
number? | Page load timeout in seconds (default 30) |
settle_ms |
number? | Extra wait in ms after load event (default 0, max 10000) |
selector |
string? | CSS selector to extract a specific section |
| Parameter | Type | Description |
|---|---|---|
urls |
string[] | URLs to fetch (http/https only, max 20) |
format |
string? | markdown (default) or json |
max_length |
number? | Max characters per URL result (default 5000) |
timeout |
number? | Page load timeout in seconds per URL (default 30) |
settle_ms |
number? | Extra wait in ms after load event (default 0, max 10000) |
selector |
string? | CSS selector to extract a specific section |
| Parameter | Type | Description |
|---|---|---|
url |
string | Starting URL (http/https only) |
limit |
number? | Maximum pages to crawl (default 20, max 500) |
max_depth |
number? | Maximum link depth from seed (default 3, max 10) |
format |
string? | markdown (default) or json |
include_glob |
string[]? | URL path patterns to include |
exclude_glob |
string[]? | URL path patterns to exclude |
max_length |
number? | Max characters per page result (default 5000) |
timeout |
number? | Page load timeout in seconds per page (default 30) |
settle_ms |
number? | Extra wait in ms after load event (default 0, max 10000) |
selector |
string? | CSS selector to extract a specific section per page |
| Parameter | Type | Description |
|---|---|---|
url |
string | URL to capture (http/https only) |
full_page |
boolean? | Capture the full scrollable page (default false) |
timeout |
number? | Page load timeout in seconds (default 30) |
settle_ms |
number? | Extra wait in ms after load event (default 0, max 10000) |
| Parameter | Type | Description |
|---|---|---|
url |
string | URL to load before executing JS |
expression |
string | JavaScript expression to evaluate |
timeout |
number? | Page load timeout in seconds (default 30) |
settle_ms |
number? | Extra wait in ms after load event (default 0, max 10000) |