servo-fetch is a single-binary CLI and MCP server that renders web pages using the Servo browser engine. It executes JavaScript, computes CSS layout, captures screenshots, and extracts clean content — all without downloading a browser.
Why servo-fetch?
- Screenshots without a browser runtime. Servo renders pages to PNG with a software renderer. No GPU, no Xvfb, no Chromium download. Drop the binary into Docker or CI and it just works.
- Reads JavaScript-heavy pages. SPAs, React, Vue: servo-fetch executes JS via SpiderMonkey and extracts the rendered content. Plain HTTP fetchers return empty HTML; servo-fetch returns the real page.
- Strips navigation noise using CSS layout. Most tools guess page structure from HTML tags. servo-fetch uses `getComputedStyle()` and `getBoundingClientRect()` to detect fixed navbars, sidebars, and footers, then removes them before extraction.
- Single binary, zero runtime dependencies. `cargo install` or download a prebuilt binary. No Node.js, no `npx playwright install`, no `apt-get install chromium`.
- Built-in MCP server for AI agents. Three tools (`fetch`, `screenshot`, `execute_js`) over stdio or Streamable HTTP. AI agents can read SPAs and take screenshots without any browser setup.
Install
|
Or download from GitHub Releases, or build from source (requires Rust 1.86.0+):
Usage
# Readable Markdown (default)
# Structured JSON
# Screenshot — rendered to PNG without GPU
# Execute JavaScript in the page context
# Extract a specific section by CSS selector
# PDF text extraction (auto-detected)
Options
| Flag | Description |
|---|---|
| `--json` | Output as structured JSON |
| `--screenshot <FILE>` | Save a PNG screenshot |
| `--js <EXPR>` | Execute JavaScript and print the result |
| `--selector <CSS>` | Extract a specific section by CSS selector |
| `--raw <MODE>` | Output raw `html` or plain `text` (bypasses Readability) |
| `-t, --timeout <SECS>` | Page load timeout in seconds (default: 30) |
| `--help` | Show help |
| `--version` | Show version |
JSON output
`--json` returns an object with these fields:

| Field | Type | Description |
|---|---|---|
| `title` | string | Page title |
| `content` | string | Raw HTML extracted by Readability |
| `text_content` | string | Readable text (Markdown) |
| `byline` | string? | Author or byline |
| `excerpt` | string? | Short excerpt or description |
| `lang` | string? | Document language (e.g. "en") |
| `url` | string? | Canonical URL |

Fields marked `?` are omitted when not detected.
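Because the optional fields are omitted rather than set to `null`, a consumer should read them defensively. The following is a minimal Python sketch of that, assuming the field names from the table above; the `Page` class, the `parse_page` helper, and the sample payload are hypothetical and not part of servo-fetch.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Page:
    # Always present according to the field table.
    title: str
    content: str
    text_content: str
    # Marked "?" in the table: absent from the object when not detected.
    byline: Optional[str] = None
    excerpt: Optional[str] = None
    lang: Optional[str] = None
    url: Optional[str] = None


def parse_page(raw: str) -> Page:
    """Parse one JSON object shaped like the --json output."""
    data = json.loads(raw)
    return Page(
        title=data["title"],
        content=data["content"],
        text_content=data["text_content"],
        # .get() returns None for keys that were omitted.
        byline=data.get("byline"),
        excerpt=data.get("excerpt"),
        lang=data.get("lang"),
        url=data.get("url"),
    )


# Hypothetical sample payload, not captured from a real run.
sample = '{"title": "Example", "content": "<p>Hi</p>", "text_content": "Hi", "lang": "en"}'
page = parse_page(sample)
```

Here `page.lang` is `"en"`, while `page.byline` is `None` because the key is missing from the sample.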
How it works
- Servo loads the page and executes JavaScript via SpiderMonkey
- CSS is computed with Servo's parallel layout engine; `getComputedStyle()` and `getBoundingClientRect()` identify page structure
- Navbars, sidebars, and footers are stripped using CSS layout data
- Mozilla's Readability algorithm extracts the main content
- Content is output as Markdown, JSON, or PNG
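To make the layout-based stripping step more concrete, here is a rough Python sketch of how computed position and bounding rectangles could separate page chrome from content. It is not servo-fetch's actual heuristic: the `Box` type, the `is_page_chrome` function, and every threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    tag: str
    position: str  # computed CSS `position`, as getComputedStyle() would report
    x: float       # the remaining fields mirror getBoundingClientRect()
    y: float
    width: float
    height: float


def is_page_chrome(box: Box, vw: float, vh: float) -> bool:
    """Guess whether a box is a navbar, footer bar, or sidebar.

    All thresholds are illustrative assumptions.
    """
    pinned = box.position in ("fixed", "sticky")
    if not pinned:
        return False

    # Horizontal bar hugging the top or bottom of the viewport.
    wide = box.width >= 0.9 * vw
    short = box.height <= 0.25 * vh
    at_top = box.y <= 1
    at_bottom = box.y + box.height >= vh - 1
    if wide and short and (at_top or at_bottom):
        return True

    # Vertical rail hugging the left or right edge.
    narrow = box.width <= 0.3 * vw
    tall = box.height >= 0.8 * vh
    at_edge = box.x <= 1 or box.x + box.width >= vw - 1
    return narrow and tall and at_edge


viewport = (1280, 800)
navbar = Box("nav", "fixed", 0, 0, 1280, 60)
sidebar = Box("aside", "fixed", 0, 60, 240, 740)
article = Box("article", "static", 240, 80, 800, 3000)
chrome = [b.tag for b in (navbar, sidebar, article) if is_page_chrome(b, *viewport)]
```

With these sample boxes, `chrome` contains `nav` and `aside`, and the statically positioned `article` is kept for extraction.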
PDF URLs are auto-detected via Content-Type and extracted directly without Servo.
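A Content-Type check of this kind can be sketched in a few lines of Python. The `is_pdf` helper is hypothetical and only shows the idea of ignoring header parameters and letter case; it is not taken from the servo-fetch source.

```python
def is_pdf(content_type: str) -> bool:
    """Return True when a Content-Type header value names a PDF.

    Parameters such as "; charset=binary" are ignored, and the media
    type is compared case-insensitively.
    """
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type == "application/pdf"


headers = [
    "application/pdf",
    "Application/PDF; charset=binary",
    "text/html; charset=utf-8",
]
results = [is_pdf(h) for h in headers]
```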
MCP server
servo-fetch includes a built-in MCP server for AI agents with three tools: fetch, screenshot, and execute_js.
# stdio transport (default)
# Streamable HTTP transport
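Once the server is running, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request. The Python sketch below only builds such a message for the `fetch` tool. The `tool_call` helper is hypothetical, and the `url` argument name is an assumption; the real argument schema should be read from the server's `tools/list` response. A real session also begins with the `initialize` handshake, which is omitted here.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique within a session


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "url" is an assumed argument name, used for illustration only.
request = tool_call("fetch", {"url": "https://example.com"})

# Over the stdio transport, each message is sent as one line of JSON.
line = json.dumps(request) + "\n"
```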
Add to your MCP client config (Claude Code, Codex, Cursor, etc.):
Security
servo-fetch blocks all private and reserved IP ranges (RFC 6890), strips credentials from URLs, validates redirect targets, and sanitizes all output against terminal escape injection (CVE-2021-42574). See SECURITY.md for details.
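The checks described above can be approximated with the Python standard library. This is a sketch of the general techniques and not servo-fetch's implementation (SECURITY.md is the authority on that); the function names are hypothetical. The CVE cited above concerns Unicode bidirectional control characters, so the sanitizer strips those along with ANSI escape sequences.

```python
import ipaddress
import re
from urllib.parse import urlsplit, urlunsplit


def is_blocked_ip(addr: str) -> bool:
    """True for private, loopback, link-local, and other non-public addresses."""
    ip = ipaddress.ip_address(addr)
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_multicast or ip.is_reserved or ip.is_unspecified
        or not ip.is_global
    )


def strip_credentials(url: str) -> str:
    """Drop any user:password@ part from a URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literals need their brackets restored
        host = f"[{host}]"
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# ANSI CSI sequences such as "\x1b[31m".
_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# Remaining C0 controls (tab and newline are kept), DEL, and bidi controls.
_CONTROLS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f\u202a-\u202e\u2066-\u2069]")


def sanitize(text: str) -> str:
    """Remove escape sequences and control characters before printing."""
    return _CONTROLS.sub("", _CSI.sub("", text))
```

A fetcher would apply `is_blocked_ip` to every address a hostname resolves to, and again to each redirect target, since a public URL can redirect to an internal one.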
Limitations
- Best suited for documentation, blogs, and SSR sites
- Some SPAs with complex client-side rendering may not fully render
- Servo's web compatibility is improving monthly