servo-fetch
Fetch, render, and extract web content with an embedded Servo browser engine. No Chrome, no containers, no external processes.
Looking for the CLI? See servo-fetch-cli.
Features
- Real JS execution: SpiderMonkey runs JavaScript and a parallel CSS engine computes layout
- Layout-aware extraction: strips navbars, sidebars, and footers by rendered position
- Sync API: no async runtime required; wrap with `spawn_blocking` for async contexts
- PDF auto-detection: URLs returning PDF are automatically extracted as text
- Typed errors: `Error::Timeout`, `Error::InvalidUrl`, etc. for match-based retry logic
- SSRF protection: blocks private IPs, reserved ranges, and metadata endpoints
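To make the SSRF protection item concrete, here is a minimal std-only sketch of the kind of address check such a guard typically performs. It is not servo-fetch's actual rule set: the function names (`is_blocked_ip`, `is_blocked_v4`, `is_blocked_v6`) and the exact list of ranges are assumptions chosen for illustration.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns true if `ip` should be refused as a fetch target.
/// Illustrative only; the real crate may block more or fewer ranges.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        // IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) are checked as IPv4
        // so they cannot be used to sidestep the IPv4 rules.
        IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
            Some(v4) => is_blocked_v4(v4),
            None => is_blocked_v6(v6),
        },
    }
}

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_private()            // 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16
        || ip.is_loopback()    // 127.0.0.0/8
        || ip.is_link_local()  // 169.254.0.0/16, includes 169.254.169.254
        || ip.is_unspecified() // 0.0.0.0
        || ip.is_broadcast()   // 255.255.255.255
        || ip.is_documentation()
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    ip.is_loopback()                  // ::1
        || ip.is_unspecified()        // ::
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}

fn main() {
    for addr in ["8.8.8.8", "10.0.0.5", "169.254.169.254", "::1"] {
        let ip: IpAddr = addr.parse().expect("valid IP literal");
        println!("{addr}: blocked = {}", is_blocked_ip(ip));
    }
}
```

A real guard would also need to run this check on every address a hostname resolves to, and again after each redirect, since a public hostname can point at a private address.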
Quick Start
let md = servo_fetch::markdown(url)?;
Examples
Fetch with options
use ;
use std::time::Duration;
let page = fetch?;
println!;
let md = page.markdown()?;
Screenshot
use ;
let page = fetch?;
write?;
JavaScript execution
use ;
let page = fetch?;
println!;
Crawl a site
use ;
crawl_each?;
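The difference between a collecting crawl and a streaming one can be sketched without the crate itself. Everything below is a hypothetical stand-in: the `CrawlResult` fields, `visit_pages`, `crawl_collect`, and `crawl_stream` are invented for illustration and do not reflect the real signatures of `crawl` or `crawl_each`.

```rust
/// Hypothetical result record; the real `CrawlResult` fields may differ.
#[derive(Debug, Clone, PartialEq)]
struct CrawlResult {
    url: String,
    status: u16,
}

/// Stand-in for the work of visiting pages one at a time.
fn visit_pages() -> impl Iterator<Item = CrawlResult> {
    ["/", "/docs", "/blog"].into_iter().map(|path| CrawlResult {
        url: format!("https://example.com{path}"),
        status: 200,
    })
}

/// Collecting style: the caller sees nothing until every page is done.
fn crawl_collect() -> Vec<CrawlResult> {
    visit_pages().collect()
}

/// Streaming style: the callback runs as soon as each page is ready,
/// so results can be processed or written out without buffering them all.
fn crawl_stream(mut on_page: impl FnMut(CrawlResult)) {
    for page in visit_pages() {
        on_page(page);
    }
}

fn main() {
    let all = crawl_collect();
    println!("collected {} pages", all.len());

    crawl_stream(|page| println!("{} -> {}", page.url, page.status));
}
```

The streaming form tends to suit large sites, where holding every result in memory is wasteful and early results are useful before the crawl finishes.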
Error handling
use ;
match fetch
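As a sketch of the match-based retry logic that typed errors allow, the following std-only example retries an operation only when it times out. The `Error` enum here is a local stand-in: only the variant names `Timeout` and `InvalidUrl` come from this README, their shapes are assumed, and `retry_on_timeout` is a hypothetical helper rather than part of the crate.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical stand-in for the crate's error type. The real variants
/// may carry data and the real enum has more cases.
#[derive(Debug, PartialEq)]
enum Error {
    Timeout,
    InvalidUrl,
}

/// Runs `op` at least once and at most `max_attempts` times, retrying
/// only on `Error::Timeout`. Any other error is returned immediately.
fn retry_on_timeout<T>(
    max_attempts: u32,
    backoff: Duration,
    mut op: impl FnMut() -> Result<T, Error>,
) -> Result<T, Error> {
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Timeouts are treated as transient: wait, then try again.
            Err(Error::Timeout) if attempt < max_attempts => {
                sleep(backoff * attempt); // linear backoff
                attempt += 1;
            }
            // Permanent errors (or the final timeout) go to the caller.
            Err(other) => return Err(other),
        }
    }
}

fn main() {
    let mut calls = 0;
    // Simulated fetch that times out twice before succeeding.
    let result = retry_on_timeout(3, Duration::from_millis(10), || {
        calls += 1;
        if calls < 3 { Err(Error::Timeout) } else { Ok("page body") }
    });
    println!("{result:?} after {calls} calls");
}
```

With the real crate, the closure would wrap the fetch call, and an error such as an invalid URL would skip the retry path because repeating the request cannot fix it.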
From async contexts
let page = spawn_blocking.await??;
Feature Flags
| Flag | Default | Description |
|---|---|---|
| `mcp` | off | MCP server support |
API Overview
| Function | Description |
|---|---|
| `markdown(url)` | Fetch → readable Markdown |
| `extract_json(url)` | Fetch → structured JSON |
| `text(url)` | Fetch → plain text (innerText) |
| `fetch(opts)` | Fetch with full options → `Page` |
| `crawl(opts)` | Crawl site → `Vec<CrawlResult>` |
| `crawl_each(opts, cb)` | Crawl site, streaming results |
See docs.rs for the full API reference and examples/ for complete runnable programs.
License
MIT OR Apache-2.0