Fetch, render, and extract web content with an embedded Servo browser engine. No Chrome, no containers, no external processes.
let md = servo_fetch::markdown("https://example.com")?;
Modules§
- extract
- Content extraction — converts raw HTML into readable Markdown or structured JSON.
- sanitize
- Strips terminal escape sequences and control characters from output.
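To make the sanitize description concrete, here is a minimal sketch of stripping terminal escape sequences and control characters from a string. It is not the crate's implementation: the function name strip_terminal_escapes, the decision to keep newlines and tabs, and the set of escape forms handled (CSI, OSC, and two-character escapes) are all assumptions made for illustration.

```rust
/// Hypothetical helper (not part of servo-fetch): removes ANSI/terminal
/// escape sequences and control characters, keeping '\n' and '\t'.
fn strip_terminal_escapes(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\u{1b}' {
            match chars.peek().copied() {
                // CSI sequence: ESC [ ... ending at a final byte in 0x40..=0x7E.
                Some('[') => {
                    chars.next();
                    for n in chars.by_ref() {
                        if ('\u{40}'..='\u{7e}').contains(&n) {
                            break;
                        }
                    }
                }
                // OSC sequence: ESC ] ... ending at BEL or at ESC \.
                Some(']') => {
                    chars.next();
                    while let Some(n) = chars.next() {
                        if n == '\u{07}' {
                            break;
                        }
                        if n == '\u{1b}' {
                            if chars.peek() == Some(&'\\') {
                                chars.next();
                            }
                            break;
                        }
                    }
                }
                // Any other escape: drop ESC and the single character after it.
                Some(_) => {
                    chars.next();
                }
                // A trailing lone ESC is simply dropped.
                None => {}
            }
        } else if c.is_control() && c != '\n' && c != '\t' {
            // Drop remaining control characters (BEL, backspace, DEL, ...).
        } else {
            out.push(c);
        }
    }
    out
}
```

Under these assumptions, passing a string such as "\x1b[31mred\x1b[0m" through the helper leaves only the visible text, which is the property that matters when fetched page content is printed to a terminal.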
Structs§
- ConsoleMessage
- Browser console message captured during page load.
- CrawlError
- Error from a failed crawl attempt.
- CrawlOptions
- Options for crawling a site.
- CrawlPage
- Successfully crawled page.
- CrawlResult
- Result for a single crawled page.
- FetchOptions
- Options for a single page fetch.
- MapOptions
- Options for URL discovery (sitemap + link extraction, no rendering).
- MappedUrl
- A discovered URL from sitemap or link extraction.
- NetworkPolicy
- Network access policy: determines which hosts are reachable.
- Page
- Rendered page returned by fetch.
Enums§
- ConsoleLevel
- Console message severity.
- Error
- Errors from servo-fetch operations.
Functions§
- crawl
- Crawl a site and collect all results.
- crawl_each
- Crawl a site, invoking on_page for each result as it arrives.
- extract_json
- Fetch a URL and return structured JSON.
- fetch
- Fetch a single page via the embedded Servo engine.
- init
- Set the network policy. Must be called at most once, before any engine use.
- map
- Discover URLs on a site via sitemaps and link extraction (no rendering).
- markdown
- Fetch a URL and return readable Markdown.
- text
- Fetch a URL and return plain text (document.body.innerText).
- validate_url
- Validate a URL for fetching. Rejects disallowed schemes and private addresses based on the policy set via init.
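The validate_url entry describes rejecting disallowed schemes and private addresses. The sketch below shows one way such a check could look using only the standard library. It is an illustration, not the crate's code: the names check_url and is_private_ip, the allow_private flag (standing in for whatever policy init configures), and the choice of http and https as the only allowed schemes are assumptions. It inspects only literal IP hosts and the name localhost, and does not resolve DNS.

```rust
use std::net::IpAddr;

/// Hypothetical helper: true for loopback, private, link-local, or
/// unspecified addresses.
fn is_private_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_private() || v4.is_loopback() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}

/// Hypothetical check (not the servo-fetch API). `allow_private` is an
/// assumed stand-in for a configured network policy.
fn check_url(url: &str, allow_private: bool) -> Result<(), String> {
    let (scheme, rest) = url
        .split_once("://")
        .ok_or_else(|| format!("missing scheme: {url}"))?;
    if !matches!(scheme.to_ascii_lowercase().as_str(), "http" | "https") {
        return Err(format!("disallowed scheme: {scheme}"));
    }

    // The authority ends at the first '/', '?', or '#'.
    let authority = rest
        .split(|c: char| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop any userinfo before '@', then separate the host from the port.
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    let host = if let Some(stripped) = host_port.strip_prefix('[') {
        // Bracketed IPv6 literal such as [::1]:8080.
        stripped.split(']').next().unwrap_or("")
    } else {
        host_port.split(':').next().unwrap_or("")
    };
    if host.is_empty() {
        return Err("missing host".to_string());
    }
    if allow_private {
        return Ok(());
    }
    if host.eq_ignore_ascii_case("localhost") {
        return Err(format!("private host: {host}"));
    }
    if let Ok(ip) = host.parse::<IpAddr>() {
        if is_private_ip(ip) {
            return Err(format!("private address: {host}"));
        }
    }
    Ok(())
}
```

A production validator would normally rely on a full URL parser and also check the addresses a hostname resolves to, since a public-looking name can point at a private IP.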
Type Aliases§
- Result
- A specialized Result type for servo-fetch.