Crate servo_fetch

Fetch, render, and extract web content with an embedded Servo browser engine. No Chrome, no containers, no external processes.

let md = servo_fetch::markdown("https://example.com")?;

Modules

extract
Content extraction — converts raw HTML into readable Markdown or structured JSON.
sanitize
Strips terminal escape sequences and control characters from output.
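
As an illustration of what stripping terminal escape sequences and control characters can involve, here is a minimal standalone sketch using only the standard library. It is not the `sanitize` module's implementation, and the function name `strip_terminal_escapes` is hypothetical. The sketch assumes that newlines and tabs should be preserved, and it handles only CSI sequences, OSC sequences, and simple two-character escapes.

```rust
/// Hypothetical helper (not part of servo_fetch): removes ANSI escape
/// sequences and control characters, keeping '\n' and '\t'.
fn strip_terminal_escapes(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\u{1b}' {
            match chars.next() {
                // CSI: ESC [ parameters... ended by a final byte in 0x40..=0x7E.
                Some('[') => {
                    for n in chars.by_ref() {
                        if ('\u{40}'..='\u{7e}').contains(&n) {
                            break;
                        }
                    }
                }
                // OSC: ESC ] text... ended by BEL or by ESC \ (string terminator).
                Some(']') => {
                    while let Some(n) = chars.next() {
                        if n == '\u{07}' {
                            break;
                        }
                        if n == '\u{1b}' {
                            if chars.peek() == Some(&'\\') {
                                chars.next();
                            }
                            break;
                        }
                    }
                }
                // Any other escape: drop ESC and the character that follows it.
                Some(_) | None => {}
            }
        } else if c.is_control() && c != '\n' && c != '\t' {
            // Drop remaining control characters (BEL, backspace, carriage return, ...).
        } else {
            out.push(c);
        }
    }
    out
}
```

Calling `strip_terminal_escapes` on text that contains color codes returns the visible text only, which is the property that matters when untrusted page content is printed to a terminal.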

Structs

ConsoleMessage
Browser console message captured during page load.
CrawlError
Error from a failed crawl attempt.
CrawlOptions
Options for crawling a site.
CrawlPage
Successfully crawled page.
CrawlResult
Result for a single crawled page.
FetchOptions
Options for a single page fetch.
MapOptions
Options for URL discovery (sitemap + link extraction, no rendering).
MappedUrl
A discovered URL from sitemap or link extraction.
NetworkPolicy
Network access policy — determines which hosts are reachable.
Page
Rendered page returned by fetch.

Enums

ConsoleLevel
Console message severity.
Error
Errors from servo-fetch operations.

Functions

crawl
Crawl a site and collect all results.
crawl_each
Crawl a site, invoking on_page for each result as it arrives.
extract_json
Fetch a URL and return structured JSON.
fetch
Fetch a single page via the embedded Servo engine.
init
Set the network policy. Must be called at most once, before any engine use.
map
Discover URLs on a site via sitemaps and link extraction (no rendering).
markdown
Fetch a URL and return readable Markdown.
text
Fetch a URL and return plain text (document.body.innerText).
validate_url
Validate a URL for fetching. Rejects disallowed schemes and private addresses based on the policy set via init.
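
The descriptions of init and validate_url suggest two ideas: a policy that can be set at most once, and a check that rejects disallowed schemes and private addresses. The sketch below illustrates both with the standard library only. It is not the crate's implementation: `Policy`, `set_policy`, `is_private`, `scheme_allowed`, and `address_allowed` are all hypothetical names, the fallback to `PublicOnly` when no policy has been set is an assumption, and the scheme check is a deliberately naive string split where real code would use a proper URL parser.

```rust
use std::net::IpAddr;
use std::sync::OnceLock;

/// Hypothetical stand-in for a network policy (not the crate's `NetworkPolicy`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Policy {
    /// Only publicly routable addresses are reachable.
    PublicOnly,
    /// Every address is reachable, e.g. for local testing.
    AllowAll,
}

/// Process-wide policy slot that can be written at most once.
static POLICY: OnceLock<Policy> = OnceLock::new();

/// Sets the policy. A second call fails and hands back the rejected value.
fn set_policy(policy: Policy) -> Result<(), Policy> {
    POLICY.set(policy)
}

/// Returns true for loopback, unspecified, private, and link-local addresses.
fn is_private(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_private() || v4.is_loopback() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => {
            // Treat IPv4-mapped addresses (::ffff:a.b.c.d) like their IPv4 form.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_private(IpAddr::V4(v4));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // fc00::/7, unique local
                || (first & 0xffc0) == 0xfe80 // fe80::/10, link-local
        }
    }
}

/// Naive scheme check: accepts only http and https.
fn scheme_allowed(url: &str) -> bool {
    match url.split_once("://") {
        Some((scheme, _)) => {
            scheme.eq_ignore_ascii_case("http") || scheme.eq_ignore_ascii_case("https")
        }
        None => false,
    }
}

/// Applies the current policy, assuming `PublicOnly` if none was set.
fn address_allowed(ip: IpAddr) -> bool {
    match POLICY.get().copied().unwrap_or(Policy::PublicOnly) {
        Policy::AllowAll => true,
        Policy::PublicOnly => !is_private(ip),
    }
}
```

Using `OnceLock` makes the "at most once" rule enforceable rather than a convention: the first `set_policy` call wins and later calls return an error instead of silently changing which hosts are reachable. A hostname-based check would also need to resolve the name and test every resolved address, which this sketch leaves out.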

Type Aliases

Result
A specialized Result type for servo-fetch.