Crate servo_fetch

Fetch, render, and extract web content as Markdown, JSON, or screenshots with an embedded Servo browser engine. No Chromium, no containers, no external processes.

let md = servo_fetch::markdown("https://example.com")?;

Modules

extract
Content extraction — converts raw HTML into readable Markdown or structured JSON.
sanitize
Strips terminal escape sequences and control characters from output.
schema
CSS-selector schema extraction.
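
To make the `sanitize` description concrete, here is a minimal standalone sketch of stripping terminal escape sequences and control characters from a string. It is not the crate's implementation: the function name `strip_terminal_controls` is hypothetical, and the choice to keep newlines and tabs is an assumption.

```rust
/// Hypothetical helper (not part of servo_fetch): removes ANSI/terminal escape
/// sequences and control characters, keeping newlines and tabs.
pub fn strip_terminal_controls(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();

    while let Some(c) = chars.next() {
        if c == '\u{1b}' {
            match chars.peek().copied() {
                // CSI sequence: ESC [ ... terminated by a final byte in 0x40..=0x7E.
                Some('[') => {
                    chars.next();
                    for n in chars.by_ref() {
                        if ('\u{40}'..='\u{7e}').contains(&n) {
                            break;
                        }
                    }
                }
                // OSC sequence: ESC ] ... terminated by BEL or by ESC \.
                Some(']') => {
                    chars.next();
                    while let Some(n) = chars.next() {
                        if n == '\u{07}' {
                            break;
                        }
                        if n == '\u{1b}' {
                            if chars.peek().copied() == Some('\\') {
                                chars.next();
                            }
                            break;
                        }
                    }
                }
                // Any other two-character escape: drop the following character too.
                Some(_) => {
                    chars.next();
                }
                // A trailing lone ESC is simply dropped.
                None => {}
            }
            continue;
        }
        // Drop remaining control characters except newline and tab.
        if c.is_control() && c != '\n' && c != '\t' {
            continue;
        }
        out.push(c);
    }
    out
}
```

Passing page-derived text through a filter like this before printing keeps a hostile page from recoloring, retitling, or otherwise manipulating the user's terminal.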

Structs

ConsoleMessage
Browser console message captured during page load.
CrawlError
Error from a failed crawl attempt.
CrawlOptions
Options for crawling a site.
CrawlPage
Successfully crawled page.
CrawlResult
Result for a single crawled page.
FetchOptions
Options for a single page fetch.
MapOptions
Options for URL discovery (sitemap + link extraction, no rendering).
MappedUrl
A discovered URL from sitemap or link extraction.
NetworkPolicy
Network access policy — determines which hosts are reachable.
Page
Rendered page returned by fetch.

Enums

ConsoleLevel
Console message severity.
Error
Errors from servo-fetch operations.

Functions

crawl
Crawl a site and collect all results.
crawl_each
Crawl a site, invoking on_page for each result as it arrives.
extract_json
Fetch a URL and return structured JSON.
fetch
Fetch a single page via the embedded Servo engine.
init
Set the network policy. Must be called at most once, before any engine use.
map
Discover URLs on a site via sitemaps and link extraction (no rendering).
markdown
Fetch a URL and return readable Markdown.
text
Fetch a URL and return plain text (document.body.innerText).
validate_url
Validate a URL for fetching. Rejects disallowed schemes and private addresses based on the policy set via crate::init.
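
The `init` and `validate_url` descriptions above suggest a set-once policy that later URL checks consult. The sketch below illustrates that pattern with the standard library only. Every name in it (`Policy`, `init_policy`, `check_url`) is hypothetical and none of it reflects the crate's actual signatures or rules. It also assumes that only `http` and `https` are allowed schemes, and it inspects only IP-literal hosts and `localhost`; it performs no DNS resolution.

```rust
use std::net::IpAddr;
use std::sync::OnceLock;

/// Hypothetical stand-in for a network policy (not the crate's `NetworkPolicy`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Policy {
    /// Reject loopback, private, link-local, and unspecified addresses.
    PublicOnly,
    /// Allow private and loopback addresses as well.
    AllowPrivate,
}

/// Process-wide policy slot that can be written at most once.
static POLICY: OnceLock<Policy> = OnceLock::new();

/// Sets the policy. A second call fails and hands back the rejected value.
pub fn init_policy(policy: Policy) -> Result<(), Policy> {
    POLICY.set(policy)
}

/// Returns true for addresses that are not publicly routable (partial list).
fn is_private(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_private() || v4.is_loopback() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}

/// Checks the scheme and host of `url` against the stored policy.
/// Falls back to `Policy::PublicOnly` if `init_policy` was never called.
pub fn check_url(url: &str) -> Result<(), String> {
    let policy = POLICY.get().copied().unwrap_or(Policy::PublicOnly);

    let (scheme, rest) = url
        .split_once("://")
        .ok_or_else(|| "missing scheme".to_string())?;
    if !matches!(scheme.to_ascii_lowercase().as_str(), "http" | "https") {
        return Err(format!("disallowed scheme: {scheme}"));
    }

    // Authority is everything up to the first path, query, or fragment delimiter.
    let authority = rest
        .split(|c| matches!(c, '/' | '?' | '#'))
        .next()
        .unwrap_or("");
    // Drop any userinfo ("user:pass@").
    let host_port = authority.rsplit('@').next().unwrap_or(authority);
    // Drop the port, handling bracketed IPv6 literals.
    let host = match host_port.strip_prefix('[') {
        Some(stripped) => stripped.split(']').next().unwrap_or(""),
        None => host_port.split(':').next().unwrap_or(""),
    };
    if host.is_empty() {
        return Err("missing host".to_string());
    }

    if policy == Policy::PublicOnly {
        if host.eq_ignore_ascii_case("localhost") {
            return Err("localhost is not allowed".to_string());
        }
        if let Ok(ip) = host.parse::<IpAddr>() {
            if is_private(ip) {
                return Err(format!("private address: {ip}"));
            }
        }
    }
    Ok(())
}
```

`OnceLock` makes the "at most once" rule explicit: the first `init_policy` call succeeds and any later call returns an error instead of silently changing the policy. A production check would also need to guard against hostnames that resolve to private addresses, which this sketch does not attempt.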

Type Aliases

Result
A specialized Result type for servo-fetch.
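
A "specialized `Result` type" usually means a one-parameter alias that fixes the error type, in the same way as `std::io::Result`. The following is a sketch of that idiom under assumed names: the `Error` variants and the `require_https` helper are hypothetical and are not taken from servo-fetch.

```rust
mod sketch {
    use std::fmt;

    /// Hypothetical error enum standing in for a crate-level `Error`.
    #[derive(Debug, PartialEq)]
    pub enum Error {
        /// The input could not be split into a scheme and a remainder.
        InvalidUrl(String),
        /// The URL used a scheme other than `https`.
        InsecureScheme(String),
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Error::InvalidUrl(u) => write!(f, "invalid URL: {u}"),
                Error::InsecureScheme(s) => write!(f, "insecure scheme: {s}"),
            }
        }
    }

    impl std::error::Error for Error {}

    /// The specialized alias: callers write `Result<T>` instead of `Result<T, Error>`.
    pub type Result<T> = std::result::Result<T, Error>;

    /// Example function using the alias; returns the part after `https://`.
    pub fn require_https(url: &str) -> Result<&str> {
        match url.split_once("://") {
            Some(("https", rest)) if !rest.is_empty() => Ok(rest),
            Some((scheme, _)) if scheme != "https" => {
                Err(Error::InsecureScheme(scheme.to_string()))
            }
            _ => Err(Error::InvalidUrl(url.to_string())),
        }
    }
}
```

With an alias like this, every fallible function in a crate shares one error type, so callers can propagate failures with `?` without converting between error types.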