Module web

Web & research context layer.

Turns an arbitrary URL (web page or YouTube video) into compressed, citation-backed context for an agent. The flow is:

  1. url_guard validates the URL and blocks SSRF targets.
  2. fetch downloads it (bounded, manual-redirect, SSRF-revalidated); for video URLs, youtube pulls a transcript instead.
  3. html_to_text renders HTML to clean Markdown.
  4. distill applies the requested research-compression mode.
  5. citation attaches source attribution.
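
The five steps above can be sketched as a single function. This is a minimal, offline illustration and not the crate's implementation: `SketchMode`, `SketchResult`, `read_url_sketch`, and every helper are hypothetical names, the fetch is a stub that returns canned HTML, and the guard only checks the scheme rather than performing real SSRF checks.

```rust
/// Hypothetical distillation modes; the real `ReadMode` variants are not listed on this page.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SketchMode {
    Full,
    FirstSentences(usize),
}

/// Hypothetical stand-in for `ReadResult`.
#[derive(Debug, PartialEq)]
pub struct SketchResult {
    pub content: String,
    pub source: String,
}

// Step 1: accept only http(s) URLs (assumption: the real url_guard does far more).
fn guard(url: &str) -> Result<&str, String> {
    if url.starts_with("http://") || url.starts_with("https://") {
        Ok(url)
    } else {
        Err(format!("unsupported URL scheme: {url}"))
    }
}

// Step 2: stand-in for the network fetch, so the sketch runs without I/O.
fn fetch_stub(_url: &str) -> String {
    "<h1>Title</h1><p>First point. Second point. Third point.</p>".to_string()
}

// Step 3: crude tag stripping; each tag becomes a space, then whitespace is collapsed.
fn strip_tags(html: &str) -> String {
    let mut out = String::new();
    let mut in_tag = false;
    for c in html.chars() {
        match c {
            '<' => in_tag = true,
            '>' => {
                in_tag = false;
                out.push(' ');
            }
            _ if !in_tag => out.push(c),
            _ => {}
        }
    }
    out.split_whitespace().collect::<Vec<_>>().join(" ")
}

// Step 4: extractive compression; keep the first `n` sentences or everything.
fn distill(text: &str, mode: SketchMode) -> String {
    match mode {
        SketchMode::Full => text.to_string(),
        SketchMode::FirstSentences(n) => text
            .split_inclusive('.')
            .take(n)
            .map(str::trim)
            .collect::<Vec<_>>()
            .join(" "),
    }
}

/// Steps 1 to 5 chained; step 5 attaches the source URL as the citation.
pub fn read_url_sketch(url: &str, mode: SketchMode) -> Result<SketchResult, String> {
    let url = guard(url)?;
    let text = strip_tags(&fetch_stub(url));
    Ok(SketchResult {
        content: distill(&text, mode),
        source: url.to_string(),
    })
}
```

Keeping each stage a plain function over strings makes the stages testable in isolation, which is presumably why the module is split into one submodule per step.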

The single entry point is read_url; the crate::tools::registered::ctx_url_read MCP tool is a thin wrapper over it.

Modules

citation
Evidence / citation metadata attached to every fetched document.
distill
Extractive research-compression modes for prose and transcripts.
fetch
Bounded, SSRF-aware HTTP fetch built on ureq.
html_to_text
Dependency-free HTML → Markdown / plain-text conversion.
pdf
PDF → text extraction for the research context layer.
url_guard
URL validation and SSRF protection for outbound fetches.
youtube
YouTube transcript adapter (no API key required).
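
To illustrate what url_guard and the manual-redirect behavior of fetch guard against, here is a rough sketch using only the standard library. It is an assumption about the general technique, not the crate's code: `is_blocked_ip`, `Hop`, and `fetch_following` are hypothetical names, the blocked ranges are a common baseline rather than the crate's actual list, and the HTTP request is replaced by a caller-supplied closure.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Addresses that usually indicate an internal target (assumed baseline list).
fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback()
        || ip.is_private()
        || ip.is_link_local() // includes 169.254.169.254 cloud metadata endpoints
        || ip.is_unspecified()
        || ip.is_broadcast()
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // IPv4-mapped addresses (::ffff:a.b.c.d) are judged by their IPv4 form.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}

/// Returns true when a resolved address should not be fetched.
pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}

/// Outcome of a single request in the redirect loop (hypothetical type).
pub enum Hop {
    Redirect(String),
    Body(String),
}

/// Follows redirects by hand, re-running `guard` on every hop so that a
/// public URL cannot bounce the fetcher to an internal address.
pub fn fetch_following<G, F>(
    start: &str,
    max_hops: usize,
    guard: G,
    mut fetch_one: F,
) -> Result<String, String>
where
    G: Fn(&str) -> bool,
    F: FnMut(&str) -> Hop,
{
    let mut url = start.to_string();
    for _ in 0..=max_hops {
        if !guard(&url) {
            return Err(format!("blocked: {url}"));
        }
        match fetch_one(&url) {
            Hop::Body(body) => return Ok(body),
            Hop::Redirect(next) => url = next,
        }
    }
    Err("too many redirects".to_string())
}
```

Validating only the initial URL is not enough, because the redirect target is chosen by the remote server. Handling redirects manually is what makes per-hop revalidation possible.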

Structs

ReadOptions
Parameters for read_url.
ReadResult
Result of a successful read_url.

Enums

ReadMode
How fetched content should be distilled before returning.

Constants

DEFAULT_MAX_ITEMS
Default number of items for facts / quotes modes.
DEFAULT_MAX_TOKENS
Default token budget for returned content.

Functions

read_url
Fetch and distill a URL into citation-backed context.