webfetch — token-efficient web content fetcher.
The defining feature is reference-style URL preservation: instead of
stripping links to their domain (losing the ability to cite or follow
them) or expanding full URLs inline (wasting tokens), links are replaced
with compact [N] markers and collected into a recoverable reference list.
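The idea of replacing links with compact [N] markers can be illustrated with a minimal sketch. It assumes the input is text containing inline Markdown links of the form `[text](url)`; the names `to_reference_style` and `render_refs` are hypothetical and are not part of this crate's documented API, and the crate's real output format may differ.

```rust
/// Hypothetical helper (not the crate's API): rewrite inline links
/// `[text](url)` as `text [N]` and collect the URLs in order of first use.
/// A URL that appears more than once reuses its existing number.
fn to_reference_style(input: &str) -> (String, Vec<String>) {
    let mut out = String::with_capacity(input.len());
    let mut refs: Vec<String> = Vec::new();
    let mut rest = input;

    while let Some(open) = rest.find('[') {
        let after_open = &rest[open + 1..];
        // Try to parse `text](url)` immediately after the '['.
        let parsed = after_open.find("](").and_then(|mid| {
            let text = &after_open[..mid];
            // A nested '[' means this is not a simple inline link.
            if text.contains('[') {
                return None;
            }
            let after_mid = &after_open[mid + 2..];
            let close = after_mid.find(')')?;
            Some((text, &after_mid[..close], &after_mid[close + 1..]))
        });

        match parsed {
            Some((text, url, tail)) => {
                out.push_str(&rest[..open]);
                let existing = refs.iter().position(|r| r.as_str() == url);
                let n = match existing {
                    Some(i) => i + 1,
                    None => {
                        refs.push(url.to_string());
                        refs.len()
                    }
                };
                out.push_str(text);
                out.push_str(&format!(" [{}]", n));
                rest = tail;
            }
            None => {
                // Not a link: copy through the '[' and keep scanning.
                out.push_str(&rest[..=open]);
                rest = &rest[open + 1..];
            }
        }
    }
    out.push_str(rest);
    (out, refs)
}

/// Hypothetical helper: render the collected URLs as a numbered list.
fn render_refs(refs: &[String]) -> String {
    refs.iter()
        .enumerate()
        .map(|(i, url)| format!("[{}] {}", i + 1, url))
        .collect::<Vec<_>>()
        .join("\n")
}
```

In this sketch each full URL is emitted once in the reference list, however many times it is linked, while the body text carries only the short marker. The reader can still recover any link by looking up its number.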
Re-exports
pub use fetch::fetch_page;
Modules
- compress
- convert
  Output dispatcher: routes an HTML document to the requested format.
- extract
- fetch
- guard
  SSRF guard for the fetch path.
- media
  Decide how to treat a fetched body. The HTML extractor only makes sense for HTML; running it over a JSON API response, a raw .txt, or a Markdown file would mangle or drop the content. We classify by Content-Type when present, and sniff the body otherwise.
- refs
  Shared reference-style URL preservation.
- types
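The guard module is described only as an SSRF guard for the fetch path, so its actual rules are not shown here. As a rough illustration of what such a guard commonly checks, the sketch below rejects resolved addresses in loopback, private, link-local, and similar ranges. The function names are hypothetical, and the set of blocked ranges is an assumption rather than the crate's policy.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical check: true if an IPv4 address points at a local or
/// internal network that a public web fetcher should not reach.
fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_loopback()          // 127.0.0.0/8
        || ip.is_private()    // 10/8, 172.16/12, 192.168/16
        || ip.is_link_local() // 169.254.0.0/16 (includes cloud metadata endpoints)
        || ip.is_unspecified() // 0.0.0.0
        || ip.is_broadcast()  // 255.255.255.255
}

/// Hypothetical check for IPv6, including IPv4-mapped addresses.
fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    // ::ffff:a.b.c.d is judged by the IPv4 address it wraps.
    if let Some(v4) = ip.to_ipv4_mapped() {
        return is_blocked_v4(v4);
    }
    let first = ip.segments()[0];
    ip.is_loopback()                    // ::1
        || ip.is_unspecified()          // ::
        || (first & 0xfe00) == 0xfc00   // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80   // link-local, fe80::/10
}

/// Hypothetical entry point: decide whether a resolved address is off limits.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => is_blocked_v6(v6),
    }
}
```

A check like this is usually applied to the address a hostname resolves to, not to the hostname string, and again after each redirect, since a public hostname can resolve or redirect to an internal address.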
Functions
- convert_body
  Convert a fetched body to a FetchResult, choosing how to treat it based on its Content-Type (or a sniff of the body). HTML is extracted; JSON is pretty-printed; other text is passed through verbatim; binary is summarized.
- convert_html
  Convert already-fetched HTML into a FetchResult without any network I/O.
- fetch_and_convert
  Fetch a URL and convert it according to options.
- parse_content_type
  Parse a content-type string (“text” | “markdown” | “structured”).
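The classification step behind convert_body (use the Content-Type when present, otherwise sniff the body) might look roughly like the sketch below. The `BodyKind` enum, the function names, the 512-byte sniff window, and the specific heuristics are all assumptions made for illustration; the crate's real types and rules are not shown in this index.

```rust
/// Hypothetical classification of a fetched body.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BodyKind {
    Html,   // run the HTML extractor
    Json,   // pretty-print
    Text,   // pass through verbatim
    Binary, // summarize only
}

/// Prefer the Content-Type header; fall back to sniffing the body.
fn classify(content_type: Option<&str>, body: &[u8]) -> BodyKind {
    if let Some(ct) = content_type {
        // Drop parameters such as "; charset=utf-8" and normalize case.
        let mime = ct.split(';').next().unwrap_or("").trim().to_ascii_lowercase();
        if !mime.is_empty() {
            return from_mime(&mime);
        }
    }
    sniff(body)
}

fn from_mime(mime: &str) -> BodyKind {
    if mime == "text/html" || mime == "application/xhtml+xml" {
        BodyKind::Html
    } else if mime == "application/json" || mime.ends_with("+json") {
        BodyKind::Json
    } else if mime.starts_with("text/") {
        BodyKind::Text
    } else {
        BodyKind::Binary
    }
}

/// Crude sniff over the first 512 bytes (an arbitrary window for this sketch).
fn sniff(body: &[u8]) -> BodyKind {
    let head = &body[..body.len().min(512)];
    if head.contains(&0) {
        return BodyKind::Binary; // NUL bytes are a strong sign of binary data
    }
    let text = match std::str::from_utf8(head) {
        Ok(s) => s,
        // A multi-byte character cut off by the window is still text.
        Err(e) if e.error_len().is_none() => {
            std::str::from_utf8(&head[..e.valid_up_to()]).unwrap_or("")
        }
        Err(_) => return BodyKind::Binary,
    };
    let trimmed = text.trim_start();
    let lower = trimmed.to_ascii_lowercase();
    if lower.starts_with("<!doctype html") || lower.starts_with("<html") {
        BodyKind::Html
    } else if trimmed.starts_with('{') || trimmed.starts_with('[') {
        BodyKind::Json // heuristic only; plain text may also start with '['
    } else {
        BodyKind::Text
    }
}
```

In this sketch the header wins whenever it is present, so a body served as application/json is treated as JSON even if it happens to begin with an HTML tag. The sniff is deliberately conservative and defaults to plain text when nothing else matches.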