Crate fetchkit

FetchKit - AI-friendly web content fetching library

This crate provides a reusable library API for fetching web content, with optional conversion of HTML to markdown or plain text, optimized for LLM consumption.

§Quick Start

use fetchkit::{FetchRequest, fetch};

let request = FetchRequest::new("https://example.com").as_markdown();
let response = fetch(request).await?;
println!("Content: {}", response.content.unwrap_or_default());

§Tool Builder

For more control, use the ToolBuilder to configure options:

use fetchkit::{FetchRequest, ToolBuilder};

let tool = ToolBuilder::new()
    .enable_markdown(true)
    .user_agent("MyBot/1.0")
    .block_prefix("https://blocked.example.com")
    .build();

let request = FetchRequest::new("https://example.com");
let response = tool.execute(request).await?;

§HTML Conversion

Convert HTML to markdown or plain text directly:

use fetchkit::{html_to_markdown, html_to_text};

let html = "<h1>Hello</h1><p>World</p>";
let md = html_to_markdown(html);
assert!(md.contains("# Hello"));

let text = html_to_text(html);
assert!(text.contains("Hello"));

§Fetcher System

FetchKit uses a pluggable fetcher system where specialized fetchers handle specific URL patterns. The FetcherRegistry dispatches requests to the appropriate fetcher based on URL matching.
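The dispatch idea described above can be sketched in a few lines. This is a minimal illustration, not fetchkit's implementation: the `UrlHandler` trait, `PrefixHandler`, `CatchAll`, `Registry`, and `example_registry` are hypothetical names invented for this example, and it assumes that matching is a simple prefix test and that the first matching handler wins. The real `Fetcher` trait and `FetcherRegistry` may differ in both shape and matching rules.

```rust
// Illustrative only: every type here is hypothetical, not part of fetchkit.

/// A handler that can claim a URL and report its own name.
trait UrlHandler {
    fn name(&self) -> &'static str;
    fn matches(&self, url: &str) -> bool;
}

/// Claims URLs that start with a fixed prefix (assumed matching rule).
struct PrefixHandler {
    name: &'static str,
    prefix: &'static str,
}

impl UrlHandler for PrefixHandler {
    fn name(&self) -> &'static str {
        self.name
    }
    fn matches(&self, url: &str) -> bool {
        url.starts_with(self.prefix)
    }
}

/// Fallback handler that accepts every URL.
struct CatchAll;

impl UrlHandler for CatchAll {
    fn name(&self) -> &'static str {
        "default"
    }
    fn matches(&self, _url: &str) -> bool {
        true
    }
}

/// Ordered list of handlers; the first one that matches is chosen.
struct Registry {
    handlers: Vec<Box<dyn UrlHandler>>,
}

impl Registry {
    /// Returns the name of the first handler that claims `url`, if any.
    fn dispatch(&self, url: &str) -> Option<&'static str> {
        self.handlers
            .iter()
            .find(|h| h.matches(url))
            .map(|h| h.name())
    }
}

/// Builds a registry with two specialized handlers and a fallback.
fn example_registry() -> Registry {
    Registry {
        handlers: vec![
            Box::new(PrefixHandler { name: "github", prefix: "https://github.com/" }),
            Box::new(PrefixHandler { name: "wikipedia", prefix: "https://en.wikipedia.org/wiki/" }),
            Box::new(CatchAll),
        ],
    }
}
```

In this sketch, specialized handlers are registered before the catch-all, so a URL that no specialized handler claims still resolves to the fallback handler.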

The built-in fetchers are re-exported from the fetchers module; see Re-exports.

Re-exports§

pub use client::batch_fetch;
pub use client::batch_fetch_with_options;
pub use client::fetch;
pub use client::fetch_with_options;
pub use client::FetchOptions;
pub use fetchers::ArXivFetcher;
pub use fetchers::DefaultFetcher;
pub use fetchers::DocsSiteFetcher;
pub use fetchers::Fetcher;
pub use fetchers::FetcherRegistry;
pub use fetchers::GitHubCodeFetcher;
pub use fetchers::GitHubIssueFetcher;
pub use fetchers::GitHubRepoFetcher;
pub use fetchers::HackerNewsFetcher;
pub use fetchers::PackageRegistryFetcher;
pub use fetchers::RSSFeedFetcher;
pub use fetchers::StackOverflowFetcher;
pub use fetchers::TwitterFetcher;
pub use fetchers::WikipediaFetcher;
pub use fetchers::YouTubeFetcher;
pub use file_saver::FileSaveError;
pub use file_saver::FileSaver;
pub use file_saver::LocalFileSaver;
pub use file_saver::SaveResult;

Modules§

client
HTTP client for FetchKit
fetchers
Fetcher system for specialized content fetching
file_saver
File saving abstractions for FetchKit

Structs§

DnsPolicy
Policy for DNS resolution and IP validation
FetchRequest
Request to fetch a URL
FetchResponse
Response from a fetch operation
PageLink
A link extracted from the page with its text and href.
PageMetadata
Structured metadata extracted from an HTML page.
Tool
Configured FetchKit tool
ToolBuilder
Builder for configuring the FetchKit tool
ToolExecution
Single-use runtime execution for one tool call.
ToolImage
Output image returned by the toolkit-library contract.
ToolOutput
Structured tool output for the toolkit-library contract.
ToolOutputMetadata
Consumer-only metadata returned by the toolkit-library contract.
ToolService
Generic JSON args → JSON result service.
ToolStatus
Status update during tool execution

Enums§

FetchError
Errors that can occur during fetch operations
HttpMethod
HTTP method for the request
ToolError
Errors returned by the toolkit-library contract surface.

Constants§

DEFAULT_USER_AGENT
Default User-Agent string
TOOL_DESCRIPTION
Backward-compatible full description string with file-saving enabled.

Statics§

TOOL_LLMTXT
Backward-compatible help document with file-saving enabled.

Functions§

extract_headings
Extract headings in a second pass, which is cheap because headings are sparse. Called after the main metadata extraction so that the main extraction function stays simple.
extract_metadata
Extract structured metadata from HTML in a single pass.
html_to_markdown
Convert HTML to markdown
html_to_text
Convert HTML to plain text
strip_boilerplate
Strip boilerplate elements from HTML, keeping only main content.