scrapling-fetch 0.1.0

HTTP fetcher with TLS impersonation for scrapling

HTTP fetching layer for the scrapling-rs web scraping framework.

This crate handles everything between "I have a URL" and "I have a parsed HTML response." It builds on [wreq], a TLS-fingerprint-aware HTTP client, to send requests that look like they come from real browsers. It also retries failed requests automatically, rotates proxies, and wraps each result in a [Response] that lazily parses the HTML body into a scrapling Selector.
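
To make the retry and proxy rotation behavior concrete, here is a minimal sketch of the general technique using only the standard library. It is not this crate's implementation: `RoundRobin`, `AttemptError`, `is_proxy_failure`, and `fetch_with_rotation` are hypothetical names, the proxy URLs are placeholders, and the sketch is synchronous while the crate itself is async. It assumes that only proxy-related failures should trigger a switch to the next proxy.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin pool; NOT the crate's actual ProxyRotator.
struct RoundRobin {
    proxies: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    fn new(proxies: Vec<String>) -> Self {
        assert!(!proxies.is_empty(), "proxy pool must not be empty");
        Self { proxies, next: AtomicUsize::new(0) }
    }

    /// Returns the next proxy, wrapping around at the end of the pool.
    fn next_proxy(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.proxies.len();
        &self.proxies[i]
    }
}

/// Hypothetical error type separating proxy failures from everything else.
#[derive(Debug, PartialEq)]
enum AttemptError {
    Proxy(String),
    Other(String),
}

/// Hypothetical classifier: true when the failure is the proxy's fault.
fn is_proxy_failure(err: &AttemptError) -> bool {
    matches!(err, AttemptError::Proxy(_))
}

/// Tries up to `max_attempts` times, moving to the next proxy after each
/// proxy failure. Any other error is returned immediately.
fn fetch_with_rotation<F>(
    pool: &RoundRobin,
    max_attempts: usize,
    mut attempt: F,
) -> Result<String, AttemptError>
where
    F: FnMut(&str) -> Result<String, AttemptError>,
{
    let mut last_err = AttemptError::Other("no attempts made".into());
    for _ in 0..max_attempts {
        let proxy = pool.next_proxy();
        match attempt(proxy) {
            Ok(body) => return Ok(body),
            Err(e) if is_proxy_failure(&e) => last_err = e,
            Err(e) => return Err(e),
        }
    }
    Err(last_err)
}

fn main() {
    // Placeholder proxy addresses, used only for illustration.
    let pool = RoundRobin::new(vec![
        "http://proxy-a:8080".into(),
        "http://proxy-b:8080".into(),
    ]);
    // Simulated request: the first proxy is "down", the second works.
    let result = fetch_with_rotation(&pool, 3, |proxy| {
        if proxy.contains("proxy-a") {
            Err(AttemptError::Proxy(format!("{proxy} unreachable")))
        } else {
            Ok(format!("fetched via {proxy}"))
        }
    });
    println!("{result:?}");
}
```

The request itself is passed in as a closure so that the rotation logic can be exercised without network access.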

Crate architecture

The crate is organized into the following modules:

| Module | Purpose |
|--------|---------|
| [client] | The two main entry points: [Fetcher] (stateless, one client per request) and [FetcherSession] (persistent client with cookie jar). Also defines [RequestConfig] for per-request overrides. |
| [config] | Configuration types shared across the crate: [FetcherConfig], its builder, [Impersonate] strategy, [FollowRedirects] policy, and [ParserConfig]. |
| [error] | A single [FetchError] enum and a [Result] type alias that every fallible function in this crate returns. |
| [fingerprint] | Generates realistic browser headers (User-Agent, Sec-Ch-Ua, etc.) so that requests survive bot-detection checks. |
| [proxy] | [Proxy] specification, [ProxyRotator] for cycling through a pool of proxies, and helpers like [is_proxy_error]. |
| [response] | The [Response] struct -- holds status, headers, cookies, and the raw body bytes. Provides lazy HTML parsing, CSS queries, and Markdown/text conversion. |
| [status] | A lookup table that maps HTTP status codes to their standard reason phrases. |
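
The lazy HTML parsing attributed to [response] can be illustrated with a small standard-library sketch: keep the raw bytes, and parse them only on first access. This is an assumption about the general pattern, not the crate's code. `LazyResponse`, `ParsedDoc`, and the toy `parse` function are hypothetical stand-ins; the real [Response] parses into a scrapling Selector.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for a parsed document.
struct ParsedDoc {
    title: Option<String>,
}

/// Hypothetical response type: raw body bytes plus a parse cache.
struct LazyResponse {
    body: Vec<u8>,
    parsed: OnceLock<ParsedDoc>,
}

impl LazyResponse {
    fn new(body: Vec<u8>) -> Self {
        Self { body, parsed: OnceLock::new() }
    }

    /// Parses the body on the first call and returns the cached result
    /// on every later call.
    fn document(&self) -> &ParsedDoc {
        self.parsed.get_or_init(|| parse(&self.body))
    }

    /// Reports whether parsing has already happened.
    fn is_parsed(&self) -> bool {
        self.parsed.get().is_some()
    }
}

/// Toy "parser": extracts the text between <title> and </title>.
/// A real HTML parser would replace this.
fn parse(body: &[u8]) -> ParsedDoc {
    let text = String::from_utf8_lossy(body);
    let title = text.find("<title>").and_then(|start| {
        let rest = &text[start + "<title>".len()..];
        rest.find("</title>").map(|end| rest[..end].to_string())
    });
    ParsedDoc { title }
}

fn main() {
    let resp = LazyResponse::new(b"<html><title>Example</title></html>".to_vec());
    // Nothing has been parsed yet; only the raw bytes are held.
    assert!(!resp.is_parsed());
    println!("{:?}", resp.document().title);
    // The first access triggered the parse and cached the result.
    assert!(resp.is_parsed());
}
```

With this shape, callers that only need the status or the raw bytes never pay the parsing cost.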

Quick start

use scrapling_fetch::Fetcher;

#[tokio::main]
async fn main() -> scrapling_fetch::Result<()> {
    // Stateless fetcher: one client per request.
    let fetcher = Fetcher::new();
    // `None` means no per-request overrides.
    let response = fetcher.get("https://example.com", None).await?;
    // The HTML body is parsed lazily, on the first CSS query.
    let titles = response.css("title");
    if let Some(title) = titles.first() {
        println!("{}", title.text());
    }
    Ok(())
}