Crate scrapling_fetch

HTTP fetching layer for the scrapling-rs web scraping framework.

This crate handles everything between “I have a URL” and “I have a parsed HTML response.” It builds on wreq (a TLS-fingerprint-aware HTTP client) to send requests that look like they come from real browsers. It also retries failed requests automatically, rotates proxies, and wraps each result in a Response that lazily parses the HTML body into a scrapling Selector.
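
To illustrate what "lazily parses" means in practice, here is a minimal, std-only sketch of the pattern: keep the raw bytes and parse them only on first access, caching the result. The names `LazyResponse`, `ParsedDoc`, and `parse` are hypothetical and are not part of this crate; the real Response produces a scrapling Selector and may be implemented differently.

```rust
use std::cell::OnceCell;

/// Hypothetical stand-in for a parsed document (the real crate yields a Selector).
#[derive(Debug, PartialEq)]
struct ParsedDoc {
    title: Option<String>,
}

/// Hypothetical response type: holds raw bytes and parses them at most once.
struct LazyResponse {
    body: Vec<u8>,
    parsed: OnceCell<ParsedDoc>,
}

impl LazyResponse {
    fn new(body: Vec<u8>) -> Self {
        Self { body, parsed: OnceCell::new() }
    }

    /// Parses on the first call; later calls return the cached value.
    fn document(&self) -> &ParsedDoc {
        self.parsed.get_or_init(|| parse(&self.body))
    }

    /// Reports whether parsing has already happened.
    fn is_parsed(&self) -> bool {
        self.parsed.get().is_some()
    }
}

/// Toy "parser" for illustration only: extracts the text between <title> and </title>.
fn parse(body: &[u8]) -> ParsedDoc {
    let text = String::from_utf8_lossy(body);
    let title = text.find("<title>").and_then(|start| {
        let rest = &text[start + "<title>".len()..];
        rest.find("</title>").map(|end| rest[..end].to_string())
    });
    ParsedDoc { title }
}
```

With this shape, a caller that only inspects status or headers never pays the parsing cost, while repeated queries against the same response reuse one parsed document.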

§Crate architecture

The crate is organized into the following modules:

| Module | Purpose |
|---|---|
| client | The two main entry points: Fetcher (stateless, one client per request) and FetcherSession (persistent client with cookie jar). Also defines RequestConfig for per-request overrides. |
| config | Configuration types shared across the crate: FetcherConfig, its builder, Impersonate strategy, FollowRedirects policy, and ParserConfig. |
| error | A single FetchError enum and a Result type alias that every fallible function in this crate returns. |
| fingerprint | Generates realistic browser headers (User-Agent, Sec-Ch-Ua, etc.) so that requests survive bot-detection checks. |
| proxy | Proxy specification, ProxyRotator for cycling through a pool of proxies, and helpers like is_proxy_error. |
| response | The Response struct – holds status, headers, cookies, and the raw body bytes. Provides lazy HTML parsing, CSS queries, and Markdown/text conversion. |
| status | A lookup table that maps HTTP status codes to their standard reason phrases. |
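
The proxy module's "cycling through a pool of proxies" can be pictured as round-robin selection. The sketch below is an assumption about the general technique, not the crate's ProxyRotator or cyclic_rotation; the `RoundRobin` type and its methods are hypothetical and use only the standard library.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin rotator; not the crate's ProxyRotator.
struct RoundRobin {
    proxies: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    /// Returns None for an empty pool so that `next_proxy` can never divide by zero.
    fn new(proxies: Vec<String>) -> Option<Self> {
        if proxies.is_empty() {
            None
        } else {
            Some(Self { proxies, next: AtomicUsize::new(0) })
        }
    }

    /// Hands out proxies in order, wrapping back to the first after the last.
    fn next_proxy(&self) -> &str {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.proxies.len();
        &self.proxies[i]
    }
}
```

An atomic counter lets `next_proxy` take `&self`, so a single rotator could be shared across concurrent requests without a lock.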

§Quick start

use scrapling_fetch::Fetcher;

#[tokio::main]
async fn main() -> scrapling_fetch::Result<()> {
    // Stateless fetcher: one client per request.
    let fetcher = Fetcher::new();
    let response = fetcher.get("https://example.com", None).await?;
    // Run a CSS query against the lazily parsed HTML body.
    let titles = response.css("title");
    // Print the text of the first match, if any.
    if let Some(title) = titles.first() {
        println!("{}", title.text());
    }
    Ok(())
}

Re-exports§

pub use client::Fetcher;
pub use client::FetcherSession;
pub use client::RequestConfig;
pub use config::FetcherConfig;
pub use config::FetcherConfigBuilder;
pub use config::FollowRedirects;
pub use config::Impersonate;
pub use config::ParserConfig;
pub use error::FetchError;
pub use error::Result;
pub use proxy::Proxy;
pub use proxy::ProxyRotator;
pub use proxy::cyclic_rotation;
pub use proxy::is_proxy_error;
pub use response::Response;
pub use status::status_text;

Modules§

client
HTTP client implementations for making requests.
config
Configuration types for the HTTP fetcher.
error
Error types for the scrapling-fetch crate.
fingerprint
Browser fingerprint generation for stealth HTTP requests.
proxy
Proxy configuration and rotation for HTTP requests.
response
HTTP response type with lazy HTML parsing.
status
HTTP status code to reason phrase mapping.
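
As a rough illustration of a status-code-to-reason-phrase mapping, the sketch below uses a `match` over a handful of common codes. The function name `reason_phrase` and its `Option` return type are assumptions made for this example; the signature of the crate's `status_text` is not shown on this page and may differ.

```rust
/// Hypothetical lookup covering a few common codes; not the crate's `status_text`.
fn reason_phrase(code: u16) -> Option<&'static str> {
    match code {
        200 => Some("OK"),
        201 => Some("Created"),
        204 => Some("No Content"),
        301 => Some("Moved Permanently"),
        302 => Some("Found"),
        304 => Some("Not Modified"),
        400 => Some("Bad Request"),
        401 => Some("Unauthorized"),
        403 => Some("Forbidden"),
        404 => Some("Not Found"),
        429 => Some("Too Many Requests"),
        500 => Some("Internal Server Error"),
        502 => Some("Bad Gateway"),
        503 => Some("Service Unavailable"),
        // Codes outside this short table have no entry in the sketch.
        _ => None,
    }
}
```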