HTTP fetching layer for the scrapling-rs web scraping framework.
This crate handles everything between “I have a URL” and “I have a parsed HTML
response.” It builds on wreq (a TLS-fingerprint-aware HTTP client) to send
requests that look like they come from real browsers. It also retries failed
requests automatically, rotates proxies, and wraps each result in a Response
that lazily parses the HTML body into a scrapling Selector.
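To make the "lazily parses" idea concrete, here is a minimal, standard-library-only sketch of a response type that defers parsing until first access and then caches the result. `LazyResponse`, `ParsedDoc`, and the toy `parse` function are hypothetical names invented for illustration; they are not this crate's `Response` or scrapling's `Selector`, and the real implementation may cache differently.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for a parsed document (not scrapling's `Selector`).
#[derive(Debug)]
struct ParsedDoc {
    title: Option<String>,
}

/// Hypothetical response that parses its body at most once, on first use.
struct LazyResponse {
    body: Vec<u8>,
    parsed: OnceLock<ParsedDoc>,
}

impl LazyResponse {
    fn new(body: Vec<u8>) -> Self {
        Self { body, parsed: OnceLock::new() }
    }

    /// Returns the parsed document; the parser runs only on the first call.
    fn document(&self) -> &ParsedDoc {
        self.parsed.get_or_init(|| parse(&self.body))
    }

    /// Reports whether parsing has already happened.
    fn is_parsed(&self) -> bool {
        self.parsed.get().is_some()
    }
}

/// Toy "parser" (assumption): extracts the text between <title> and </title>.
fn parse(body: &[u8]) -> ParsedDoc {
    let cow = String::from_utf8_lossy(body);
    let text: &str = &cow;
    let title = text.find("<title>").and_then(|start| {
        let rest = &text[start + "<title>".len()..];
        rest.find("</title>").map(|end| rest[..end].to_string())
    });
    ParsedDoc { title }
}

fn main() {
    let response = LazyResponse::new(b"<html><title>Example</title></html>".to_vec());
    println!("parsed before access: {}", response.is_parsed());
    println!("title: {:?}", response.document().title);
    println!("parsed after access: {}", response.is_parsed());
}
```

The point of the pattern is that callers who only need the status or raw bytes never pay for HTML parsing.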
§Crate architecture
The crate is organized into the following modules:
| Module | Purpose |
|---|---|
| client | The two main entry points: Fetcher (stateless, one client per request) and FetcherSession (persistent client with cookie jar). Also defines RequestConfig for per-request overrides. |
| config | Configuration types shared across the crate: FetcherConfig, its builder, Impersonate strategy, FollowRedirects policy, and ParserConfig. |
| error | A single FetchError enum and a Result type alias that every fallible function in this crate returns. |
| fingerprint | Generates realistic browser headers (User-Agent, Sec-Ch-Ua, etc.) so that requests survive bot-detection checks. |
| proxy | Proxy specification, ProxyRotator for cycling through a pool of proxies, and helpers like is_proxy_error. |
| response | The Response struct – holds status, headers, cookies, and the raw body bytes. Provides lazy HTML parsing, CSS queries, and Markdown/text conversion. |
| status | A lookup table that maps HTTP status codes to their standard reason phrases. |
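As an illustration of what "cycling through a pool of proxies" can look like, the sketch below implements plain round-robin selection with an atomic counter. `RoundRobin` and `next_proxy` are hypothetical names, and the proxy URLs are placeholders; this is one possible approach under those assumptions, not the actual behavior of ProxyRotator or cyclic_rotation.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical round-robin rotator; the crate's `ProxyRotator` may differ.
struct RoundRobin {
    proxies: Vec<String>,
    next: AtomicUsize,
}

impl RoundRobin {
    fn new(proxies: Vec<String>) -> Self {
        Self { proxies, next: AtomicUsize::new(0) }
    }

    /// Returns the next proxy in order, wrapping around after the last one.
    /// Returns `None` when the pool is empty.
    fn next_proxy(&self) -> Option<&str> {
        if self.proxies.is_empty() {
            return None;
        }
        // `fetch_add` returns the previous value, so the first call yields index 0.
        let index = self.next.fetch_add(1, Ordering::Relaxed) % self.proxies.len();
        Some(self.proxies[index].as_str())
    }
}

fn main() {
    // Placeholder proxy addresses (assumption).
    let rotator = RoundRobin::new(vec![
        "http://proxy-a.example:8080".to_string(),
        "http://proxy-b.example:8080".to_string(),
    ]);
    for _ in 0..3 {
        println!("{:?}", rotator.next_proxy());
    }
}
```

Using an atomic counter lets the rotator be shared behind `&self` across concurrent requests without a lock.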
§Quick start
use scrapling_fetch::Fetcher;

#[tokio::main]
async fn main() -> scrapling_fetch::Result<()> {
    let fetcher = Fetcher::new();
    let response = fetcher.get("https://example.com", None).await?;
    let titles = response.css("title");
    if let Some(title) = titles.first() {
        println!("{}", title.text());
    }
    Ok(())
}
Re-exports§
pub use client::Fetcher;
pub use client::FetcherSession;
pub use client::RequestConfig;
pub use config::FetcherConfig;
pub use config::FetcherConfigBuilder;
pub use config::FollowRedirects;
pub use config::Impersonate;
pub use config::ParserConfig;
pub use error::FetchError;
pub use error::Result;
pub use proxy::Proxy;
pub use proxy::ProxyRotator;
pub use proxy::cyclic_rotation;
pub use proxy::is_proxy_error;
pub use response::Response;
pub use status::status_text;
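The re-exported FetchError and Result follow a common Rust convention: one error enum per crate plus a type alias, so that signatures can be written as `Result<T>`. The sketch below shows that convention with the standard library only. `SketchError`, its variants, and `check_url` are hypothetical and chosen for illustration; the real FetchError variants are not listed on this page.

```rust
use std::fmt;

/// Hypothetical error enum; the variants are illustrative, not the crate's.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidUrl(String),
    Status(u16),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::InvalidUrl(url) => write!(f, "invalid URL: {url}"),
            SketchError::Status(code) => write!(f, "unexpected status: {code}"),
        }
    }
}

impl std::error::Error for SketchError {}

/// Crate-wide alias so fallible functions can return `Result<T>`.
type Result<T> = std::result::Result<T, SketchError>;

/// Hypothetical helper: accepts only http(s) URLs.
fn check_url(url: &str) -> Result<&str> {
    if url.starts_with("http://") || url.starts_with("https://") {
        Ok(url)
    } else {
        Err(SketchError::InvalidUrl(url.to_string()))
    }
}

fn main() -> Result<()> {
    // `?` propagates the single error type through every fallible call.
    let url = check_url("https://example.com")?;
    println!("{url}");
    println!("{}", SketchError::Status(503));
    Ok(())
}
```

With a single error type, `?` works across every fallible call without manual conversions.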
Modules§
- client: HTTP client implementations for making requests.
- config: Configuration types for the HTTP fetcher.
- error: Error types for the scrapling-fetch crate.
- fingerprint: Browser fingerprint generation for stealth HTTP requests.
- proxy: Proxy configuration and rotation for HTTP requests.
- response: HTTP response type with lazy HTML parsing.
- status: HTTP status code to reason phrase mapping.
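A status-code-to-reason-phrase mapping of the kind the status module describes can be sketched as a simple `match`. The function name `reason_phrase`, its `Option` return type, and the small set of codes covered are assumptions made for this example; the signature and coverage of the crate's status_text are not shown on this page.

```rust
/// Hypothetical lookup covering only a handful of codes for illustration.
/// Phrases follow the standard HTTP reason phrases.
fn reason_phrase(code: u16) -> Option<&'static str> {
    match code {
        200 => Some("OK"),
        301 => Some("Moved Permanently"),
        404 => Some("Not Found"),
        429 => Some("Too Many Requests"),
        500 => Some("Internal Server Error"),
        503 => Some("Service Unavailable"),
        // Codes outside this small table are reported as unknown.
        _ => None,
    }
}

fn main() {
    for code in [200u16, 404, 799] {
        match reason_phrase(code) {
            Some(phrase) => println!("{code} {phrase}"),
            None => println!("{code} (unknown)"),
        }
    }
}
```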