scrapling_fetch/lib.rs
//! HTTP fetching layer for the scrapling-rs web scraping framework.
//!
//! This crate handles everything between "I have a URL" and "I have a parsed HTML
//! response." It builds on [`wreq`] (a TLS-fingerprint-aware HTTP client) to make
//! requests that look like they come from real browsers. It also retries
//! automatically on failure, rotates proxies, and wraps the result in a
//! [`Response`] that lazily parses the HTML body into a scrapling
//! [`Selector`](scrapling::selector::Selector).
//!
//! # Crate architecture
//!
//! The crate is organized into the following modules:
//!
//! | Module | Purpose |
//! |---|---|
//! | [`client`] | The two main entry points: [`Fetcher`] (stateless, one client per request) and [`FetcherSession`] (persistent client with cookie jar). Also defines [`RequestConfig`] for per-request overrides. |
//! | [`config`] | Configuration types shared across the crate: [`FetcherConfig`], its builder, [`Impersonate`] strategy, [`FollowRedirects`] policy, and [`ParserConfig`]. |
//! | [`error`] | A single [`FetchError`] enum and a [`Result`] type alias that every fallible function in this crate returns. |
//! | [`fingerprint`] | Generates realistic browser headers (User-Agent, Sec-Ch-Ua, etc.) so that requests survive bot-detection checks. |
//! | [`proxy`] | [`Proxy`] specification, [`ProxyRotator`] for cycling through a pool of proxies, and helpers like [`is_proxy_error`]. |
//! | [`response`] | The [`Response`] struct -- holds status, headers, cookies, and the raw body bytes. Provides lazy HTML parsing, CSS queries, and Markdown/text conversion. |
//! | [`status`] | A lookup table that maps HTTP status codes to their standard reason phrases. |
//!
//! # Quick start
//!
//! ```rust,ignore
//! use scrapling_fetch::Fetcher;
//!
//! #[tokio::main]
//! async fn main() -> scrapling_fetch::Result<()> {
//!     let fetcher = Fetcher::new();
//!     let response = fetcher.get("https://example.com", None).await?;
//!     let titles = response.css("title");
//!     if let Some(title) = titles.first() {
//!         println!("{}", title.text());
//!     }
//!     Ok(())
//! }
//! ```

pub mod client;
pub mod config;
pub mod error;
pub mod fingerprint;
pub mod proxy;
pub mod response;
pub mod status;

pub use client::{Fetcher, FetcherSession, RequestConfig};
pub use config::{FetcherConfig, FetcherConfigBuilder, FollowRedirects, Impersonate, ParserConfig};
pub use error::{FetchError, Result};
pub use proxy::{Proxy, ProxyRotator, cyclic_rotation, is_proxy_error};
pub use response::Response;
pub use status::status_text;