//! Browser automation crate for the scrapling-rs web scraping framework.
//!
//! This crate provides high-level browser automation built on top of Playwright, giving
//! you two session types for fetching fully-rendered web pages:
//!
//! - [`DynamicSession`] -- a standard Playwright-driven browser that executes JavaScript,
//!   waits for network activity to settle, and returns the final DOM. Use this when the
//!   target site does not employ bot detection.
//!
//! - [`StealthySession`] -- extends `DynamicSession` with anti-detection measures such as
//!   WebRTC leak prevention, canvas fingerprint noise, automation-flag removal, and an
//!   automatic Cloudflare Turnstile solver. Use this when sites actively block headless
//!   browsers.
//!
//! # Architecture overview
//!
//! ```text
//!                        ┌──────────────┐
//!                        │  Your code   │
//!                        └──────┬───────┘
//!                               │ .fetch(url)
//!              ┌────────────────┴─────────────────┐
//!              │ DynamicSession / StealthySession │  (fetcher.rs)
//!              └────────────────┬─────────────────┘
//!                               │
//!             ┌─────────────────┼─────────────────┐
//!             ▼                 ▼                 ▼
//!         engine.rs       intercept.rs      page_pool.rs
//!       (launch opts)  (request blocking)  (page tracking)
//!             │                 │
//!             ▼                 ▼
//!       constants.rs      ad_domains.rs
//!        (CLI flags)    (blocklist data)
//! ```
//!
//! Configuration starts with [`BrowserConfig`] (or [`StealthConfig`] for stealth sessions).
//! Per-request overrides are expressed via [`FetchParams`], which are merged with the
//! session-level config into [`ResolvedFetchParams`] before each navigation.
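//!
//! The merge step can be pictured with a minimal, self-contained sketch. The types
//! below (`SessionDefaults`, `Overrides`, `Resolved`) and their fields are hypothetical
//! stand-ins, not this crate's actual definitions. The sketch only assumes that an unset
//! per-request field falls back to the session-level value, and that passing `None`
//! means "no overrides at all".
//!
//! ```rust
//! /// Hypothetical session-level defaults (a stand-in for a session config).
//! #[derive(Debug, Clone, PartialEq)]
//! struct SessionDefaults {
//!     timeout_ms: u64,
//!     disable_resources: bool,
//! }
//!
//! /// Hypothetical per-request overrides; `None` means "use the session value".
//! #[derive(Debug, Default)]
//! struct Overrides {
//!     timeout_ms: Option<u64>,
//!     disable_resources: Option<bool>,
//! }
//!
//! /// Hypothetical fully resolved parameters for a single navigation.
//! #[derive(Debug, PartialEq)]
//! struct Resolved {
//!     timeout_ms: u64,
//!     disable_resources: bool,
//! }
//!
//! /// Layer the overrides on top of the defaults, field by field.
//! fn resolve(defaults: &SessionDefaults, overrides: Option<&Overrides>) -> Resolved {
//!     Resolved {
//!         timeout_ms: overrides
//!             .and_then(|o| o.timeout_ms)
//!             .unwrap_or(defaults.timeout_ms),
//!         disable_resources: overrides
//!             .and_then(|o| o.disable_resources)
//!             .unwrap_or(defaults.disable_resources),
//!     }
//! }
//!
//! fn main() {
//!     let defaults = SessionDefaults { timeout_ms: 30_000, disable_resources: true };
//!
//!     // Only the timeout is overridden; the other field keeps the session value.
//!     let overrides = Overrides { timeout_ms: Some(5_000), ..Default::default() };
//!     let resolved = resolve(&defaults, Some(&overrides));
//!     assert_eq!(resolved.timeout_ms, 5_000);
//!     assert!(resolved.disable_resources);
//!
//!     // With no overrides, the resolved values equal the session defaults.
//!     let resolved = resolve(&defaults, None);
//!     assert_eq!(resolved.timeout_ms, 30_000);
//! }
//! ```
//!
//! Modelling each override as an `Option` keeps "not specified" distinct from an
//! explicit `false` or `0`, which a plain struct with default values cannot express.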
//!
//! After navigation completes, the [`response_factory`] module extracts the page's HTML,
//! status code, headers, and cookies into a unified [`scrapling_fetch::Response`] that the
//! rest of the scrapling pipeline can parse and query.
//!
//! # Quick example
//!
//! ```rust,no_run
//! use scrapling_browser::{BrowserConfig, DynamicSession};
//!
//! # async fn run() -> scrapling_browser::Result<()> {
//! let config = BrowserConfig {
//!     headless: true,
//!     disable_resources: true,
//!     ..Default::default()
//! };
//!
//! let mut session = DynamicSession::new(config)?;
//! session.start().await?;
//!
//! let response = session.fetch("https://example.com", None).await?;
//! println!("status: {}", response.status);
//!
//! session.close().await?;
//! # Ok(())
//! # }
//! ```
pub use ;
pub use ;
pub use ;
pub use ;