Browser automation crate for the scrapling-rs web scraping framework.
This crate provides high-level browser automation built on top of Playwright, giving you two session types for fetching fully-rendered web pages:
- [DynamicSession] -- a standard Playwright-driven browser that executes JavaScript, waits for network activity to settle, and returns the final DOM. Use this when the target site does not employ bot detection.
- [StealthySession] -- extends DynamicSession with anti-detection measures such as WebRTC leak prevention, canvas fingerprint noise, automation-flag removal, and an automatic Cloudflare Turnstile solver. Use this when sites actively block headless browsers.
Architecture overview
                 ┌──────────────┐
                 │  Your code   │
                 └──────┬───────┘
                        │ .fetch(url)
       ┌────────────────┴─────────────────┐
       │ DynamicSession / StealthySession │  (fetcher.rs)
       └────────────────┬─────────────────┘
                        │
      ┌─────────────────┼─────────────────┐
      ▼                 ▼                 ▼
  engine.rs       intercept.rs      page_pool.rs
(launch opts)  (request blocking)  (page tracking)
      │                 │
      ▼                 ▼
constants.rs      ad_domains.rs
 (CLI flags)    (blocklist data)
Configuration starts with [BrowserConfig] (or [StealthConfig] for stealth sessions).
Per-request overrides are expressed via [FetchParams], which are merged with the
session-level config into [ResolvedFetchParams] before each navigation.
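The merge step can be pictured as "per-request value if present, otherwise the session default". The sketch below is only an illustration of that pattern: `SessionDefaults`, `Overrides`, `Resolved`, `resolve`, and every field name are hypothetical stand-ins, not the actual definitions of [BrowserConfig], [FetchParams], or [ResolvedFetchParams].

```rust
use std::time::Duration;

/// Hypothetical session-level defaults (stand-in for the session config).
#[derive(Debug, Clone)]
struct SessionDefaults {
    timeout: Duration,
    wait_for_network_idle: bool,
    block_ads: bool,
}

/// Hypothetical per-request overrides.
/// `None` means "inherit the session-level value".
#[derive(Debug, Clone, Default)]
struct Overrides {
    timeout: Option<Duration>,
    wait_for_network_idle: Option<bool>,
    block_ads: Option<bool>,
}

/// Hypothetical fully resolved parameters: every field has a concrete value.
#[derive(Debug, Clone, PartialEq)]
struct Resolved {
    timeout: Duration,
    wait_for_network_idle: bool,
    block_ads: bool,
}

/// Merge overrides onto defaults, field by field.
/// An override wins when it is `Some`; otherwise the default is used.
fn resolve(defaults: &SessionDefaults, overrides: &Overrides) -> Resolved {
    Resolved {
        timeout: overrides.timeout.unwrap_or(defaults.timeout),
        wait_for_network_idle: overrides
            .wait_for_network_idle
            .unwrap_or(defaults.wait_for_network_idle),
        block_ads: overrides.block_ads.unwrap_or(defaults.block_ads),
    }
}
```

Modeling overrides as `Option` fields keeps "not specified" distinct from "explicitly set to the default", so a request that sets only `timeout` leaves the remaining session settings untouched.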
After navigation completes, the [response_factory] module extracts the page's HTML,
status code, headers, and cookies into a unified [scrapling_fetch::Response] that the
rest of the scrapling pipeline can parse and query.
Quick example
use ;
# async