pub struct SpiderPage { /* private fields */ }
Browser tab abstraction with full automation API.
Wraps a ProtocolAdapter and exposes high-level navigation, content
extraction, click/input/scroll primitives, wait helpers, and viewport
control. The adapter can be swapped atomically via [set_adapter] during
browser rotation without dropping inflight references.
Implementations
impl SpiderPage
pub fn new(adapter: ProtocolAdapter) -> Self
Create a new SpiderPage wrapping the given protocol adapter.
pub fn from_arc(adapter: Arc<ProtocolAdapter>) -> Self
Create a new SpiderPage from an already-Arc-wrapped adapter.
pub async fn goto_fast(&self, url: &str) -> Result<()>
Navigate without waiting for full page load (5 s max wait).
Use with [content_with_early_return] for SPAs that never fire
loadEventFired.
pub async fn goto_dom(&self, url: &str) -> Result<()>
Navigate and return as soon as DOMContentLoaded fires (3 s max).
Fastest option – the DOM shell is ready but subresources may still
load. Pair with [content_with_early_return] or
[content_with_network_idle] for best results.
pub async fn go_forward(&self) -> Result<()>
Go forward in browser history.
pub async fn content(&self, wait_ms: u64, min_length: usize) -> Result<String>
Get the full page HTML, ensuring the page is ready first.
Waits for network idle + DOM stability, then checks content quality. If the content seems incomplete (too short or looks like a loading state), does incremental waits with exponential backoff before returning.
- wait_ms – Max time to wait for readiness, in milliseconds (default 8000). Pass 0 to skip readiness checks and return immediately.
- min_length – Minimum content length to consider “good” (default 1000).
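The description above mentions incremental waits with exponential backoff but does not give the starting delay or the growth factor. The following is a minimal sketch of one way such a schedule could be built, assuming a doubling factor and a total budget equal to wait_ms. The helper name backoff_schedule is hypothetical and is not part of this crate.

```rust
use std::time::Duration;

/// Hypothetical helper (not part of the crate): builds a list of waits that
/// double each step, clamped so that their sum never exceeds `budget`.
fn backoff_schedule(initial: Duration, budget: Duration) -> Vec<Duration> {
    let mut waits = Vec::new();
    let mut next = initial;
    let mut spent = Duration::ZERO;
    // A zero budget yields an empty schedule, i.e. no waiting at all.
    while spent < budget && !next.is_zero() {
        // Clamp the final step so the total stays within the budget.
        let step = next.min(budget - spent);
        waits.push(step);
        spent += step;
        // Assumed growth factor of 2; the real factor is not documented.
        next = next.saturating_mul(2);
    }
    waits
}
```

A caller would sleep for each entry in turn and re-check content quality after every sleep, stopping as soon as the content looks complete.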
pub async fn raw_content(&self) -> Result<String>
Get the raw page HTML without any readiness waiting. Use this when you need immediate access or have already waited.
pub async fn content_with_early_return(
    &self,
    max_wait_ms: u64,
    min_content_length: usize,
    poll_interval_ms: u64,
) -> Result<String>
Poll for content with early return – for SPAs that never fire
loadEventFired.
Instead of waiting for a full page load event, this polls for HTML content at regular intervals and returns as soon as sufficient content is available. Useful for timeout retries where the page loads data asynchronously.
- max_wait_ms – Max time to poll (default 15 s).
- min_content_length – Minimum HTML length to accept (default 500).
- poll_interval_ms – Interval between polls (default 2 s).
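The polling behaviour described above can be illustrated with a small synchronous sketch. It is not the crate's implementation: poll_until_sufficient and the fetch_html closure are hypothetical stand-ins, and returning the last HTML seen once the deadline passes is an assumption about timeout behaviour.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the polling loop. `fetch_html` represents
/// whatever returns the current page HTML.
fn poll_until_sufficient<F>(
    mut fetch_html: F,
    max_wait: Duration,
    min_content_length: usize,
    poll_interval: Duration,
) -> String
where
    F: FnMut() -> String,
{
    let deadline = Instant::now() + max_wait;
    loop {
        let html = fetch_html();
        // Early return: enough content is present, or the budget is spent.
        // Returning the last HTML on timeout is an assumption of this sketch.
        if html.len() >= min_content_length || Instant::now() >= deadline {
            return html;
        }
        sleep(poll_interval);
    }
}
```

The loop checks content before the first sleep, so a page that is already populated returns without any delay.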
pub async fn content_with_network_idle(
    &self,
    max_wait_ms: u64,
    min_content_length: usize,
    interstitial_budget_ms: u64,
) -> Result<String>
Get content using network idle detection + polling hybrid approach.
Best for heavy SPAs: uses PerformanceObserver + MutationObserver
to detect when the page stops loading, combined with content-length
thresholds.
Strategy:
- Wait for readyState=interactive (DOM parsed)
- Start network+DOM idle monitoring (400 ms silence threshold)
- Poll HTML length – return early if sufficient + idle
- Interstitial detection with configurable wait budget
- max_wait_ms – Max total time to wait (default 20 s).
- min_content_length – Minimum HTML length to accept (default 1000).
- interstitial_budget_ms – Max time to wait for interstitials to resolve (default 16 s, use 30 s for retries).
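To make the interplay of the three parameters concrete, here is a sketch of the per-poll decision implied by the strategy above. The Sample and Decision types and the decide function are hypothetical, and the order in which the checks run is an assumption. Only the 400 ms silence threshold comes from the text.

```rust
use std::time::Duration;

/// Silence threshold quoted in the strategy above.
const IDLE_THRESHOLD: Duration = Duration::from_millis(400);

/// One observation taken by a hypothetical polling loop.
struct Sample {
    /// Total time since polling began.
    elapsed: Duration,
    /// Time since the last network or DOM activity was observed.
    quiet_for: Duration,
    /// Current HTML length.
    html_len: usize,
    /// Whether the HTML looks like a challenge interstitial.
    is_interstitial: bool,
}

#[derive(Debug, PartialEq)]
enum Decision {
    Return,
    KeepWaiting,
}

/// Assumed ordering: overall budget first, then interstitial budget,
/// then the idle + length check.
fn decide(
    sample: &Sample,
    max_wait: Duration,
    min_content_length: usize,
    interstitial_budget: Duration,
) -> Decision {
    // Hard stop: the overall budget is spent, so return whatever is there.
    if sample.elapsed >= max_wait {
        return Decision::Return;
    }
    // Give a challenge page time to resolve, but only within its own budget.
    if sample.is_interstitial && sample.elapsed < interstitial_budget {
        return Decision::KeepWaiting;
    }
    let idle = sample.quiet_for >= IDLE_THRESHOLD;
    if idle && sample.html_len >= min_content_length {
        Decision::Return
    } else {
        Decision::KeepWaiting
    }
}
```

Under these assumptions, a page with 2000 bytes of HTML and 500 ms of silence returns immediately, while the same page flagged as an interstitial keeps waiting until interstitial_budget_ms elapses.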
pub async fn screenshot(&self) -> Result<String>
Capture a screenshot as base64 PNG.
pub async fn evaluate(&self, expression: &str) -> Result<Value>
Evaluate arbitrary JavaScript and return the result.
pub async fn click_at(&self, x: f64, y: f64) -> Result<()>
Click at specific viewport coordinates.
pub async fn dblclick(&self, selector: &str) -> Result<()>
Double-click an element by CSS selector.
pub async fn right_click(&self, selector: &str) -> Result<()>
Right-click an element by CSS selector.
pub async fn click_and_hold(&self, selector: &str, hold_ms: u64) -> Result<()>
Click and hold an element for a duration.
Useful for long-press interactions, drag initiation, and mobile-style gestures.
- selector – CSS selector of the element.
- hold_ms – Duration in milliseconds to hold (default 1000).
pub async fn click_and_hold_at(
    &self,
    x: f64,
    y: f64,
    hold_ms: u64,
) -> Result<()>
Click and hold at specific viewport coordinates for a duration.
- x – X coordinate (CSS pixels).
- y – Y coordinate (CSS pixels).
- hold_ms – Duration in milliseconds to hold (default 1000).
pub async fn click_all(&self, selector: &str) -> Result<()>
Click all elements matching a selector.
pub async fn fill(&self, selector: &str, value: &str) -> Result<()>
Fill a form field – focus, clear existing value, type new value.
pub async fn type_text(&self, value: &str) -> Result<()>
Type text into the currently focused element.
pub async fn press(&self, key: &str) -> Result<()>
Press a named key (e.g. “Enter”, “Tab”, “Escape”).
pub async fn select(&self, selector: &str, value: &str) -> Result<()>
Select an option in a <select> element.
pub async fn drag(&self, from_selector: &str, to_selector: &str) -> Result<()>
Drag from one element to another.
pub async fn scroll_y(&self, pixels: i64) -> Result<()>
Scroll vertically by pixels (positive = down).
pub async fn scroll_x(&self, pixels: i64) -> Result<()>
Scroll horizontally by pixels (positive = right).
pub async fn scroll_to_point(&self, x: f64, y: f64) -> Result<()>
Scroll to absolute page coordinates.
pub async fn wait_for_selector(
    &self,
    selector: &str,
    timeout_ms: u64,
) -> Result<()>
Wait for a CSS selector to appear in the DOM.
Wait for navigation/page load (simple delay).
pub async fn wait_for_ready(&self, timeout_ms: u64) -> Result<()>
Wait until the page is fully loaded and DOM is stable.
Checks:
- document.readyState === 'complete'
- DOM content length stabilizes (no changes for 500 ms)
Use after goto() for SPAs and dynamic pages to ensure all
content is rendered before extracting HTML.
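The "length stabilizes" check can be pictured as a small tracker that resets whenever the observed length changes. This is a sketch under assumptions rather than the crate's code: StabilityTracker is a hypothetical type, and the caller is assumed to sample the DOM length at some interval and pass that interval in.

```rust
use std::time::Duration;

/// Hypothetical tracker: records how long an observed length has stayed
/// unchanged across successive samples.
struct StabilityTracker {
    last_len: Option<usize>,
    unchanged_for: Duration,
}

impl StabilityTracker {
    fn new() -> Self {
        Self { last_len: None, unchanged_for: Duration::ZERO }
    }

    /// Record an observation taken `since_last` after the previous one.
    /// Returns true once the length has been unchanged for at least `window`.
    fn observe(&mut self, len: usize, since_last: Duration, window: Duration) -> bool {
        if self.last_len == Some(len) {
            // Same length as before: extend the quiet period.
            self.unchanged_for += since_last;
        } else {
            // Length changed (or first sample): restart the quiet period.
            self.last_len = Some(len);
            self.unchanged_for = Duration::ZERO;
        }
        self.unchanged_for >= window
    }
}
```

With a 500 ms window and samples every 250 ms, the tracker reports stability on the third consecutive sample that shows the same length.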
pub async fn wait_for_content(
    &self,
    min_length: usize,
    timeout_ms: u64,
) -> Result<()>
Wait until page content exceeds a minimum length. Useful for SPAs where content loads asynchronously.
pub async fn wait_for_network_idle(&self, timeout_ms: u64) -> Result<()>
Wait for network idle + DOM stability (cross-platform).
Uses the Performance/Resource Timing API and MutationObserver
(works in both Chrome/CDP and Firefox/BiDi) to detect when:
- document.readyState === 'complete'
- No new network resources loading (PerformanceObserver)
- DOM mutations have settled
This is more comprehensive than [wait_for_ready] – it also
catches lazy-loaded images, XHR/fetch requests, and
script-injected content.
pub async fn set_viewport(
    &self,
    width: u32,
    height: u32,
    device_scale_factor: f64,
    mobile: bool,
) -> Result<()>
Set the viewport dimensions.
pub async fn query_selector(&self, selector: &str) -> Result<Option<String>>
Query a single element and return its outer HTML.
pub async fn query_selector_all(&self, selector: &str) -> Result<Vec<String>>
Query all matching elements and return their outer HTML.
pub async fn text_content(&self, selector: &str) -> Result<Option<String>>
Get text content of an element.
pub async fn extract_fields(
    &self,
    fields: &[(&str, FieldSelector<'_>)],
) -> Result<HashMap<String, Option<String>>>
Extract multiple fields from the page in a single evaluate call.
Each entry maps a key name to a FieldSelector. Returns a map of
key → value (or None if the element was not found).
§Example
use std::collections::HashMap;
use spider_browser::page::FieldSelector;
let data = page.extract_fields(&[
("title", "#productTitle".into()),
("price", ".a-price .a-offscreen".into()),
("image", FieldSelector::Attr {
selector: "#main-image",
attribute: "src",
}),
]).await?;
println!("{:?}", data.get("title"));
pub fn route_message(&self, data: &str)
Route an incoming WebSocket message to the underlying protocol session.
pub fn set_adapter(&self, adapter: ProtocolAdapter)
Replace the adapter (used during browser switching).
Atomically swaps the underlying ProtocolAdapter so that
inflight operations on the old adapter can finish while new
operations use the replacement.
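The swap semantics described above (old references stay valid, new calls see the replacement) can be reproduced with a lock around an Arc. The sketch below uses only the standard library. Adapter and Page are hypothetical stand-ins for ProtocolAdapter and SpiderPage, and the crate may well use a different mechanism internally.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for `ProtocolAdapter`.
struct Adapter {
    name: String,
}

/// Hypothetical holder showing one way to swap a shared adapter via `&self`.
struct Page {
    adapter: RwLock<Arc<Adapter>>,
}

impl Page {
    fn new(adapter: Adapter) -> Self {
        Self { adapter: RwLock::new(Arc::new(adapter)) }
    }

    /// Clone the Arc so the caller keeps its adapter alive independently
    /// of any later swap.
    fn current(&self) -> Arc<Adapter> {
        let guard = self.adapter.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Replace the adapter; clones of the old Arc handed out earlier
    /// remain valid until their holders drop them.
    fn set_adapter(&self, adapter: Adapter) {
        *self.adapter.write().unwrap() = Arc::new(adapter);
    }
}
```

An operation that called current() before the swap continues to use the old adapter, and the old adapter is freed only when the last such clone is dropped.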
pub fn set_adapter_arc(&self, adapter: Arc<ProtocolAdapter>)
Replace the adapter with an already-Arc-wrapped instance.
pub fn is_interstitial_content(html: &str) -> bool
Detect challenge interstitials that may auto-resolve (e.g. Cloudflare “Just a moment…”).
These pages show briefly before redirecting to the real content.
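The actual detection rules are not documented here. As a rough illustration of the idea, a marker-based check might look like the sketch below. The function name and the marker list are assumptions; only the "Just a moment" phrase is taken from the description above.

```rust
/// Hypothetical marker list. Only "just a moment" comes from the docs above;
/// the second entry is an assumed example.
const INTERSTITIAL_MARKERS: &[&str] = &["just a moment", "checking your browser"];

/// Sketch of a heuristic: case-insensitive substring match against
/// known challenge-page phrases.
fn looks_like_interstitial(html: &str) -> bool {
    let lower = html.to_lowercase();
    INTERSTITIAL_MARKERS.iter().any(|marker| lower.contains(marker))
}
```

A substring heuristic like this can misfire on ordinary pages that happen to contain the same phrase, so a real detector would likely combine several signals.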
pub fn is_rate_limit_content(html: &str) -> bool
Detect site-level rate limiting in page content.
Rotating the browser yields a new profile, which bypasses per-session rate limits.
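One way a caller might act on this signal is to classify the returned HTML and decide whether to rotate. The sketch below is illustrative only: NextStep, looks_rate_limited, next_step and the marker phrases are all hypothetical, and the crate's own detection rules are not shown in this page.

```rust
/// Hypothetical outcome of inspecting a page.
#[derive(Debug, PartialEq)]
enum NextStep {
    UseContent,
    RotateBrowser,
}

/// Assumed example phrases; the real detection rules are not documented here.
const RATE_LIMIT_MARKERS: &[&str] = &["too many requests", "rate limit exceeded"];

/// Sketch of a heuristic: case-insensitive substring match.
fn looks_rate_limited(html: &str) -> bool {
    let lower = html.to_lowercase();
    RATE_LIMIT_MARKERS.iter().any(|marker| lower.contains(marker))
}

/// Rotate when the page looks rate limited, otherwise keep the content.
fn next_step(html: &str) -> NextStep {
    if looks_rate_limited(html) {
        NextStep::RotateBrowser
    } else {
        NextStep::UseContent
    }
}
```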