Skip to main content

Crate crw_renderer

Crate crw_renderer 

Source
Expand description

HTTP and headless-browser rendering engine for the CRW web scraper.

Provides a FallbackRenderer that fetches pages via plain HTTP and optionally re-renders them through a CDP-based headless browser when SPA content is detected.

  • http_only — Simple HTTP fetcher using reqwest
  • detector — Heuristic SPA shell detection (empty body, framework markers)
  • cdp — Chrome DevTools Protocol renderer (LightPanda, Playwright, Chrome) (requires cdp feature)
  • traitsPageFetcher trait for pluggable backends

§Feature flags

FlagDescription
cdpEnables CDP WebSocket rendering via tokio-tungstenite

§Example

use crw_core::config::RendererConfig;
use crw_renderer::FallbackRenderer;
use std::collections::HashMap;

use crw_core::config::StealthConfig;
let config = RendererConfig::default();
let stealth = StealthConfig::default();
let renderer = FallbackRenderer::new(&config, "crw/0.1", None, &stealth)?;
let deadline = crw_core::Deadline::from_request_ms(8000);
let result = renderer.fetch("https://example.com", &HashMap::new(), None, None, None, deadline).await?;
println!("status: {}", result.status_code);

Modules§

blocklist
Resource blocklist for CDP Fetch.requestPaused interception.
breaker
Sliding-window circuit breaker with multi-probe half-open and linear-back-off cooldown. See plans/breaker-cascade-fix.md Iter 3.
detector
host_limiter
Process-wide per-host rate-limiter and concurrency cap.
http_only
preference
Per-host renderer preference learning.
traits

Structs§

FallbackRenderer
Composite renderer that tries multiple backends in order.

Statics§

REQUEST_COUNTRY
Per-request country code (ISO 3166-1 alpha-2, lowercase) for the chrome_proxy tier’s CDP auth pump. Set by FallbackRenderer::fetch when a ScrapeRequest.country is present; read in cdp.rs while composing DataImpulse credentials. Task-local so child tasks spawned by the pool inherit it without trait-signature churn.