pub async fn scrape_multiple_with_limit(
    client: &ClientWithMiddleware,
    urls: &[Url],
    config: &ScraperConfig,
) -> Result<Vec<ScrapedContent>>
Scrape multiple URLs with concurrency control
Uses buffer_unordered to cap the number of in-flight requests, preventing:
- File descriptor exhaustion
- Disk thrashing on systems with mechanical drives
- Triggering anti-bot detection (unbounded request bursts resemble DDoS patterns)
Following config-externalize: concurrency is configurable via ScraperConfig.
Following async-concurrency-limit: uses buffer_unordered for concurrency control.
§Arguments
client - HTTP client with retry middleware
urls - URLs to scrape
config - Scraper configuration
§Returns
Vec<ScrapedContent> - All successfully scraped content
§Note
Failed URLs are logged but don’t stop the entire batch.
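The buffer_unordered pattern described above might be implemented along these lines. This is a sketch, not the crate's actual code: the config field name (max_concurrent_requests), the per-URL scrape_url helper, and the use of the tracing crate for logging are all assumptions.

```rust
use futures::stream::{self, StreamExt};

/// Hypothetical sketch of the concurrency-limited batch scrape.
/// `scrape_url` and `config.max_concurrent_requests` are assumed names.
pub async fn scrape_multiple_with_limit(
    client: &ClientWithMiddleware,
    urls: &[Url],
    config: &ScraperConfig,
) -> Result<Vec<ScrapedContent>> {
    let results: Vec<ScrapedContent> = stream::iter(urls)
        // Create one future per URL...
        .map(|url| scrape_url(client, url))
        // ...but poll at most `max_concurrent_requests` of them at once.
        .buffer_unordered(config.max_concurrent_requests)
        // Keep successes; log and drop failures so one bad URL
        // does not abort the whole batch.
        .filter_map(|res| async {
            match res {
                Ok(content) => Some(content),
                Err(e) => {
                    tracing::warn!("scrape failed: {e}");
                    None
                }
            }
        })
        .collect()
        .await;
    Ok(results)
}
```

Note that buffer_unordered yields results in completion order rather than submission order, so under this sketch the returned vector is not guaranteed to match the input URL order.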