
Module builder


Builder API for assembling a Crawler.

CrawlerBuilder is where runtime composition happens: concurrency limits, downloader selection, middleware, item pipelines, item limits, logging, and optional checkpointing are all configured here before build() produces a ready-to-run Crawler.

§Example

use spider_core::CrawlerBuilder;
use spider_middleware::rate_limit::RateLimitMiddleware;
use spider_pipeline::console::ConsolePipeline;
use spider_util::error::SpiderError;

// `MySpider` is assumed to be a user-defined type implementing the crate's
// Spider trait; its definition is not shown in this example.
async fn setup_crawler() -> Result<(), SpiderError> {
    let crawler = CrawlerBuilder::new(MySpider)
        .max_concurrent_downloads(10)
        .max_parser_workers(4)
        .add_middleware(RateLimitMiddleware::default())
        .add_pipeline(ConsolePipeline::new())
        .with_checkpoint_path("./crawl.checkpoint")
        .build()
        .await?;

    crawler.start_crawl().await
}

Structs§

CrawlerBuilder
A fluent builder for constructing Crawler instances.
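To illustrate the fluent pattern the example above relies on, here is a minimal std-only sketch of how a builder like this typically accumulates settings and validates them in build(). The field names mirror the methods shown above, but the defaults and the validation rule are assumptions for this sketch, not the crate's actual behavior:

```rust
/// The finished product: an immutable configuration snapshot.
#[derive(Debug)]
struct Crawler {
    max_concurrent_downloads: usize,
    max_parser_workers: usize,
    checkpoint_path: Option<String>,
}

/// Accumulates settings; each method consumes and returns `self`,
/// which is what makes the chained (fluent) call style possible.
struct CrawlerBuilder {
    max_concurrent_downloads: usize,
    max_parser_workers: usize,
    checkpoint_path: Option<String>,
}

impl CrawlerBuilder {
    fn new() -> Self {
        // Defaults here are illustrative, not the crate's real defaults.
        Self {
            max_concurrent_downloads: 8,
            max_parser_workers: 2,
            checkpoint_path: None,
        }
    }

    fn max_concurrent_downloads(mut self, n: usize) -> Self {
        self.max_concurrent_downloads = n;
        self
    }

    fn max_parser_workers(mut self, n: usize) -> Self {
        self.max_parser_workers = n;
        self
    }

    fn with_checkpoint_path(mut self, path: &str) -> Self {
        self.checkpoint_path = Some(path.to_string());
        self
    }

    /// Validation happens once, at the end of the chain.
    fn build(self) -> Result<Crawler, String> {
        if self.max_concurrent_downloads == 0 {
            return Err("max_concurrent_downloads must be nonzero".into());
        }
        Ok(Crawler {
            max_concurrent_downloads: self.max_concurrent_downloads,
            max_parser_workers: self.max_parser_workers,
            checkpoint_path: self.checkpoint_path,
        })
    }
}

fn main() {
    let crawler = CrawlerBuilder::new()
        .max_concurrent_downloads(10)
        .max_parser_workers(4)
        .with_checkpoint_path("./crawl.checkpoint")
        .build()
        .expect("valid configuration");
    println!(
        "{} {} {:?}",
        crawler.max_concurrent_downloads,
        crawler.max_parser_workers,
        crawler.checkpoint_path
    );
}
```

Consuming `self` (rather than taking `&mut self`) keeps each intermediate state movable and makes an unfinished builder unusable after build(), which is why builders of this shape end the chain with a single fallible build() call.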