spider-core
Core crawling engine for spider-lib: spider trait, crawler runtime, scheduler, builder, state, and stats.
Most users should start with spider-lib. Use spider-core directly when you want lower-level control over runtime composition.
Installation
[dependencies]
spider-core = "2.0.0"
Main Components
- Spider: trait for crawl logic.
- Crawler: runtime engine that drives requests and parsing.
- CrawlerBuilder: runtime configuration and composition.
- Scheduler: request queueing and dedup behavior.
- CrawlerState: shared runtime state.
- StatCollector: runtime statistics.
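To make "request queueing and dedup behavior" concrete, here is a minimal standalone sketch of the idea using only the Rust standard library. It is not spider-core's Scheduler: the names SketchRequest and SketchScheduler, the FIFO ordering, and the choice to dedup on the raw URL string are all assumptions made for illustration.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical stand-in for a crawl request; NOT a spider-core type.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct SketchRequest {
    url: String,
}

/// Hypothetical FIFO scheduler that skips URLs it has already seen.
/// Assumption: dedup is keyed on the exact URL string.
#[derive(Default)]
struct SketchScheduler {
    queue: VecDeque<SketchRequest>,
    seen: HashSet<String>,
}

impl SketchScheduler {
    /// Queue a URL for crawling. Returns false if it was seen before.
    fn enqueue(&mut self, url: &str) -> bool {
        // HashSet::insert returns false when the value was already present.
        if !self.seen.insert(url.to_string()) {
            return false;
        }
        self.queue.push_back(SketchRequest { url: url.to_string() });
        true
    }

    /// Pop the oldest pending request, if any.
    fn next(&mut self) -> Option<SketchRequest> {
        self.queue.pop_front()
    }
}
```

A real scheduler would likely normalize URLs before comparing them and would need to be shareable across concurrent tasks; both are left out here to keep the sketch short.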
Minimal Usage
use ;
use ;
;
;
Feature Flags
- core (default)
- live-stats: enables in-place terminal stat updates.
- checkpoint: enables checkpoint/resume support.
- cookie-store: enables cookie_store integration.
[dependencies]
spider-core = { version = "2.0.0", features = ["checkpoint"] }
Custom Extension Guides
For extension points built around crawler composition, see:
- Custom downloader guide: spider-downloader
- Custom middleware guide: spider-middleware
- Custom pipeline guide: spider-pipeline
Related Crates
License
MIT. See LICENSE.