Module for managing crawler checkpoints.
This module defines the data structures (SchedulerCheckpoint, Checkpoint)
and functions for saving and loading the state of a crawler. Checkpoints enable
the crawler to gracefully recover from interruptions or to resume a crawl
at a later time. They capture the state of the scheduler (pending requests,
visited URLs, salvaged requests) and the item pipelines.
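The save/load cycle described above can be sketched with the standard library alone. This is a minimal illustration, not this module's implementation: the `SchedulerSnapshot` type, its field names, the line-based file format, and the `save_snapshot`/`load_snapshot` functions are all assumptions made for the example, and it covers only the scheduler state, not the item pipelines.

```rust
use std::collections::{HashSet, VecDeque};
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a scheduler snapshot; the field names are
/// assumptions and may not match the module's real structs.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct SchedulerSnapshot {
    pub pending: VecDeque<String>,
    pub visited: HashSet<String>,
    pub salvaged: Vec<String>,
}

impl SchedulerSnapshot {
    /// Encode as one tagged line per URL: "P <url>", "V <url>", "S <url>".
    /// Assumes URLs contain no newlines.
    pub fn encode(&self) -> String {
        let mut out = String::new();
        for url in &self.pending {
            out.push_str(&format!("P {url}\n"));
        }
        // Sort the visited set so the output is deterministic.
        let mut visited: Vec<&String> = self.visited.iter().collect();
        visited.sort();
        for url in visited {
            out.push_str(&format!("V {url}\n"));
        }
        for url in &self.salvaged {
            out.push_str(&format!("S {url}\n"));
        }
        out
    }

    /// Parse the format produced by `encode`, rejecting unknown tags.
    pub fn decode(text: &str) -> io::Result<Self> {
        let mut snap = Self::default();
        for line in text.lines().filter(|l| !l.is_empty()) {
            match line.split_once(' ') {
                Some(("P", url)) => snap.pending.push_back(url.to_string()),
                Some(("V", url)) => {
                    snap.visited.insert(url.to_string());
                }
                Some(("S", url)) => snap.salvaged.push(url.to_string()),
                _ => {
                    return Err(io::Error::new(
                        io::ErrorKind::InvalidData,
                        format!("bad checkpoint line: {line}"),
                    ))
                }
            }
        }
        Ok(snap)
    }
}

/// Write to a temporary file first, then rename it into place, so an
/// interruption mid-write does not leave a truncated checkpoint behind.
pub fn save_snapshot(path: &Path, snap: &SchedulerSnapshot) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    fs::write(&tmp, snap.encode())?;
    fs::rename(&tmp, path)
}

/// Return `Ok(None)` when no checkpoint exists, i.e. the crawl starts fresh.
pub fn load_snapshot(path: &Path) -> io::Result<Option<SchedulerSnapshot>> {
    match fs::read_to_string(path) {
        Ok(text) => SchedulerSnapshot::decode(&text).map(Some),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

On startup, a crawler built this way would call `load_snapshot` and either resume from the returned state or begin with an empty scheduler; during the crawl it would call `save_snapshot` periodically and on shutdown.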
Structs
- Checkpoint
- A complete checkpoint of the crawler’s state.
- SchedulerCheckpoint
- A snapshot of the scheduler’s state.