Module for tracking the operational state of the crawler.
This module defines the CrawlerState struct, which provides a centralized
mechanism for monitoring the real-time activity of the web crawler. It
uses atomic counters to track:
- The number of HTTP requests currently in flight (being downloaded).
- The number of responses actively being parsed by spiders.
- The number of scraped items currently being processed by pipelines.
This state information is crucial for determining when the crawler is idle and can be gracefully shut down, or when to trigger checkpointing.
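The counter layout described above can be sketched roughly as follows. This is a minimal illustration, not the crate's actual definition: the type name `CrawlerStateSketch`, its field names, and the `is_idle` helper are all hypothetical, and the real CrawlerState may differ in fields and memory ordering.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for CrawlerState; all field names are assumptions.
#[derive(Debug, Default)]
pub struct CrawlerStateSketch {
    /// HTTP requests currently in flight (being downloaded).
    pub in_flight_requests: AtomicUsize,
    /// Responses actively being parsed by spiders.
    pub parsing_responses: AtomicUsize,
    /// Scraped items currently being processed by pipelines.
    pub processing_items: AtomicUsize,
}

impl CrawlerStateSketch {
    /// Creates a state with all counters at zero, wrapped in an Arc
    /// so it can be shared between actors.
    pub fn new() -> Arc<Self> {
        Arc::new(Self::default())
    }

    /// Assumed idle check: true when every counter reads zero.
    pub fn is_idle(&self) -> bool {
        self.in_flight_requests.load(Ordering::SeqCst) == 0
            && self.parsing_responses.load(Ordering::SeqCst) == 0
            && self.processing_items.load(Ordering::SeqCst) == 0
    }
}
```

In this sketch an actor would call `fetch_add(1, Ordering::SeqCst)` on the relevant counter when it starts a unit of work and `fetch_sub(1, Ordering::SeqCst)` when it finishes. Note that the three loads in `is_idle` are not a single atomic snapshot, so a caller deciding on shutdown or checkpointing would likely need to re-check or coordinate with the scheduler.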
Structs
- CrawlerState - Represents the shared state of the crawler’s various actors.