Request and callback types for the spider crawl pipeline.
This module defines the core data that flows through the crawler:
- Request – a URL to fetch, together with scheduling priority, session routing, deduplication fingerprint, retry state, and an optional callback.
- Callback – a boxed closure that turns a fetched Response into zero or more SpiderOutput values.
- SpiderOutput – the two things a callback can produce: a scraped data item (Item) or a follow-up request (FollowRequest).
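As a rough illustration of how these three types relate, the sketch below uses simplified, hypothetical stand-ins. The field names, the `String` item payload, the `Fn(&Response)` signature, the `Send + Sync` bounds, and the `example_callback` helper are all assumptions made for this example, not the crate's actual definitions.

```rust
// Hypothetical stand-ins for the crate's types; the real definitions may differ.

/// A fetched page, reduced to the two fields this sketch needs (assumed shape).
pub struct Response {
    pub url: String,
    pub body: String,
}

/// A follow-up request, reduced to just a URL (assumed shape).
pub struct Request {
    pub url: String,
}

/// The two things a callback can produce.
pub enum SpiderOutput {
    /// A scraped data item; a plain string stands in for real item data.
    Item(String),
    /// A follow-up request to enqueue.
    FollowRequest(Request),
}

/// A boxed closure from a response to zero or more outputs (assumed signature).
pub type Callback = Box<dyn Fn(&Response) -> Vec<SpiderOutput> + Send + Sync>;

/// Builds a callback that emits one item per response and one follow-up
/// request for every body line of the form `next=<url>`.
pub fn example_callback() -> Callback {
    Box::new(|resp: &Response| {
        let mut out = vec![SpiderOutput::Item(format!("scraped {}", resp.url))];
        for line in resp.body.lines() {
            if let Some(next) = line.strip_prefix("next=") {
                out.push(SpiderOutput::FollowRequest(Request {
                    url: next.to_string(),
                }));
            }
        }
        out
    })
}
```

Returning a `Vec<SpiderOutput>` lets a single callback yield both data and new work from the same page, which is why one enum covers both cases.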
Requests use a builder pattern (Request::new(url).with_priority(10)) and are
ordered by priority so the Scheduler always
processes the most important URLs first.
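The builder and priority ordering can be sketched as follows. This is a minimal stand-in, not the crate's `Request`: it models only a URL and an `i32` priority, assumes that a larger number means a more important request, and uses a hypothetical `drain_by_priority` helper built on `std::collections::BinaryHeap` in place of the real Scheduler.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for `Request`; only URL and priority are modeled.
#[derive(Debug, PartialEq, Eq)]
pub struct Request {
    pub url: String,
    pub priority: i32,
}

impl Request {
    /// Starts a request with a default priority of 0 (assumed default).
    pub fn new(url: &str) -> Self {
        Request { url: url.to_string(), priority: 0 }
    }

    /// Builder-style setter: consumes the request and returns it updated.
    pub fn with_priority(mut self, priority: i32) -> Self {
        self.priority = priority;
        self
    }
}

impl Ord for Request {
    /// Orders by priority first; the URL breaks ties so that `Ord` stays
    /// consistent with the derived `Eq`.
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority
            .cmp(&other.priority)
            .then_with(|| self.url.cmp(&other.url))
    }
}

impl PartialOrd for Request {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

/// Pops requests from a max-heap, so the highest priority comes out first.
pub fn drain_by_priority(requests: Vec<Request>) -> Vec<String> {
    let mut heap: BinaryHeap<Request> = requests.into_iter().collect();
    let mut order = Vec::new();
    while let Some(req) = heap.pop() {
        order.push(req.url);
    }
    order
}
```

Implementing `Ord` on the request itself means any max-heap can serve as the queue without a separate comparator. Under the assumptions above, passing requests with priorities 0, 10, and 5 to `drain_by_priority` yields their URLs in the order 10, 5, 0.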
Structs
- Request
- A crawl request with URL, priority, metadata, and optional callback.
Enums
- SpiderOutput
- The result of processing a response: either a scraped data item or a follow-up request to enqueue.
Type Aliases
- Callback
- A boxed closure that processes an HTTP response and returns spider outputs.