Module request

Request and callback types for the spider crawl pipeline.

This module defines the core data types that flow through the crawler:

  • Request – a URL to fetch, together with scheduling priority, session routing, deduplication fingerprint, retry state, and an optional callback.
  • Callback – a boxed closure that turns a fetched Response into zero or more SpiderOutput values.
  • SpiderOutput – the two things a callback can produce: a scraped data item (Item) or a follow-up request (FollowRequest).
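As a rough illustration of how these three types could fit together, here is a minimal sketch. It is not the crate's actual definition: the `Response` struct, the use of `String` as the item type, the `Send + Sync` bounds, the field names, and the `line_callback` helper are all assumptions made for the example, and the scheduling, session, fingerprint, and retry fields are left out for brevity.

```rust
// Hypothetical stand-in for the crate's response type (assumed fields).
pub struct Response {
    pub url: String,
    pub body: String,
}

// Assumed item representation; the real crate may use a richer type.
pub type Item = String;

// The two things a callback can produce.
pub enum SpiderOutput {
    // A scraped data item.
    Item(Item),
    // A follow-up request to enqueue.
    FollowRequest(Request),
}

// A boxed closure that turns a fetched response into zero or more outputs.
// The exact signature and trait bounds here are assumptions.
pub type Callback = Box<dyn Fn(&Response) -> Vec<SpiderOutput> + Send + Sync>;

// Reduced request: only the URL, priority, and optional callback are modelled.
pub struct Request {
    pub url: String,
    pub priority: i32,
    pub callback: Option<Callback>,
}

// Hypothetical helper: builds a callback that emits one item per non-empty
// line of the body, followed by a single follow-up request.
pub fn line_callback() -> Callback {
    Box::new(|resp: &Response| {
        let mut out: Vec<SpiderOutput> = resp
            .body
            .lines()
            .filter(|line| !line.trim().is_empty())
            .map(|line| SpiderOutput::Item(line.trim().to_string()))
            .collect();
        out.push(SpiderOutput::FollowRequest(Request {
            url: format!("{}/next", resp.url),
            priority: 0,
            callback: None,
        }));
        out
    })
}
```

Because `SpiderOutput::FollowRequest` carries a full `Request`, a callback can attach a different callback to each follow-up, which is one common way to chain page-specific parsing steps.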

Requests use a builder pattern (Request::new(url).with_priority(10)) and are ordered by priority so the Scheduler always processes the most important URLs first.
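The builder and the priority ordering can be sketched in a standalone form as follows. Only `Request::new` and `with_priority` are named in the description above; everything else is an assumption for illustration: the default priority of 0, the rule that a larger number means more important, the tie-break on URL, and the use of `std::collections::BinaryHeap` in place of the real Scheduler.

```rust
use std::cmp::Ordering;
use std::collections::BinaryHeap;

// Reduced, hypothetical Request holding only the fields needed for ordering.
#[derive(Debug, PartialEq, Eq)]
pub struct Request {
    pub url: String,
    pub priority: i32,
}

impl Request {
    // Start a request with an assumed default priority of 0.
    pub fn new(url: impl Into<String>) -> Self {
        Request { url: url.into(), priority: 0 }
    }

    // Builder-style setter: consumes the request and returns it updated.
    pub fn with_priority(mut self, priority: i32) -> Self {
        self.priority = priority;
        self
    }
}

// Order by priority first; break ties on the URL so that the ordering
// stays consistent with the derived `Eq`.
impl Ord for Request {
    fn cmp(&self, other: &Self) -> Ordering {
        self.priority
            .cmp(&other.priority)
            .then_with(|| self.url.cmp(&other.url))
    }
}

impl PartialOrd for Request {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}

// Stand-in for a scheduler: a max-heap pops the greatest request first,
// which under the assumption above is the highest-priority one.
pub fn drain_in_priority_order(requests: Vec<Request>) -> Vec<String> {
    let mut heap: BinaryHeap<Request> = requests.into_iter().collect();
    let mut urls = Vec::new();
    while let Some(req) = heap.pop() {
        urls.push(req.url);
    }
    urls
}
```

With this ordering, a request built as `Request::new(url).with_priority(10)` is popped before one left at the default priority, regardless of insertion order.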

Structs

Request
A crawl request with URL, priority, metadata, and optional callback.

Enums

SpiderOutput
The result of processing a response: either a scraped data item or a follow-up request to enqueue.

Type Aliases

Callback
A boxed closure that processes an HTTP response and returns spider outputs.