
Module spider


Trait for defining custom web spiders in the spider-lib framework.

This module provides the Spider trait, which serves as the blueprint for creating custom web scrapers. A spider defines how a specific website (or a group of websites) should be crawled and how data should be extracted.

Implementors of the Spider trait must:

  • Specify the Item type (the data structure for scraped data).
  • Provide a list of start_urls or start_requests to begin the crawl.
  • Implement the parse method, which takes a Response and returns a ParseOutput containing new Requests to follow and ScrapedItems.
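
The three requirements above can be sketched as follows. This is a minimal, self-contained illustration and not the spider-lib API: the `Request`, `Response`, `ParseOutput`, and `Spider` definitions below are hypothetical stand-ins written for this example, and the real crate's signatures (async methods, error types, field names) may differ. `Link` and `LinkSpider` are likewise invented for illustration.

```rust
// Hypothetical stand-ins for the framework types. These are assumptions
// made for this sketch; the real spider-lib definitions may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Request {
    pub url: String,
}

pub struct Response {
    pub url: String,
    pub body: String,
}

pub struct ParseOutput<I> {
    pub requests: Vec<Request>,
    pub items: Vec<I>,
}

// Assumed shape of the trait, mirroring the three requirements listed above.
pub trait Spider {
    /// The data structure produced for each scraped record.
    type Item;

    /// URLs the crawl begins from.
    fn start_urls(&self) -> Vec<String>;

    /// Default: wrap each start URL in a plain request.
    fn start_requests(&self) -> Vec<Request> {
        self.start_urls()
            .into_iter()
            .map(|url| Request { url })
            .collect()
    }

    /// Turn one response into follow-up requests and scraped items.
    fn parse(&self, response: &Response) -> ParseOutput<Self::Item>;
}

// Example item type (hypothetical): one record per link found on a page.
#[derive(Debug, PartialEq)]
pub struct Link {
    pub href: String,
}

// Example spider (hypothetical): collects every href and follows absolute ones.
pub struct LinkSpider;

impl Spider for LinkSpider {
    type Item = Link;

    fn start_urls(&self) -> Vec<String> {
        vec!["https://example.com/".to_string()]
    }

    fn parse(&self, response: &Response) -> ParseOutput<Link> {
        let mut items = Vec::new();
        let mut requests = Vec::new();
        // Naive attribute scan, enough for a sketch; a real spider would
        // use an HTML parser instead of string splitting.
        for part in response.body.split("href=\"").skip(1) {
            if let Some(end) = part.find('"') {
                let href = &part[..end];
                items.push(Link { href: href.to_string() });
                if href.starts_with("http") {
                    requests.push(Request { url: href.to_string() });
                }
            }
        }
        ParseOutput { requests, items }
    }
}
```

In this sketch the associated `Item` type keeps each spider's output strongly typed, and the default `start_requests` means an implementor only has to override it when the initial requests need more than a bare URL.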

Traits

Spider
Defines the contract for a web spider.