Crate sws_crawler


Web crawler with pluggable scraping logic.

The main function crawl_site crawls and scrapes web pages. It is configured through a CrawlerConfig and a Scrapable implementation; the latter defines the Seed used for crawling as well as the scraping logic. Note that robots.txt seeds are supported and exposed through texting_robots::Robot in the CrawlingContext and ScrapingContext.
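
The description above does not include an end-to-end example, so here is a minimal sketch of how these pieces might fit together. The Scrapable method names, the Seed::Urls variant, the crawl_site signature, and the tokio runtime are assumptions inferred from this description rather than quotes from the crate's API; check the item documentation for the actual signatures.

```rust
use anyhow::Result;
use sws_crawler::{crawl_site, CrawlerConfig, CrawlingContext, Scrapable, ScrapingContext, Seed};

// Hypothetical scraper: the method names and signatures below are assumed
// from the crate description (a Seed to start from, per-page scraping logic)
// and may differ from the real `Scrapable` trait.
struct PagePrinter;

impl Scrapable for PagePrinter {
    type Config = ();

    fn new(_config: &Self::Config) -> Result<Self> {
        Ok(PagePrinter)
    }

    // The crawl starts from the seed returned here; robots.txt and sitemap
    // seeds are also supported according to the description.
    fn seed(&self) -> Seed {
        // `Seed::Urls` is an assumed variant name.
        Seed::Urls(vec!["https://example.org/".to_string()])
    }

    // Decide whether a discovered URL should be crawled.
    fn accept(&self, _url: &str, _ctx: CrawlingContext) -> bool {
        true
    }

    // Receive a downloaded page body and apply the scraping logic.
    fn scrap(&mut self, page: String, _ctx: ScrapingContext) -> Result<()> {
        println!("scraped {} bytes", page.len());
        Ok(())
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    // Assumes CrawlerConfig provides sensible defaults and that crawl_site
    // takes the crawler config plus the scraper's own config.
    let crawler_conf = CrawlerConfig::default();
    crawl_site::<PagePrinter>(&crawler_conf, &()).await
}
```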

Re-exports

pub use anyhow;
pub use texting_robots;

Structs

CountedTx
CrawlerConfig
CrawlingContext
ScrapingContext

Enums

OnError
PageLocation
Seed
Sitemap
Throttle

Traits

Scrapable

Functions

crawl_site