Struct url_crawler::Crawler

pub struct Crawler { /* fields omitted */ }

A configurable parallel web crawler.

Crawling does not occur until this type is consumed by the crawl method.

Methods

impl Crawler

Initializes a new crawler with a default thread count of 4.

Sets flags for configuring the crawler.

Specifies the number of fetcher threads to use.

Notes

  • If the input is 0, 1 thread will be used.
  • If this method is not called, the default thread count is 4.
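
The two notes above can be illustrated with a minimal sketch. `ThreadSetting` is a hypothetical stand-in written for this example, not a type from the crate, and the clamping shown is only one plausible way to get the documented behavior.

```rust
/// Hypothetical stand-in for the crawler's thread setting (not part of url_crawler).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ThreadSetting(usize);

impl Default for ThreadSetting {
    /// Mirrors the documented default of 4 threads.
    fn default() -> Self {
        ThreadSetting(4)
    }
}

impl ThreadSetting {
    /// Clamps a requested count of 0 up to 1, as the notes describe.
    pub fn new(requested: usize) -> Self {
        ThreadSetting(requested.max(1))
    }

    /// Returns the number of threads that would actually be used.
    pub fn get(self) -> usize {
        self.0
    }
}
```

Under these assumptions, `ThreadSetting::new(0).get()` yields 1, and `ThreadSetting::default().get()` yields 4.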

Allows the caller to handle errors.

Notes

Returning false will stop the crawler.
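
As a rough illustration of this contract, the sketch below runs a list of simulated fetch results through a caller-supplied handler. The function name `run_with_error_handler`, the `String` error type, and the sequential loop are all assumptions made for the example; the crate's own error type and threading model are not shown here.

```rust
/// Walks simulated fetch results, calling `on_error` for each failure.
/// Stops as soon as the handler returns false and returns the items
/// that were fetched successfully up to that point.
pub fn run_with_error_handler<F>(
    results: Vec<Result<String, String>>,
    on_error: F,
) -> Vec<String>
where
    F: Fn(&str) -> bool,
{
    let mut fetched = Vec::new();
    for result in results {
        match result {
            Ok(item) => fetched.push(item),
            Err(err) => {
                // A `false` return from the handler stops the whole run.
                if !on_error(&err) {
                    break;
                }
            }
        }
    }
    fetched
}
```

A handler that always returns true lets the run continue past every error, while one that returns false ends it at the first failure.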

Enables filtering items based on their filename.

Notes

Returning false will prevent the item from being fetched.

Enables filtering items based on their filename and requested headers.

Notes

Returning false will prevent the item from being scraped / returned.
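
The difference between the two filter stages can be sketched as follows. Everything here is hypothetical: `crawl_once`, the `fetch_headers` callback, and the use of a `HashMap` for headers are stand-ins chosen for the example and do not reflect the crate's actual types.

```rust
use std::collections::HashMap;

/// Applies a filename-only filter before fetching and a filename-plus-headers
/// filter afterwards. Returns the names of the items that pass both stages.
pub fn crawl_once<P, H, Q>(
    filenames: &[&str],
    pre_fetch: P,
    fetch_headers: H,
    post_fetch: Q,
) -> Vec<String>
where
    P: Fn(&str) -> bool,
    H: Fn(&str) -> HashMap<String, String>,
    Q: Fn(&str, &HashMap<String, String>) -> bool,
{
    let mut returned = Vec::new();
    for &name in filenames {
        // Stage 1: a `false` here means the item is never fetched.
        if !pre_fetch(name) {
            continue;
        }
        let headers = fetch_headers(name);
        // Stage 2: a `false` here means the fetched item is not returned.
        if post_fetch(name, &headers) {
            returned.push(name.to_string());
        }
    }
    returned
}
```

The first filter saves a request for items that can be rejected by name alone; the second can use information that only becomes available after the headers have been retrieved.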

Starts the crawl, returning an iterator of discovered files.

The crawler will continue to crawl in background threads even while the iterator is not being pulled from.
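
One common way to get this behavior is to have worker threads push results into an unbounded channel that the iterator drains. The sketch below shows that pattern using only the standard library; `ResultIter` and `start_crawl` are hypothetical names, the "fetch" is simulated with a formatted string, and nothing here is claimed to match the crate's internals.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

/// Hypothetical iterator over results (a stand-in, not the crate's `CrawlIter`).
pub struct ResultIter {
    rx: Receiver<String>,
}

impl Iterator for ResultIter {
    type Item = String;

    /// Blocks until a worker sends a result; ends once all workers have finished.
    fn next(&mut self) -> Option<String> {
        self.rx.recv().ok()
    }
}

/// Spawns worker threads that start producing results immediately,
/// whether or not the returned iterator is being consumed.
pub fn start_crawl(urls: Vec<String>, threads: usize) -> ResultIter {
    let (tx, rx) = channel();
    let threads = threads.max(1);

    // Distribute the work round-robin across the workers.
    let mut buckets: Vec<Vec<String>> = vec![Vec::new(); threads];
    for (i, url) in urls.into_iter().enumerate() {
        buckets[i % threads].push(url);
    }

    for bucket in buckets {
        let tx = tx.clone();
        thread::spawn(move || {
            for url in bucket {
                // A real crawler would perform the request here.
                if tx.send(format!("fetched {}", url)).is_err() {
                    break; // The iterator was dropped; stop working.
                }
            }
        });
    }

    // Drop the original sender so the iterator ends when the workers do.
    drop(tx);
    ResultIter { rx }
}
```

Because the channel is unbounded, the workers never wait for the consumer, which matches the note that crawling proceeds in the background. The order of results across workers is not deterministic.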

Auto Trait Implementations

impl Send for Crawler

impl Sync for Crawler