pub struct Crawler<T: Scraper> { /* fields omitted */ }
The crawler that is responsible for driving the requests to completion and
providing the crawl responses to the `Scraper`.
Implementations
Creates a new crawler following the given config.
Whether this crawler respects the domain's robots.txt rules.
Whether non-2xx responses are treated as failures and are not scraped.
Sends a crawling request whose HTML response and context are returned to the scraper again.
Submits a complete crawling job that is driven to completion and returned directly once finished.
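The difference between the two can be sketched as follows; this is an illustrative sketch from inside a `Scraper` implementation, and the exact closure signature and return shape are assumptions rather than the crate's verified API:

```rust
// Hedged sketch: assumes `crawl` takes a closure over an HTTP client and
// routes the resulting response (plus optional state) back into the
// scraper, while `complete` resolves the whole job and returns its result
// directly instead of re-entering the scraper.
crawler.crawl(|client| async move {
    let resp = client.get("https://example.com/page").send().await?;
    // The html response and context are handed back to the scraper.
    Ok((Some(resp), None))
});
```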
This queues in a GET request for the url, without any state attached.
This queues in a GET request for the url, with state attached.
This queues in a whole request with no state attached.
This queues in a whole request with a state attached.
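The four queueing helpers differ only in whether a full request is supplied and whether state travels with it. A hedged sketch of their use; the method names `visit_with_state`/`request_with_state`, the request-builder style, and `MyState` are assumptions for illustration:

```rust
// Plain GET for a url, no state attached:
crawler.visit("https://example.com/start");

// GET for a url, with state handed back alongside the response:
crawler.visit_with_state("https://example.com/start", MyState::Seed);

// A whole pre-built request, no state attached (builder style assumed):
crawler.request(client.get("https://example.com/api").header("Accept", "text/html"));

// A whole pre-built request, with state attached:
crawler.request_with_state(client.get("https://example.com/api"), MyState::Api);
```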