pub enum StartRequests<'a> {
    Urls(Vec<&'a str>),
    Iter(Box<dyn Iterator<Item = Result<Request, SpiderError>> + Send + 'a>),
    File(&'a str),
}
Core runtime types and traits used to define and run a crawl.
Initial request source returned by Spider::start_requests.
Use StartRequests::Urls for simple static seeds, StartRequests::Iter
when you need to construct full Request values or generate seeds
lazily, and StartRequests::File when you want to keep large seed lists
outside compiled code.
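The choice between the three variants can be sketched as follows. This is a minimal illustration, not the crate's own code: `Request`, `SpiderError`, and the local `StartRequests` mirror are stand-in definitions so the block compiles on its own, and the helper names (`static_seeds`, `paged_seeds`, `file_seeds`) and the example.com URLs are hypothetical.

```rust
// Stand-in types (assumptions): the real crate defines its own
// `Request` and `SpiderError` with more fields and variants.
#[derive(Debug, PartialEq)]
pub struct Request {
    pub url: String,
}

#[derive(Debug, PartialEq)]
pub enum SpiderError {
    ConfigurationError(String),
}

// Local mirror of the enum shape declared on this page.
pub enum StartRequests<'a> {
    Urls(Vec<&'a str>),
    Iter(Box<dyn Iterator<Item = Result<Request, SpiderError>> + Send + 'a>),
    File(&'a str),
}

/// Static seeds: a short, fixed list known at compile time.
pub fn static_seeds() -> StartRequests<'static> {
    StartRequests::Urls(vec!["https://example.com/", "https://example.com/about"])
}

/// Lazy seeds: each request is built only when the iterator is advanced.
pub fn paged_seeds(pages: u32) -> StartRequests<'static> {
    let iter = (1..=pages).map(|n| -> Result<Request, SpiderError> {
        Ok(Request {
            url: format!("https://example.com/list?page={n}"),
        })
    });
    StartRequests::Iter(Box::new(iter))
}

/// External seeds: the list lives in a file rather than in the binary.
pub fn file_seeds(path: &str) -> StartRequests<'_> {
    StartRequests::File(path)
}
```

In a real spider, one of these values would presumably be what `Spider::start_requests` returns; the exact trait signature is not shown on this page, so it is left out of the sketch.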
Variants
Urls(Vec<&'a str>)
Fixed list of seed URLs.
Iter(Box<dyn Iterator<Item = Result<Request, SpiderError>> + Send + 'a>)
Direct request iterator supplied by the spider.
File(&'a str)
Path to a plain-text seed file (one URL per line).
Implementations
impl<'a> StartRequests<'a>
pub fn file(path: &'a str) -> StartRequests<'a>
Creates a file-based source from a path string.
The file is expected to contain one URL per line. Empty lines and lines
starting with # are ignored.
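The seed-file rules above can be illustrated with a small standalone filter. This is a sketch of the described format, not the crate's implementation: the function name `seed_lines` is hypothetical, and trimming surrounding whitespace and counting lines from 1 are assumptions the page does not confirm.

```rust
/// Returns `(line_number, url)` pairs for the lines that count as seeds.
///
/// Assumptions: line numbers are 1-based, and surrounding whitespace is
/// trimmed before the empty-line and `#` checks are applied.
pub fn seed_lines(contents: &str) -> Vec<(usize, &str)> {
    contents
        .lines()
        .enumerate()
        // Convert the 0-based index into a 1-based line number.
        .map(|(index, line)| (index + 1, line.trim()))
        // Skip blank lines and `#` comment lines.
        .filter(|(_, line)| !line.is_empty() && !line.starts_with('#'))
        .collect()
}
```

For a file containing a comment line, a URL, a blank line, and a second URL, this keeps only the two URLs, paired with line numbers 2 and 4.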
pub fn into_iter(
    self,
) -> Result<Box<dyn Iterator<Item = Result<Request, SpiderError>> + Send + 'a>, SpiderError>
Resolves this source into a concrete request iterator.
URL strings are parsed lazily, as the iterator is consumed. Invalid file
entries become SpiderError::ConfigurationError items that preserve the
original line number.
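The return type has two layers of `Result`: the outer one covers failure to build the iterator at all, and the inner one covers individual bad entries. The sketch below shows one way a caller might handle both. It does not use the crate: `Request` and `SpiderError` are stand-in types, `resolve_file` is a hypothetical substitute for `into_iter` on a file source, `load_seeds` is a hypothetical caller, the prefix-based URL check is a placeholder for real parsing, and the error message wording is invented.

```rust
use std::fs::File;
use std::io::{BufRead, BufReader};

// Stand-in types (assumptions), shaped like the ones named on this page.
#[derive(Debug, PartialEq)]
pub struct Request {
    pub url: String,
}

#[derive(Debug, PartialEq)]
pub enum SpiderError {
    ConfigurationError(String),
}

pub type RequestIter = Box<dyn Iterator<Item = Result<Request, SpiderError>> + Send>;

/// Hypothetical stand-in for resolving a file source into an iterator.
/// Outer `Err`: the file cannot be opened. Inner `Err`: one bad line.
pub fn resolve_file(path: &str) -> Result<RequestIter, SpiderError> {
    let file = File::open(path)
        .map_err(|e| SpiderError::ConfigurationError(format!("cannot open {path}: {e}")))?;
    let iter = BufReader::new(file).lines().enumerate().filter_map(
        |(index, line)| -> Option<Result<Request, SpiderError>> {
            let line_no = index + 1; // 1-based, kept for error messages
            let line = match line {
                Ok(text) => text,
                Err(e) => {
                    let msg = format!("line {line_no}: {e}");
                    return Some(Err(SpiderError::ConfigurationError(msg)));
                }
            };
            let url = line.trim();
            if url.is_empty() || url.starts_with('#') {
                return None; // blank lines and comments produce no item
            }
            // Placeholder validity check; a real parser would be stricter.
            if url.starts_with("http://") || url.starts_with("https://") {
                Some(Ok(Request { url: url.to_string() }))
            } else {
                let msg = format!("line {line_no}: invalid URL `{url}`");
                Some(Err(SpiderError::ConfigurationError(msg)))
            }
        },
    );
    Ok(Box::new(iter))
}

/// Collects valid requests and per-line errors separately, so that one
/// bad entry does not abort the whole seed list.
pub fn load_seeds(path: &str) -> Result<(Vec<Request>, Vec<SpiderError>), SpiderError> {
    let mut requests = Vec::new();
    let mut rejected = Vec::new();
    for item in resolve_file(path)? {
        match item {
            Ok(request) => requests.push(request),
            Err(error) => rejected.push(error),
        }
    }
    Ok((requests, rejected))
}
```

Because bad entries arrive as items rather than as the outer error, the caller decides whether to skip them, log them, or stop at the first one.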