Struct robotstxt_with_cache::matcher::RobotsMatcher
pub struct RobotsMatcher<S: RobotsMatchStrategy> { /* fields omitted */ }
RobotsMatcher - matches robots.txt against URLs.
The matcher uses a default match strategy for Allow/Disallow patterns, which follows the way Google's crawler officially matches robots.txt. It is also possible to provide a custom match strategy.
The entry point for the user is one of the allowed_by_robots methods, which return whether a URL is allowed according to the robots.txt body and the crawl agent. A RobotsMatcher can be reused across URLs and robots.txt files, but it is not thread-safe.
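For orientation, Google's documented pattern semantics treat `*` as matching any run of characters and a trailing `$` as anchoring the pattern to the end of the path; patterns otherwise match as prefixes. The sketch below is a standalone illustration of those semantics under that assumption, not this crate's implementation. The function name `pattern_matches` is hypothetical.

```rust
/// Hypothetical helper (not part of this crate): returns true if `path`
/// matches the robots.txt `pattern`. `*` matches any sequence of bytes and
/// a trailing `$` anchors the match to the end of the path; otherwise the
/// pattern only has to match a prefix of the path.
fn pattern_matches(path: &str, pattern: &str) -> bool {
    let path = path.as_bytes();
    let pattern = pattern.as_bytes();
    // Offsets into `path` reachable after consuming the pattern bytes seen
    // so far. The vector stays sorted in ascending order and is never empty.
    let mut positions: Vec<usize> = vec![0];
    for (i, &p) in pattern.iter().enumerate() {
        if p == b'$' && i + 1 == pattern.len() {
            // End anchor: the furthest match must have consumed the whole path.
            return positions.last().copied() == Some(path.len());
        }
        if p == b'*' {
            // Wildcard: every offset from the smallest reachable one onwards.
            positions = (positions[0]..=path.len()).collect();
        } else {
            // Literal byte: keep only offsets where the path has that byte.
            positions = positions
                .iter()
                .filter(|&&pos| pos < path.len() && path[pos] == p)
                .map(|&pos| pos + 1)
                .collect();
            if positions.is_empty() {
                return false;
            }
        }
    }
    true
}
```

With this sketch, `pattern_matches("/fish/salmon.html", "/fish")` holds because of prefix matching, while `pattern_matches("/filename.php?x=1", "/*.php$")` does not because of the end anchor.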
Implementations
pub fn allowed_by_robots(
&mut self,
robots_body: &str,
user_agents: Vec<&str>,
url: &str
) -> bool where
Self: RobotsParseHandler,
Returns true if ‘url’ is allowed to be fetched by any member of the ‘user_agents’ vector. ‘url’ must be %-encoded according to RFC 3986.
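Because the URL is expected to be %-encoded already, a caller holding a raw path may need to encode it first. The following is a minimal sketch using only the standard library; `percent_encode_path` is a hypothetical helper, not part of this crate, and it assumes its input has not been encoded yet (an existing `%` would be encoded again).

```rust
/// Hypothetical helper (not part of this crate): percent-encodes a raw,
/// not-yet-encoded URL path. Unreserved characters from RFC 3986
/// (ALPHA / DIGIT / "-" / "." / "_" / "~") and the "/" separator are kept;
/// every other byte becomes %XX with uppercase hex digits.
fn percent_encode_path(raw_path: &str) -> String {
    let mut out = String::with_capacity(raw_path.len());
    for &byte in raw_path.as_bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' | b'/' => {
                // Safe to copy through unchanged.
                out.push(byte as char)
            }
            // Everything else, including each byte of a multi-byte UTF-8
            // sequence, is written as a %XX escape.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}
```

The encoded string, combined with the scheme and host, is then what would be passed as the `url` argument.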
pub fn one_agent_allowed_by_robots(
&mut self,
robots_txt: &str,
user_agent: &str,
url: &str
) -> bool where
Self: RobotsParseHandler,
Performs the robots check for ‘url’ when there is only one user agent. ‘url’ must be %-encoded according to RFC 3986.
Verifies that the given user agent is valid to be matched against robots.txt. Valid user agent strings only contain the characters [a-zA-Z_-].
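The character rule stated above can be sketched as a standalone check. This is an illustration only: `is_valid_user_agent` is a hypothetical name, and rejecting the empty string is an assumption of this sketch rather than something the text above states.

```rust
/// Hypothetical standalone check (not this crate's method): a user agent
/// is accepted only if every character is in [a-zA-Z_-].
/// Assumption: the empty string is treated as invalid.
fn is_valid_user_agent(user_agent: &str) -> bool {
    !user_agent.is_empty()
        && user_agent
            .bytes()
            .all(|b| b.is_ascii_alphabetic() || b == b'_' || b == b'-')
}
```

Under this rule a product token such as `Googlebot-Image` passes, while a full header value such as `Googlebot/2.1` does not, because `/`, `.` and digits fall outside the allowed set.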
Returns the line that matched or 0 if none matched.
Trait Implementations
Returns the “default value” for a type.
Any other unrecognized name/value pairs.
Auto Trait Implementations
impl<S> RefUnwindSafe for RobotsMatcher<S> where
S: RefUnwindSafe,
impl<S> Send for RobotsMatcher<S> where
S: Send,
impl<S> Sync for RobotsMatcher<S> where
S: Sync,
impl<S> Unpin for RobotsMatcher<S> where
S: Unpin,
impl<S> UnwindSafe for RobotsMatcher<S> where
S: UnwindSafe,