pub trait RobotsMatchStrategy: Default {
    // Required methods
    fn match_allow(&self, path: &str, pattern: &str) -> i32;
    fn match_disallow(&self, path: &str, pattern: &str) -> i32;

    // Provided method
    fn matches(path: &str, pattern: &str) -> bool { ... }
}
Create a RobotsMatcher with the default matching strategy.
The default matching strategy is longest-match, as opposed to the first-match strategy specified by the former internet draft. Analysis shows that longest-match, while more restrictive for crawlers, is what webmasters assume when writing directives: when both an Allow and a Disallow rule match, the longest match is the one the user wants. For example, consider a robots.txt file with the following rules:
Allow: /
Disallow: /cgi-bin
It's pretty obvious what the webmaster wants: to allow crawling of every URI except /cgi-bin. However, according to the expired internet standard, crawlers should be allowed to crawl everything with such a rule.
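The difference between the two strategies can be sketched in a few lines. This is a minimal, self-contained illustration and not the crate's implementation: `Rule`, `priority`, `longest_match_allows`, and `first_match_allows` are hypothetical names, patterns are treated as plain prefixes (no `*` or `$`), and ties are resolved in favor of Allow as an assumption.

```rust
/// A single robots.txt directive (hypothetical type for this sketch).
#[derive(Clone, Copy)]
enum Rule<'a> {
    Allow(&'a str),
    Disallow(&'a str),
}

/// Priority of a pattern for a path: the pattern length if the pattern is a
/// prefix of the path, or -1 if it does not match. Wildcards are ignored here.
fn priority(path: &str, pattern: &str) -> i32 {
    if path.starts_with(pattern) {
        pattern.len() as i32
    } else {
        -1
    }
}

/// Longest-match: the matching rule with the longest pattern wins.
/// Ties (including "nothing matched") go to Allow in this sketch.
fn longest_match_allows(path: &str, rules: &[Rule]) -> bool {
    let mut allow = -1;
    let mut disallow = -1;
    for rule in rules {
        match *rule {
            Rule::Allow(p) => allow = allow.max(priority(path, p)),
            Rule::Disallow(p) => disallow = disallow.max(priority(path, p)),
        }
    }
    allow >= disallow
}

/// First-match: the first rule in file order that matches decides.
fn first_match_allows(path: &str, rules: &[Rule]) -> bool {
    for rule in rules {
        match *rule {
            Rule::Allow(p) if priority(path, p) >= 0 => return true,
            Rule::Disallow(p) if priority(path, p) >= 0 => return false,
            _ => {}
        }
    }
    true // no rule matched, so crawling is allowed
}

fn main() {
    let rules = [Rule::Allow("/"), Rule::Disallow("/cgi-bin")];

    // Longest-match does what the webmaster intended.
    assert!(longest_match_allows("/index.html", &rules));
    assert!(!longest_match_allows("/cgi-bin/script", &rules));

    // First-match stops at `Allow: /` and lets the crawler into /cgi-bin.
    assert!(first_match_allows("/cgi-bin/script", &rules));
}
```

With the rules from the example above, `/cgi-bin/script` matches `Allow: /` with priority 1 and `Disallow: /cgi-bin` with priority 8, so longest-match disallows it, while first-match never looks past the first rule.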
Required Methods
fn match_allow(&self, path: &str, pattern: &str) -> i32
fn match_disallow(&self, path: &str, pattern: &str) -> i32
Provided Methods
fn matches(path: &str, pattern: &str) -> bool
Returns true if the URI path matches the specified pattern. The pattern is anchored at the beginning of the path; ‘*’ matches any sequence of characters, and ‘$’ is special only at the end of the pattern.
Since ‘path’ and ‘pattern’ are both externally determined (by the webmaster), we make sure to have acceptable worst-case performance.
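One way to get a bounded worst case is to avoid backtracking and instead track, for each pattern byte, the set of path offsets that the pattern consumed so far can reach. The sketch below illustrates that idea under stated assumptions: `pattern_matches` is a hypothetical free function, not the crate's code, and it works on bytes rather than characters. Each pattern byte costs at most one pass over the path, so the total work is proportional to path length times pattern length.

```rust
/// Returns true if `pattern` matches a prefix of `path`.
/// `*` matches any run of bytes; a trailing `$` anchors the end of `path`.
/// Hypothetical helper for illustration only.
fn pattern_matches(path: &str, pattern: &str) -> bool {
    let path = path.as_bytes();
    let pattern = pattern.as_bytes();
    // Ascending offsets in `path` where the pattern consumed so far can end.
    // The vector is never empty at the top of the loop.
    let mut positions: Vec<usize> = vec![0];

    for (i, &c) in pattern.iter().enumerate() {
        if c == b'$' && i + 1 == pattern.len() {
            // Trailing `$`: a partial match must end exactly at the end of the path.
            return positions.last() == Some(&path.len());
        }
        if c == b'*' {
            // `*` may consume any number of bytes, so every offset from the
            // smallest current one up to the end of the path is reachable.
            positions = (positions[0]..=path.len()).collect();
        } else {
            // Literal byte (including a `$` that is not last): keep the offsets
            // whose next path byte equals it, and advance them by one.
            positions = positions
                .iter()
                .filter(|&&p| p < path.len() && path[p] == c)
                .map(|&p| p + 1)
                .collect();
            if positions.is_empty() {
                return false;
            }
        }
    }
    true
}

fn main() {
    assert!(pattern_matches("/abc", "/"));
    assert!(!pattern_matches("/", "/abc"));
    assert!(pattern_matches("/google/robotstxt/tree/master", "/*/*/tree/master$"));
    assert!(!pattern_matches("/google/robotstxt/tree/master/abc", "/*/*/tree/master$"));
    // A `$` that is not the last pattern byte is an ordinary character.
    assert!(pattern_matches("/a$b", "/a$b"));
}
```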
use robotstxt::matcher::{LongestMatchRobotsMatchStrategy, RobotsMatchStrategy};

type Target = LongestMatchRobotsMatchStrategy;
assert_eq!(true, Target::matches("/", "/"));
assert_eq!(true, Target::matches("/abc", "/"));
assert_eq!(false, Target::matches("/", "/abc"));
assert_eq!(
    true,
    Target::matches("/google/robotstxt/tree/master", "/*/*/tree/master")
);
assert_eq!(
    true,
    Target::matches(
        "/google/robotstxt/tree/master/index.html",
        "/*/*/tree/master",
    )
);
assert_eq!(
    true,
    Target::matches("/google/robotstxt/tree/master", "/*/*/tree/master$")
);
assert_eq!(
    false,
    Target::matches("/google/robotstxt/tree/master/abc", "/*/*/tree/master$")
);
assert_eq!(
    false,
    Target::matches("/google/robotstxt/tree/abc", "/*/*/tree/master")
);
Dyn Compatibility
This trait is not dyn compatible.
In older versions of Rust, dyn compatibility was called "object safety", so this trait is not object safe.
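Two features of the declaration rule out `dyn RobotsMatchStrategy`: the `Default` supertrait requires `Self: Sized`, and `matches` has no `self` receiver. In practice the strategy is therefore selected through a type parameter rather than a trait object. The sketch below shows that pattern with hypothetical stand-ins (`MatchStrategy`, `PrefixLength`, `Matcher`) that only mirror the shape of the trait; they are not the crate's types, and matching is simplified to a prefix test.

```rust
// Hypothetical stand-in with the same shape as the trait documented here.
trait MatchStrategy: Default {
    fn match_allow(&self, path: &str, pattern: &str) -> i32;
    fn match_disallow(&self, path: &str, pattern: &str) -> i32;

    // Provided, receiver-less function; simplified to a prefix test here.
    fn matches(path: &str, pattern: &str) -> bool {
        path.starts_with(pattern)
    }
}

/// Hypothetical strategy: the priority of a match is the pattern length.
#[derive(Default)]
struct PrefixLength;

impl MatchStrategy for PrefixLength {
    fn match_allow(&self, path: &str, pattern: &str) -> i32 {
        if Self::matches(path, pattern) {
            pattern.len() as i32
        } else {
            -1
        }
    }

    fn match_disallow(&self, path: &str, pattern: &str) -> i32 {
        self.match_allow(path, pattern)
    }
}

// `Box<dyn MatchStrategy>` would be rejected by the compiler, so the
// strategy is fixed at compile time through the type parameter `S`.
#[derive(Default)]
struct Matcher<S: MatchStrategy> {
    strategy: S,
}

impl<S: MatchStrategy> Matcher<S> {
    /// Compares one Allow pattern against one Disallow pattern for `path`.
    fn allowed(&self, path: &str, allow: &str, disallow: &str) -> bool {
        self.strategy.match_allow(path, allow) >= self.strategy.match_disallow(path, disallow)
    }
}

fn main() {
    let matcher = Matcher::<PrefixLength>::default();
    assert!(matcher.allowed("/index.html", "/", "/cgi-bin"));
    assert!(!matcher.allowed("/cgi-bin/script", "/", "/cgi-bin"));
}
```

The `Default` bound is what lets `Matcher::<PrefixLength>::default()` build its strategy without any constructor arguments.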