xwde: robotxt

The implementation of the robots.txt (or URL exclusion) protocol in the Rust programming language, with support for the crawl-delay, sitemap, and universal `*` match extensions (according to the RFC specification).

Examples

- parse the provided robots.txt file for the specified user-agent:
```rust
use robotxt::Robots;

fn main() {
    let txt = r#"
      User-Agent: foobot
      Allow: /example/
      Disallow: /example/nope.txt
    "#.as_bytes();

    let r = Robots::from_slice(txt, "foobot");
    assert!(r.is_allowed("/example/yeah.txt"));
    assert!(!r.is_allowed("/example/nope.txt"));
}
```
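
The universal `*` match extension mentioned above applies when the file has no group for the requested agent; a minimal sketch of that fallback, reusing the same parsing API (the agent name `barbot` is illustrative):

```rust
use robotxt::Robots;

fn main() {
    let txt = r#"
      User-Agent: *
      Disallow: /private/
    "#.as_bytes();

    // No group names "barbot", so the universal `*` group is expected to apply.
    let r = Robots::from_slice(txt, "barbot");
    assert!(r.is_allowed("/public/index.html"));
    assert!(!r.is_allowed("/private/secret.txt"));
}
```
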
- build a new robots.txt file from the provided directives:

Note: the builder is not yet implemented.
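
Until the builder lands, a robots.txt file can be assembled by hand with plain string formatting; a minimal sketch that does not involve any robotxt API, with illustrative directive values:

```rust
fn main() {
    // Assemble the directives into robots.txt syntax by hand;
    // the agent and paths below are purely illustrative.
    let user_agent = "foobot";
    let allowed = ["/example/"];
    let disallowed = ["/example/nope.txt"];

    let mut txt = String::new();
    txt.push_str(&format!("User-Agent: {user_agent}\n"));
    for path in allowed {
        txt.push_str(&format!("Allow: {path}\n"));
    }
    for path in disallowed {
        txt.push_str(&format!("Disallow: {path}\n"));
    }

    println!("{txt}");
}
```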

Links

- Request for Comments: 9309 (Robots Exclusion Protocol) on rfc-editor.org

Notes

The parser is based on Smerity/texting_robots.