Crate robotstxt

robots.txt parser for Rust

The Robots Exclusion Protocol is implemented as specified in http://www.robotstxt.org/norobots-rfc.txt.

This crate is based on https://github.com/messense/robotparser-rs.

Installation

Add it to your Cargo.toml:

[dependencies]
robotstxt = "0.1"

Examples

use robotstxt::RobotFileParser;

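fn main() {
    // Parse an in-memory robots.txt body with a single record for `crawler1`.
    let parser = RobotFileParser::parse("
       User-agent: crawler1\n\
       Allow: /not_here/but_here\n\
       Disallow: /not_here/\n\
    ");
    // The Allow line matches this path, so fetching is permitted.
    assert!(parser.can_fetch("crawler1", "/not_here/but_here"));
    // Only the Disallow line matches this path, so fetching is refused.
    assert!(!parser.can_fetch("crawler1", "/not_here/no_way"));
}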
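
The assertions above depend on rule order. In the referenced norobots draft, the Allow and Disallow lines of a record are tried in the order they appear, the first path prefix that matches decides the outcome, and access is allowed when nothing matches. The following is a minimal standalone sketch of that evaluation. It does not use this crate and is not meant to mirror its internals; `Rule` and `is_allowed` are hypothetical names chosen for illustration, and skipping empty rule values is a simplifying assumption.

```rust
/// One Allow/Disallow line from a robots.txt record.
/// Hypothetical type for illustration; not part of the robotstxt crate.
#[derive(Debug, Clone, Copy)]
enum Rule<'a> {
    Allow(&'a str),
    Disallow(&'a str),
}

/// First-match evaluation: walk the rules in file order and let the first
/// rule whose path is a prefix of `path` decide. If no rule matches, the
/// path is allowed by default.
fn is_allowed(rules: &[Rule<'_>], path: &str) -> bool {
    for rule in rules {
        let (prefix, allowed) = match *rule {
            Rule::Allow(p) => (p, true),
            Rule::Disallow(p) => (p, false),
        };
        // Assumption: an empty rule value matches nothing and is skipped.
        if !prefix.is_empty() && path.starts_with(prefix) {
            return allowed;
        }
    }
    true
}

fn main() {
    // Same rules, in the same order, as the crate example above.
    let rules = [
        Rule::Allow("/not_here/but_here"),
        Rule::Disallow("/not_here/"),
    ];
    assert!(is_allowed(&rules, "/not_here/but_here"));
    assert!(!is_allowed(&rules, "/not_here/no_way"));
    assert!(is_allowed(&rules, "/elsewhere"));
}
```

Under this first-match reading, swapping the two rules would make `/not_here/but_here` disallowed, because the broader `/not_here/` prefix would match first.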

Structs

RequestRate
RobotFileParser: robots.txt file parser