spider 1.1.0

Multithreaded Web spider crawler written in Rust.

Spider

Dependencies

$ apt install openssl libssl-dev

Usage

Add this dependency to your Cargo.toml file.

[dependencies]
spider = "1.0.2"

and then you'll be able to use the library. Here is a simple example:

extern crate spider;

use spider::website::Website;

fn main() {
    let mut localhost = Website::new("http://localhost:4000");
    localhost.crawl();

    for page in localhost.get_pages() {
        println!("- {}", page.get_url());
    }
}

TODO

  • multi-threaded system
  • respect the robots.txt file
  • add a configuration object for polite delay, etc.
  • parse command line arguments
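
As a rough illustration of the last two items, here is a minimal sketch of what a configuration object filled from command line arguments could look like. It uses only the standard library. `CrawlConfig`, `parse_args`, and the `--delay-ms` and `--ignore-robots` flags are hypothetical names chosen for this example; none of them are part of the spider crate.

```rust
use std::env;
use std::time::Duration;

/// Hypothetical crawl settings; NOT part of the spider crate's API.
#[derive(Debug, Clone, PartialEq)]
struct CrawlConfig {
    /// Pause between two consecutive requests (the "polite delay").
    delay: Duration,
    /// Whether robots.txt rules should be honored.
    respect_robots_txt: bool,
}

impl Default for CrawlConfig {
    fn default() -> Self {
        CrawlConfig {
            delay: Duration::from_millis(250), // assumed default, pick your own
            respect_robots_txt: true,
        }
    }
}

/// Builds a config from arguments such as `--delay-ms 500 --ignore-robots`.
/// Any unknown argument is reported as an error.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<CrawlConfig, String> {
    let mut config = CrawlConfig::default();
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        match arg.as_str() {
            "--delay-ms" => {
                let value = iter.next().ok_or("--delay-ms needs a value")?;
                let ms: u64 = value
                    .parse()
                    .map_err(|_| format!("invalid delay: {}", value))?;
                config.delay = Duration::from_millis(ms);
            }
            "--ignore-robots" => config.respect_robots_txt = false,
            other => return Err(format!("unknown argument: {}", other)),
        }
    }
    Ok(config)
}

fn main() {
    // Skip the first argument, which is the program name.
    match parse_args(env::args().skip(1)) {
        Ok(config) => println!("{:?}", config),
        Err(message) => eprintln!("error: {}", message),
    }
}
```

A crawler could then sleep for `config.delay` between requests and check `config.respect_robots_txt` before fetching a page.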

Contribute

I am open to any contribution. Just fork the repository and commit on a separate branch.