# Web Crawler Library
This Rust library is a simple web crawler that checks the validity of routes on a given website. It reads a list of routes from a file, constructs full URLs by appending the routes to a base URL, and sends HTTP GET requests to check whether the routes are valid.
## Features
- Reads a list of routes from a text file (a sample file is shown below).
- Constructs URLs by combining the base URL and the routes.
- Makes HTTP GET requests to check if the routes are valid.
- Tracks visited URLs to avoid re-crawling.
- Returns the number of valid routes.
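The routes file is assumed to be a plain-text file listing one route per line, for example:

```text
/
/about
/contact
/blog
```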
## Installation
1. Add `webcrawler` to your `Cargo.toml`

If you're using this library in your own Rust project, add the following to your `Cargo.toml` under `[dependencies]`:
```toml
[dependencies]
webcrawler = { version = "0.1.1" }
reqwest = { version = "0.11", features = ["blocking"] }
futures = "0.3.30"
url = "2.2"
```
## Usage
1. Import the Web Crawler
In your `main.rs` or any other Rust file, import the web crawler library:

```rust
use webcrawler::WebCrawler;
```
2. Example usage
Here is a minimal sketch of how `WebCrawler` might be used to check the validity of routes. The constructor and method names below are assumptions based on the feature list above, so consult the crate documentation for the actual API:
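```rust
use webcrawler::WebCrawler;

fn main() {
    // NOTE: `new` and `crawl` are hypothetical names used for illustration.
    // Create a crawler for the site whose routes you want to check.
    let mut crawler = WebCrawler::new("https://example.com");

    // Read routes (one per line) from routes.txt, append each to the base URL,
    // send a blocking GET request, and count the routes that respond successfully.
    let valid_routes = crawler.crawl("routes.txt");

    println!("Valid routes: {}", valid_routes);
}
```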
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Contributions
Contributions are welcome! If you have any improvements, bug fixes, or feature suggestions, feel free to open an issue or submit a pull request.
## Crates.io
You can find this crate and the latest version on crates.io.