website_crawler 0.1.0

Crawl all URLs on a website, async & sync.

crawler

Crawls websites to gather all possible URLs.

Getting Started

Make sure to have Rust installed.

Make sure to create a .env file and add CRAWL_URL=http://0.0.0.0:8080/api/website-crawl. Replace the CRAWL_URL value with your production endpoint that will accept the results. A valid endpoint to accept the hook is required for the crawler to work.
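
For local development, the .env file can be a single line (this mirrors the example value above):

  CRAWL_URL=http://0.0.0.0:8080/api/website-crawl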

  1. curl --proto '=https' --tlsv1.2 https://sh.rustup.rs -sSf | sh
  2. cargo run

Docker

You can start the service with Docker:

  docker build -t crawler . && docker run -dp 8000:8000 crawler

Compose

Use the Docker image jeffmendez19/crawler.
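
A minimal docker-compose.yml sketch, assuming the 8000:8000 port mapping from the Docker section and a separate api service that receives the hook (the api service itself is not shown; its name here matches the host in the example CRAWL_URL below):

  version: "3"
  services:
    crawler:
      image: jeffmendez19/crawler
      ports:
        - "8000:8000"
      environment:
        - CRAWL_URL=http://api:8080/api/website-crawl-background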


API

crawl - asynchronously determine all URLs on a website, sending results to a POST hook

POST http://localhost:8000/crawl

Body: { "url": "www.drake.com", "id": 0 }
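
For example, assuming the endpoint accepts the body as JSON (url and id as in the example above):

  curl -X POST http://localhost:8000/crawl \
    -H 'Content-Type: application/json' \
    -d '{ "url": "www.drake.com", "id": 0 }'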

ENV

  CARGO_RELEASE=false # determine if prod/dev build
  ROCKET_ENV=dev # determine API env
  CRAWL_URL="http://api:8080/api/website-crawl-background" # endpoint to send results
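
For example, a production-style run might look like the following, assuming CARGO_RELEASE=true selects the release build (the endpoint URL here is a placeholder):

  CARGO_RELEASE=true ROCKET_ENV=prod CRAWL_URL="https://example.com/api/website-crawl-background" cargo run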

LICENSE

Check the license file in the root of the project.