A website to crawl.
Structs
- ChannelGuard - Guards a channel from closing until all concurrent operations are done.
- DEFAULT_PERMITS - The default Semaphore limits.
- Website - Represents a website to crawl and gather all links or page content.
Enums
- CrawlStatus - The active status of the crawl.
- CronType - The type of cron job to run.
- OnShouldCrawlCallback - Callback closure or function pointer that determines whether a link should be crawled.
- ProcessLinkStatus - The link activity for the crawl.
- WebsiteMetaInfo - Generic website meta info for handling retries.
Traits
- OnShouldCrawlClosure - Callback closure that determines whether a link should be crawled.
Functions
- calc_limits - Calculates the base limits.
- channel_send_page - Broadcasts the Page to receivers over the channel.
- is_safe_javascript_challenge - Checks whether the page is a JavaScript challenge.
- set_interface - Binds connections only on the specified network interface.
Type Aliases
- OnLinkFindCallback - Callback invoked when a link is found; rewrites a URL if it meets a condition.