Crate iscrawl

Fast crawler/bot detection from User-Agent strings.

is_crawler returns true for crawlers/bots and false for human browsers. With the database feature, crawler_info separately returns matching Crawlerdex metadata.

§Example

use iscrawl::is_crawler;

assert!(is_crawler("Googlebot/2.1 (+http://www.google.com/bot.html)"));
assert!(!is_crawler(
    "Mozilla/5.0 (X11; Linux x86_64; rv:115.0) Gecko/20100101 Firefox/115.0"
));

§Heuristic

  1. Empty input: crawler.
  2. Input over 512 bytes: returns false (oversized input is not classified).
  3. Crawler keyword present (bot, crawl, spider, +http, @, …): crawler.
  4. No Mozilla/ or Opera/ prefix and no browser engine token: crawler.
  5. Mozilla/ or Opera/ prefix with neither a browser engine token nor (compatible;: crawler.
  6. Otherwise: browser.
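
The rule order above can be sketched as a small standalone function. This is an illustration only, not the crate's implementation: the name looks_like_crawler, the engine token list, and the case-insensitive matching are all assumptions, and the keyword list covers only the five keywords named in rule 3 (the source elides the rest).

```rust
// Hypothetical lists for illustration; the crate's real lists are not shown here.
// Only the keywords named in rule 3 are included.
const CRAWLER_KEYWORDS: [&str; 5] = ["bot", "crawl", "spider", "+http", "@"];
// Assumed browser engine tokens, lowercased to match the lowercased input.
const ENGINE_TOKENS: [&str; 4] = ["gecko/", "applewebkit/", "presto/", "trident/"];

/// Hypothetical stand-in for `is_crawler`, following the six rules in order.
fn looks_like_crawler(user_agent: &str) -> bool {
    // Rule 1: empty input is treated as a crawler.
    if user_agent.is_empty() {
        return true;
    }
    // Rule 2: oversized input is not classified and yields false.
    if user_agent.len() > 512 {
        return false;
    }
    // Assumption: matching is ASCII case-insensitive.
    let ua = user_agent.to_ascii_lowercase();

    // Rule 3: any crawler keyword marks the input as a crawler.
    if CRAWLER_KEYWORDS.iter().any(|k| ua.contains(*k)) {
        return true;
    }

    let has_prefix = ua.starts_with("mozilla/") || ua.starts_with("opera/");
    let has_engine = ENGINE_TOKENS.iter().any(|t| ua.contains(*t));

    // Rule 4: no browser-style prefix and no engine token.
    if !has_prefix && !has_engine {
        return true;
    }
    // Rule 5: browser-style prefix, but no engine token and no "(compatible;".
    if has_prefix && !has_engine && !ua.contains("(compatible;") {
        return true;
    }
    // Rule 6: everything else is treated as a browser.
    false
}
```

Under these assumptions, a bare tool string such as "curl/8.0.1" falls through to rule 4, while an old-style "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" string passes rule 5 because of its (compatible; marker. Checking the size limit before any scanning keeps the cost bounded for hostile input.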

Heuristic bool API plus optional database lookup.

Functions§

is_crawler
Returns true if user_agent looks like a crawler/bot, false if it looks like a human browser.