Firecrawl Rust SDK
The Firecrawl Rust SDK is a library that allows you to easily scrape and crawl websites, and output the data in a format ready for use with language models (LLMs). It provides a simple and intuitive interface for interacting with the Firecrawl API.
Installation
To install the Firecrawl Rust SDK, add the following to your Cargo.toml:
[dependencies]
firecrawl = "^0.1"
tokio = { version = "^1", features = ["full"] }
Usage
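First, obtain an API key from firecrawl.dev. Then initialize the FirecrawlApp like so:
use firecrawl::FirecrawlApp;

#[tokio::main]
async fn main() {
    // Initialize the FirecrawlApp with your API key
    let app = FirecrawlApp::new("fc-YOUR-API-KEY").expect("Failed to initialize FirecrawlApp");

    // ...
}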
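One way to keep the API key out of source code is to read it from the environment before constructing the client. The following is a minimal, std-only sketch; the variable name FIRECRAWL_API_KEY and both helper functions are assumptions made for illustration, not something the SDK defines or requires.

```rust
use std::env;

/// Hypothetical variable name; the SDK does not require this particular one.
const API_KEY_VAR: &str = "FIRECRAWL_API_KEY";

/// Trims the raw value and rejects a missing or blank key.
fn normalize_api_key(raw: Option<String>) -> Result<String, String> {
    match raw {
        Some(value) if !value.trim().is_empty() => Ok(value.trim().to_string()),
        Some(_) => Err(format!("{} is set but empty", API_KEY_VAR)),
        None => Err(format!("{} is not set", API_KEY_VAR)),
    }
}

/// Reads the key from the process environment.
/// A non-Unicode value is treated the same as a missing one.
fn api_key_from_env() -> Result<String, String> {
    normalize_api_key(env::var(API_KEY_VAR).ok())
}
```

The String returned by api_key_from_env can then be passed to the FirecrawlApp constructor in place of a hard-coded literal.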
Scraping a URL
To scrape a single URL, use the scrape_url method. It takes the URL as a parameter and returns the scraped data as a Document.
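// Passing None uses the default scrape options
let scrape_result = app.scrape_url("https://firecrawl.dev", None).await;

match scrape_result {
    Ok(data) => println!("Scrape result:\n{:#?}", data),
    Err(e) => eprintln!("Failed to scrape URL: {}", e),
}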
Scraping with Extract
With Extract, you can easily extract structured data from any URL. You need to specify your schema in the JSON Schema format, using the serde_json::json! macro.
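let json_schema = json!({ /* your JSON Schema */ });

let llm_extraction_options = ScrapeOptions {
    // ... extraction settings that reference json_schema ...
    ..Default::default()
};

let llm_extraction_result = app
    .scrape_url("https://firecrawl.dev", llm_extraction_options)
    .await;

match llm_extraction_result {
    Ok(data) => println!("LLM extraction result:\n{:#?}", data),
    Err(e) => eprintln!("LLM extraction failed: {}", e),
}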
Crawling a Website
To crawl a website, use the crawl_url method. It waits for the crawl to complete, which may take a long time depending on your starting URL and your options.
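let crawl_options = CrawlOptions {
    // ... crawl settings ...
    ..Default::default()
};

let crawl_result = app
    .crawl_url("https://firecrawl.dev", crawl_options)
    .await;

match crawl_result {
    Ok(data) => println!("Crawl result:\n{:#?}", data),
    Err(e) => eprintln!("Crawl failed: {}", e),
}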
Crawling asynchronously
To crawl without waiting for the result, use the crawl_url_async method. It takes the same parameters, but returns a CrawlAsyncResponse struct containing the crawl's ID. You can use that ID with the check_crawl_status method to check the status at any time. Note that completed crawls are deleted after 24 hours.
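let crawl_id = app.crawl_url_async("https://firecrawl.dev", None).await?.id;

// ... later ...

let status = app.check_crawl_status(&crawl_id).await?;

if status.status == CrawlStatusTypes::Completed {
    println!("Crawl is done:\n{:#?}", status);
} else {
    // ... wait some more ...
}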
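The "wait some more" step is usually a bounded polling loop. The sketch below shows one possible shape using only the standard library: CrawlState and poll_until_done are hypothetical names, and the check closure stands in for a call to check_crawl_status. It is a blocking illustration of the control flow, not the SDK's own mechanism.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for the status reported by a crawl job.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CrawlState {
    InProgress,
    Completed,
    Failed,
}

/// Calls `check` until it reports a terminal state or `max_attempts` is reached.
/// On completion, returns how many checks were made.
fn poll_until_done<F>(mut check: F, max_attempts: u32, delay: Duration) -> Result<u32, String>
where
    F: FnMut() -> Result<CrawlState, String>,
{
    for attempt in 1..=max_attempts {
        match check()? {
            CrawlState::Completed => return Ok(attempt),
            CrawlState::Failed => return Err(format!("crawl failed on check {}", attempt)),
            CrawlState::InProgress => {
                // Sleep only if another check is going to follow.
                if attempt < max_attempts {
                    thread::sleep(delay);
                }
            }
        }
    }
    Err(format!("crawl still running after {} checks", max_attempts))
}
```

In async code the same loop would await the status check and use an async sleep such as tokio::time::sleep instead of thread::sleep. Bounding the number of attempts keeps a stalled crawl from blocking the caller indefinitely.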
Map a URL (Alpha)
Map all associated links from a starting URL.
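let map_result = app
    .map_url("https://firecrawl.dev", None)
    .await;

match map_result {
    Ok(data) => println!("Mapped URLs: {:#?}", data),
    Err(e) => eprintln!("Map failed: {}", e),
}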
Error Handling
The SDK handles errors returned by the Firecrawl API and by our dependencies, and combines them into the FirecrawlError enum, which implements Error, Debug and Display. All of our methods return a Result<T, FirecrawlError>.
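For readers unfamiliar with this pattern, here is a minimal sketch of how a single enum can wrap both API-level and dependency-level failures. ClientError, its variants, and check_status are hypothetical and written only for illustration; they are not the actual definition of FirecrawlError.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type; not the SDK's actual FirecrawlError definition.
#[derive(Debug)]
enum ClientError {
    /// The remote API answered with an error status and message.
    Api { status: u16, message: String },
    /// A dependency (here, integer parsing) failed while handling the response.
    Parse(ParseIntError),
}

impl fmt::Display for ClientError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ClientError::Api { status, message } => write!(f, "API error {}: {}", status, message),
            ClientError::Parse(err) => write!(f, "invalid response: {}", err),
        }
    }
}

impl Error for ClientError {
    /// Exposes the wrapped dependency error, if there is one.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ClientError::Api { .. } => None,
            ClientError::Parse(err) => Some(err),
        }
    }
}

/// Lets `?` convert the dependency's error into the combined enum.
impl From<ParseIntError> for ClientError {
    fn from(err: ParseIntError) -> Self {
        ClientError::Parse(err)
    }
}

/// Parses a status code and maps codes of 400 and above to an API error.
fn check_status(raw_status: &str) -> Result<u16, ClientError> {
    let status: u16 = raw_status.trim().parse()?;
    if status >= 400 {
        return Err(ClientError::Api { status, message: "request rejected".to_string() });
    }
    Ok(status)
}
```

Because every failure ends up in one type, callers can use the ? operator throughout and match on the enum only where they need to distinguish the cause.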
Running the Tests with Cargo
To ensure the functionality of the Firecrawl Rust SDK, we have included end-to-end tests using cargo. These tests cover various aspects of the SDK, including URL scraping, web searching, and website crawling.
Running the Tests
To run the tests, execute cargo test from the SDK's root directory.
Contributing
Contributions to the Firecrawl Rust SDK are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository.
License
The Firecrawl Rust SDK is open-source and released under the AGPL License.