Goodreads Metadata Scraper
An async Rust library to fetch and scrape book metadata from Goodreads by using an ISBN, Goodreads book ID, or a combination of title and author. This library is useful for applications that need book data from Goodreads without access to an official API.
Features
- Retrieve metadata by ISBN
- Retrieve metadata by Goodreads ID
- Retrieve metadata by title (optionally with author for better accuracy)
- Structured metadata output with fields such as title, author, publication year, and more
- Query builder pattern for flexible request customization
Installation
Add this crate to your Cargo.toml:
[dependencies]
= "0.1.0"
Usage
Here are a few examples of how to use the library. The primary entry point is MetadataRequestBuilder, which allows you to specify search criteria before calling execute to fetch metadata.
Fetching Metadata by ISBN
use MetadataRequestBuilder;
let isbn = "9780141381473";
let metadata = MetadataRequestBuilder::default()
    .with_isbn(isbn)
    .execute()
    .await?
    .expect("no matching book found");
assert_eq!(metadata.title, "…");
println!("{:#?}", metadata);
Fetching Metadata by Goodreads ID
use MetadataRequestBuilder;
let goodreads_id = "175254";
let metadata = MetadataRequestBuilder::default()
    .with_id(goodreads_id)
    .execute()
    .await?
    .expect("no matching book found");
assert_eq!(metadata.title, "…");
println!("{:#?}", metadata);
Fetching Metadata by Title and Author
For better accuracy, you can also specify an author along with the title:
use MetadataRequestBuilder;
let title = "The Last Magician";
let author = "Lisa Maxwell";
let metadata = MetadataRequestBuilder::default()
    .with_title(title)
    .with_author(author)
    .execute()
    .await?
    .expect("no matching book found");
assert_eq!(metadata.title, title);
println!("{:#?}", metadata);
Error Handling
This crate uses a custom error type, ScraperError, which handles errors that may occur during the metadata fetching and parsing process. ScraperError includes:
- FetchError: Errors during HTTP requests (from reqwest)
- ParseError: HTML parsing errors (from scraper)
- SerializeError: JSON serialization errors (from serde_json)
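As a rough illustration of how a caller might react differently to each kind of failure, the sketch below matches on the three variants listed above. It does not import the crate: the ScraperError enum here is a locally defined stand-in, its String payloads are an assumption (the real variants presumably wrap the reqwest, scraper, and serde_json errors), and is_retryable is a hypothetical helper.

```rust
use std::fmt;

// Hypothetical stand-in for the crate's `ScraperError`. The variant names come
// from the list above; the `String` payloads are an assumption made so that
// this sketch compiles without reqwest, scraper, or serde_json.
#[derive(Debug)]
enum ScraperError {
    FetchError(String),
    ParseError(String),
    SerializeError(String),
}

impl fmt::Display for ScraperError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ScraperError::FetchError(msg) => write!(f, "fetch failed: {}", msg),
            ScraperError::ParseError(msg) => write!(f, "parse failed: {}", msg),
            ScraperError::SerializeError(msg) => write!(f, "serialization failed: {}", msg),
        }
    }
}

impl std::error::Error for ScraperError {}

/// Hypothetical helper: decide whether a failed lookup is worth retrying.
/// Network failures may be transient; parse and serialization failures will
/// most likely repeat until the code or the page changes.
fn is_retryable(err: &ScraperError) -> bool {
    matches!(err, ScraperError::FetchError(_))
}

fn main() {
    let errors = [
        ScraperError::FetchError("connection timed out".to_string()),
        ScraperError::ParseError("expected element not found".to_string()),
        ScraperError::SerializeError("invalid JSON".to_string()),
    ];
    for err in &errors {
        println!("{} (retryable: {})", err, is_retryable(err));
    }
}
```

Treating only fetch failures as retryable is a design choice for this sketch. A parse failure usually signals that the page layout changed, so retrying it would only add load.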
Limitations
- As this library relies on web scraping, any changes in Goodreads' HTML structure may break functionality.
- This library is intended for personal or small-scale use, as frequent requests to Goodreads may be rate-limited.
Note: When running tests, it is highly recommended to pass the --test-threads=1 flag to the test binary (cargo test -- --test-threads=1) so that tests run one at a time and are less likely to trigger Goodreads rate limiting.
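One way to stay under a rate limit in application code is to enforce a minimum gap between consecutive lookups. The sketch below is a hypothetical helper, not part of this crate: the Throttle type, its method names, and the 200 ms gap are all assumptions chosen for illustration, and Goodreads does not publish a safe request rate.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of the crate) that enforces a minimum gap
/// between consecutive requests.
struct Throttle {
    min_gap: Duration,
    last_request: Option<Instant>,
}

impl Throttle {
    fn new(min_gap: Duration) -> Self {
        Throttle { min_gap, last_request: None }
    }

    /// How long the caller should wait before sending a request at `now`.
    fn delay_at(&self, now: Instant) -> Duration {
        match self.last_request {
            // Remaining part of the gap, or zero if the gap has already passed.
            Some(last) => self.min_gap.saturating_sub(now.saturating_duration_since(last)),
            // No request has been sent yet, so there is nothing to wait for.
            None => Duration::ZERO,
        }
    }

    /// Record that a request was sent at `now`.
    fn mark(&mut self, now: Instant) {
        self.last_request = Some(now);
    }
}

fn main() {
    // The 200 ms gap is an arbitrary value for this sketch.
    let mut throttle = Throttle::new(Duration::from_millis(200));
    for isbn in ["9780141381473", "9780141381473"] {
        std::thread::sleep(throttle.delay_at(Instant::now()));
        throttle.mark(Instant::now());
        // A real lookup would go here; this sketch only prints a placeholder.
        println!("would fetch metadata for ISBN {}", isbn);
    }
}
```

The sketch blocks the thread with std::thread::sleep to stay dependency-free. In async code the same delay would normally be awaited with the runtime's own sleep function so other tasks can keep running.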
License
This project is licensed under the GNU General Public License (GPL). See the LICENSE file for more details.
This library provides an accessible alternative for retrieving Goodreads book metadata, enabling developers to integrate Goodreads data without an official API. Feel free to contribute by submitting issues or pull requests!