llmweb
Extract structured data from any webpage with Rust and LLMs.

[!IMPORTANT]
This project is under active development and APIs may change.
✨ Key Features
- 🤖 Schema-Driven Extraction
- 🌐 Multi-Provider LLM Support
- ⚡ High-Performance & Async
- 💻 Simple & Powerful CLI
- 🦀 Rust-Powered Reliability
- 📄 Streaming
Installation
Add to your Cargo.toml:
[dependencies]
llmweb = "0.1"
- Configure an API key (set only the one for your provider):
export OPENAI_API_KEY="sk-your-key-here"
export ANTHROPIC_API_KEY="sk-ant-your-key"
export GEMINI_API_KEY="your-google-key"
export COHERE_API_KEY="your-cohere-key"
export GROQ_API_KEY="gsk-your-key"
export XAI_API_KEY="xai-your-key"
export DEEPSEEK_API_KEY="your-deepseek-key"
- Pick the model you want to use:
let model = "gemini-2.0-flash";
- Create an LlmWeb instance with the model:
let llmweb = LlmWeb::new(model);
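Because each provider reads its own environment variable, it can be convenient to choose the model from whichever key is actually set. The following is a minimal sketch of that idea using only the standard library. `pick_model` and `pick_model_from_env` are hypothetical helpers, not part of llmweb, and any model name other than `gemini-2.0-flash` is a placeholder you would replace with a name your provider accepts.

```rust
use std::env;

/// Hypothetical helper (not part of llmweb): returns the model paired with the
/// first API-key variable that `is_set` reports as present.
fn pick_model<'a>(
    candidates: &[(&str, &'a str)],
    is_set: impl Fn(&str) -> bool,
) -> Option<&'a str> {
    candidates
        .iter()
        .find(|(key_var, _)| is_set(*key_var))
        .map(|(_, model)| *model)
}

/// Convenience wrapper that checks the real process environment.
/// A variable that is set but empty is treated as missing.
fn pick_model_from_env<'a>(candidates: &[(&str, &'a str)]) -> Option<&'a str> {
    pick_model(candidates, |var| {
        env::var(var).map(|v| !v.is_empty()).unwrap_or(false)
    })
}
```

A call such as `pick_model_from_env(&[("GEMINI_API_KEY", "gemini-2.0-flash"), ("OPENAI_API_KEY", "your-openai-model")])` returns `Some("gemini-2.0-flash")` when the Gemini key is exported, and the result can then be passed to `LlmWeb::new`.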
Example - V2EX
use llmweb::LlmWeb;
use serde::{Deserialize, Serialize};

// One topic row extracted from the V2EX node page.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VXNA {
    pub username: String,
    pub avatar_url: String,
    pub profile_url: String,
    pub title: String,
    pub topic_url: String,
    pub topic_id: u64,
    pub relative_time: String,
    pub reply_count: u32,
    pub last_replier: Option<String>,
}

#[tokio::main]
async fn main() {
    // The schema tells the LLM which fields to extract from the page.
    let schema_str = include_str!("../schemas/v2ex_schema.json");
    let llmweb = LlmWeb::new("gemini-2.0-flash");
    let structured_value: Vec<VXNA> = llmweb
        .exec_from_schema_str("https://v2ex.com/go/vxna", schema_str)
        .await
        .unwrap();
    println!("{:#?}", structured_value);
}
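The schema file used above is not reproduced in this README. As a rough illustration only, and assuming the schema is a standard JSON Schema describing an array of objects, a file matching the `VXNA` struct might look like the string below. `ASSUMED_V2EX_SCHEMA` and `missing_fields` are hypothetical names introduced for this sketch; the real `schemas/v2ex_schema.json` may differ.

```rust
/// Assumed shape of a schema for the VXNA struct (illustration only,
/// not the file shipped with the project).
const ASSUMED_V2EX_SCHEMA: &str = r#"{
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "username": { "type": "string" },
      "avatar_url": { "type": "string" },
      "profile_url": { "type": "string" },
      "title": { "type": "string" },
      "topic_url": { "type": "string" },
      "topic_id": { "type": "integer" },
      "relative_time": { "type": "string" },
      "reply_count": { "type": "integer" },
      "last_replier": { "type": ["string", "null"] }
    },
    "required": [
      "username", "avatar_url", "profile_url", "title",
      "topic_url", "topic_id", "relative_time", "reply_count"
    ]
  }
}"#;

/// Hypothetical sanity check: returns the field names that never appear
/// as a quoted string in the schema text (a plain substring search,
/// not a real JSON parse).
fn missing_fields<'a>(schema: &str, fields: &[&'a str]) -> Vec<&'a str> {
    fields
        .iter()
        .copied()
        .filter(|f| !schema.contains(format!("\"{f}\"").as_str()))
        .collect()
}
```

Keeping the schema and the Rust struct in sync matters because the extracted JSON is deserialized into the struct; a field that is missing from the schema but not wrapped in `Option` in the struct would likely make deserialization fail.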
Streaming
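use llmweb::LlmWeb;
use serde_json::Value;

// VXNA is the struct defined in the V2EX example above.
#[tokio::main]
async fn main() {
    let schema_str = include_str!("../schemas/v2ex_schema.json");
    // stream takes the schema as a parsed JSON value rather than a string.
    let schema: Value = serde_json::from_str(schema_str).unwrap();
    let structured_value: Vec<VXNA> = LlmWeb::new("gemini-2.0-flash")
        .stream("https://v2ex.com/go/vxna", schema)
        .await
        .unwrap();
    println!("{:#?}", structured_value);
}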
Example - HN
use llmweb::LlmWeb;
use serde::{Deserialize, Serialize};

// One story row from the Hacker News front page.
#[derive(Debug, Serialize, Deserialize)]
struct Story {
    title: String,
    points: f32,
    by: Option<String>,
    comments_url: Option<String>,
}

#[tokio::main]
async fn main() {
    let schema_str = include_str!("../schemas/hn_schema.json");
    let llmweb = LlmWeb::new("gemini-2.0-flash");
    // Progress goes to stderr so stdout contains only the extracted data.
    eprintln!("Fetching from Hacker News and extracting stories...");
    let structured_value: Vec<Story> = llmweb
        .exec_from_schema_str("https://news.ycombinator.com", schema_str)
        .await
        .unwrap();
    println!("{:#?}", structured_value);
}
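Once the stories are ordinary Rust values, they can be processed like any other `Vec`. The sketch below is one possible follow-up step, not something llmweb provides: `top_discussed` is a hypothetical helper, and `Story` is a trimmed copy of the struct above without the serde derives so that the block compiles on its own.

```rust
// Trimmed copy of the Story struct from the example, without serde derives.
#[derive(Debug, Clone, PartialEq)]
struct Story {
    title: String,
    points: f32,
    by: Option<String>,
    comments_url: Option<String>,
}

/// Hypothetical helper: keeps stories that have a comments link,
/// orders them by points (highest first), and returns at most `limit`.
fn top_discussed(mut stories: Vec<Story>, limit: usize) -> Vec<Story> {
    stories.retain(|s| s.comments_url.is_some());
    // total_cmp gives a total order on f32, so sorting cannot panic on NaN.
    stories.sort_by(|a, b| b.points.total_cmp(&a.points));
    stories.truncate(limit);
    stories
}
```

The `Option` fields are what make the filter necessary: an extracted row such as a job posting may have no author or comments link, so those fields can come back as `None`.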
CLI
./target/debug/llmweb-cli --schema-file schemas/hn_schema.json https://news.ycombinator.com
Examples
More examples can be found in the Examples directory.
Schemas
More schemas can be found in the Schemas directory.
Contributing
We welcome contributions! Please see our CONTRIBUTING.md for more details on how to get started.
License
This project is licensed under the MIT License - see the LICENSE file for details.