llmweb
Extract any webpage to structured data — headless Chrome + LLM.
Install
[dependencies]
llmweb = "0.1"
Default config reads OPENAI_API_KEY. For any OpenAI-compatible gateway
(DeepSeek, Groq, OpenRouter, z.ai, vLLM, Ollama, ...) use LlmWeb::with_client:
use ;
let client = with_config;
let llmweb = LlmWeb::with_client(client);
Example
use ;
use ;
use json;
// json_schema mode requires an object root; wrap arrays in a named field.
async
URL-based shortcut (library opens the tab for you): llmweb.exec(url, schema).await?.
Query: rust programming language
[0] Rust Programming Language
https://www.rust-lang.org/en-US
Rust's rich type system and ownership model guarantee memory-safety and thread-safety — enabling you to eliminate many classes of bugs at compile-time.
[1] Rust (programming language) - Wikipedia
https://en.wikipedia.org/wiki/Rust_(programming_language)
Rust is a general-purpose programming language which emphasizes performance, type safety, concurrency, and memory safety.
[2] Rust - A Living Hell - The Perspective From A Programmer Of 30 Years
https://www.reddit.com/r/learnrust/comments/1binxlv/rust_a_living_hell_the_perspective_from_a/
Mar 19, 2024 · This has been the worst experience learning a programming language that I have ever had by far. I found absolutely no joy in it in any shape or form.
[3] The Rust Programming Language - Reddit
https://www.reddit.com/r/rust/
r/rust: A place for all things related to the Rust programming language—an open-source systems language that emphasizes performance, reliability, and…
... (10 results total)
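The comment in the example notes that json_schema mode requires an object root, so a list of results has to be wrapped in a named field. The sketch below shows the shape of such a wrapper schema. It is std-only so it runs without dependencies; `wrap_array_in_object` is a hypothetical helper, not part of llmweb, and a real project would more likely build the schema with a JSON macro as the example above appears to do.

```rust
/// Wrap an array-of-items JSON Schema in an object root under `field`.
/// Hypothetical helper for illustration; not an llmweb API.
/// Assumes `field` contains no characters that need JSON escaping and
/// that `item_schema` is already valid JSON text (it is spliced in verbatim).
fn wrap_array_in_object(field: &str, item_schema: &str) -> String {
    format!(
        r#"{{"type":"object","properties":{{"{field}":{{"type":"array","items":{item_schema}}}}},"required":["{field}"]}}"#
    )
}

fn main() {
    // Schema for a single search result (illustrative field names).
    let item = r#"{"type":"object","properties":{"title":{"type":"string"},"url":{"type":"string"}}}"#;
    // The root is an object; the array lives under the "results" field.
    let schema = wrap_array_in_object("results", item);
    println!("{schema}");
}
```

The extracted value then comes back as an object with a single `results` array, which the caller unwraps.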
Features
- Extract: exec / exec_on_tab / exec_on_html: one LLM call → typed R.
- Stream: stream / stream_on_tab / stream_on_html: incremental partial-JSON snapshots via partial_stream.
- Codegen: generate → JS extractor string; run_script replays it with zero LLM cost.
- Recipe: generate_recipe → declarative CSS-selector recipe; run_recipe executes it in pure Rust (no eval, no LLM).
- 5 preprocessing modes: html (in-browser DOM cleanup), raw_html, markdown (via htmd), text, image (base64 screenshot for vision models).
- Multi-provider: any OpenAI-compatible endpoint via LlmWeb::with_client.
- Logging: tracing integration; RUST_LOG=llmweb=debug shows LLM raw responses.
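The Stream feature mentions incremental partial-JSON snapshots. How llmweb produces them is not described here; the sketch below shows one common way to turn a truncated JSON prefix into a parseable snapshot, by closing any open string and then any open arrays and objects. `close_partial_json` is a hypothetical name, and the sketch only handles prefixes cut inside a string or right after a complete value (a dangling key or trailing comma would still yield invalid JSON).

```rust
/// Close a truncated JSON prefix so it can be parsed as a snapshot.
/// Illustrative only; not llmweb's implementation.
fn close_partial_json(prefix: &str) -> String {
    let mut closers: Vec<char> = Vec::new(); // pending '}' / ']' in open order
    let mut in_string = false;
    let mut escaped = false;

    for c in prefix.chars() {
        if in_string {
            if escaped {
                escaped = false; // this char is consumed by the escape
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '{' => closers.push('}'),
            '[' => closers.push(']'),
            '}' | ']' => {
                closers.pop();
            }
            _ => {}
        }
    }

    let mut out = String::from(prefix);
    if in_string {
        if escaped {
            out.pop(); // drop a dangling backslash before closing the string
        }
        out.push('"');
    }
    // Close containers innermost first.
    while let Some(c) = closers.pop() {
        out.push(c);
    }
    out
}

fn main() {
    let partial = r#"{"results":[{"title":"Rust Prog"#;
    println!("{}", close_partial_json(partial));
}
```

Each time more tokens arrive, the longer prefix is closed again and parsed, so the consumer sees fields fill in progressively.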
Examples
| File | Demonstrates |
|---|---|
| hn.rs | Basic URL-shortcut extraction |
| google.rs | Browser + tab, real-world Google search |
| v2ex_stream.rs | Streaming partial output |
| codegen.rs | Generate JS extractor, replay offline |
| recipe.rs | Generate CSS-selector recipe |
| inline_html.rs | No browser: extract from a &str |
| hn_custom.rs | Custom LLM endpoint + API key |
CLI
License
MIT