Struct LlmWeb

pub struct LlmWeb { /* private fields */ }

The main struct for interacting with web pages and LLMs.

It holds the client for the LLM and provides methods to perform completions on web content.

Implementations

impl LlmWeb

pub fn new(name: &str) -> Self

Creates a new LlmWeb instance.

Arguments
  • name - The name of the LLM model to use (e.g., “gemini-1.5-flash”).

Examples found in repository

examples/v2ex.rs
async fn main() {
    let schema_str = include_str!("../schemas/v2ex_schema.json");

    let llmweb = LlmWeb::new("gemini-2.0-flash");
    let structed_value: Vec<VXNA> = llmweb
        .exec_from_schema_str("https://v2ex.com/go/vxna", schema_str)
        .await
        .unwrap();
    println!("{:#?}", structed_value);
}

pub async fn exec<R>(&self, url: &str, scheme: Value) -> Result<R>

Fetches content from a URL, sends it to an LLM for processing based on a schema, and returns the structured data.

This function performs the following steps:

  1. Launches a headless browser.
  2. Navigates to the specified URL.
  3. Extracts the HTML content of the page.
  4. Sends the content and a JSON schema to the configured LLM.
  5. Parses the LLM’s JSON response into the specified Rust type R.
Arguments
  • url - The URL of the web page to process.
  • scheme - The JSON schema that describes the data to extract, as a serde_json::Value.

Errors

This function can return an LlmWebError if any of the steps fail, such as browser errors, network issues, LLM API errors, or JSON deserialization errors.
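
The way the five steps funnel into one error type can be pictured with a small std-only model. This is only a sketch under stated assumptions: PipelineError, fetch_html, ask_llm, and exec_sketch are hypothetical stand-ins that do not exist in llmweb, and the real LlmWebError variants are not listed on this page.

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical stand-in for an error enum such as `LlmWebError`.
/// The variant names are assumptions made for this sketch.
#[derive(Debug, PartialEq)]
enum PipelineError {
    Browser(String),
    Llm(String),
    Parse(String),
}

impl fmt::Display for PipelineError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PipelineError::Browser(m) => write!(f, "browser error: {m}"),
            PipelineError::Llm(m) => write!(f, "LLM error: {m}"),
            PipelineError::Parse(m) => write!(f, "parse error: {m}"),
        }
    }
}

impl std::error::Error for PipelineError {}

/// Steps 1-3 (stand-in): pretend to open the page and return its HTML.
fn fetch_html(url: &str) -> Result<String, PipelineError> {
    if url.starts_with("https://") {
        Ok(format!("<html><body> 3 stories at {url} </body></html>"))
    } else {
        Err(PipelineError::Browser(format!("cannot open {url}")))
    }
}

/// Step 4 (stand-in): pretend the model answers with the first number on the page.
fn ask_llm(html: &str) -> Result<String, PipelineError> {
    html.split_whitespace()
        .find(|word| word.chars().all(|c| c.is_ascii_digit()))
        .map(str::to_owned)
        .ok_or_else(|| PipelineError::Llm("no answer".to_owned()))
}

/// Step 5 plus the glue: every step can fail, and `?` forwards the first failure.
fn exec_sketch<R: FromStr>(url: &str) -> Result<R, PipelineError> {
    let html = fetch_html(url)?;
    let answer = ask_llm(&html)?;
    answer
        .parse::<R>()
        .map_err(|_| PipelineError::Parse(format!("unexpected answer: {answer}")))
}

fn main() {
    for url in ["https://example.com", "ftp://example.com"] {
        // Handle the error instead of calling `unwrap()`.
        match exec_sketch::<u32>(url) {
            Ok(count) => println!("extracted {count}"),
            Err(err) => eprintln!("failed: {err}"),
        }
    }
}
```

With the real crate, the same match on the Result returned by exec would replace the unwrap() calls used in the repository examples.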

Examples found in repository

examples/x.rs
async fn main() {
    let schema_json = json!({
        "type": "object",
        "properties": {
            "tweet_text": {
                "type": "string",
            },
        },
        "required": ["tweet_text"]
    });

    let llmweb = LlmWeb::new("gemini-2.0-flash");
    let structed_value: Value = llmweb
        .exec(
            "https://x.com/ztgx5/status/1942242787317133452",
            schema_json,
        )
        .await
        .unwrap();
    println!("{:#?}", structed_value);
}

pub async fn exec_from_schema_str<R>(&self, url: &str, schema_str: &str) -> Result<R>

A convenience method that accepts a schema as a string slice.

This method is useful when loading a schema from a file. It parses the string into a serde_json::Value and then runs the same fetch-and-extract steps as exec.

Arguments
  • url - The URL of the web page to process.
  • schema_str - A string slice containing the JSON schema.

Errors

Returns an error if schema_str is not valid JSON, or if any of the underlying fetch, LLM, or deserialization steps fails.
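
The parse-then-delegate shape described above can be sketched without the crate. Everything here is a hypothetical stand-in: Schema replaces serde_json::Value, parse_schema replaces the JSON parsing (real code would call serde_json::from_str), and exec_sketch replaces the main method. The point is only that a bad schema string fails before any page work starts.

```rust
/// Hypothetical stand-in for a parsed schema; the real method produces a `serde_json::Value`.
#[derive(Debug, PartialEq)]
struct Schema {
    required: Vec<String>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidSchema(String),
    MissingField(String),
}

/// Stand-in parser: accepts a comma-separated list of field names (NOT JSON).
fn parse_schema(schema_str: &str) -> Result<Schema, SketchError> {
    let required: Vec<String> = schema_str
        .split(',')
        .map(|field| field.trim().to_owned())
        .collect();
    if required.iter().any(|field| field.is_empty()) {
        return Err(SketchError::InvalidSchema(format!(
            "empty field name in {schema_str:?}"
        )));
    }
    Ok(Schema { required })
}

/// Stand-in for the main method: checks that every required field occurs in the page text.
fn exec_sketch(page: &str, schema: Schema) -> Result<usize, SketchError> {
    for field in &schema.required {
        if !page.contains(field.as_str()) {
            return Err(SketchError::MissingField(field.clone()));
        }
    }
    Ok(schema.required.len())
}

/// Convenience wrapper: parse the string first, then delegate to the main method.
fn exec_from_schema_str_sketch(page: &str, schema_str: &str) -> Result<usize, SketchError> {
    let schema = parse_schema(schema_str)?;
    exec_sketch(page, schema)
}

fn main() {
    println!("{:?}", exec_from_schema_str_sketch("title: a, url: b", "title, url"));
    println!("{:?}", exec_from_schema_str_sketch("title: a", "title,,url"));
}
```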

Examples found in repository

examples/hn.rs
async fn main() {
    // Load the schema from an external file as a string.
    let schema_str = include_str!("../schemas/hn_schema.json");

    let llmweb = LlmWeb::new("gemini-2.0-flash");
    eprintln!("Fetching from Hacker News and extracting stories...");

    // Use the convenience method `exec_from_schema_str` which handles
    // parsing the schema string internally.
    let structed_value: Vec<Story> = llmweb
        .exec_from_schema_str("https://news.ycombinator.com", schema_str)
        .await
        .unwrap();
    println!("{:#?}", structed_value);
}

pub async fn stream<R>(&self, url: &str, scheme: Value) -> Result<R>

Fetches content from a URL, sends it to an LLM for processing based on a schema, and returns the structured data.

This function performs the following steps:

  1. Launches a headless browser.
  2. Navigates to the specified URL.
  3. Extracts the HTML content of the page.
  4. Sends the content and a JSON schema to the configured LLM.
  5. Parses the LLM’s JSON response into the specified Rust type R.

Unlike exec, this method is intended for streaming responses from the LLM. As the signature shows, it still returns a single Result<R>.

Arguments
  • url - The URL of the web page to process.
  • scheme - The JSON schema that describes the data to extract, as a serde_json::Value.

Errors

This function can return an LlmWebError if any of the steps fail, such as browser errors, network issues, LLM API errors, or JSON deserialization errors.

Examples found in repository

examples/v2ex_stream.rs
async fn main() {
    // Load the schema from an external file as a string.
    let schema_str = include_str!("../schemas/v2ex_schema.json");
    let schema: Value = serde_json::from_str(schema_str).unwrap();

    let structed_value: Vec<VXNA> = LlmWeb::new("gemini-2.0-flash")
        .stream("https://v2ex.com/go/vxna", schema)
        .await
        .unwrap();
    println!("{:#?}", structed_value);
}

Auto Trait Implementations

impl Freeze for LlmWeb

impl !RefUnwindSafe for LlmWeb

impl Send for LlmWeb

impl Sync for LlmWeb

impl Unpin for LlmWeb

impl !UnwindSafe for LlmWeb
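
Because LlmWeb is both Send and Sync, one instance can be shared across threads behind an Arc. The sketch below shows the pattern with a hypothetical stand-in type, Client, since the crate itself is not assumed to be available here; assert_send_sync and models_seen_by_threads are likewise illustrative names.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `LlmWeb`; only the model name is modeled.
struct Client {
    model: String,
}

/// Compiles only when `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}

/// Shares one client across several threads and collects what each thread read.
fn models_seen_by_threads(client: Arc<Client>, threads: usize) -> Vec<String> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own reference-counted handle to the same client.
            let client = Arc::clone(&client);
            thread::spawn(move || client.model.clone())
        })
        .collect();

    handles
        .into_iter()
        .map(|handle| handle.join().expect("worker thread panicked"))
        .collect()
}

fn main() {
    assert_send_sync::<Client>();
    let client = Arc::new(Client {
        model: "gemini-2.0-flash".to_owned(),
    });
    println!("{:?}", models_seen_by_threads(client, 3));
}
```

The two negative impls matter mainly for std::panic::catch_unwind: a closure that captures an LlmWeb, or a reference to one, has to be wrapped in std::panic::AssertUnwindSafe before it can be passed in.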

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.
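
A practical consequence of the reflexive From<T> for T impl and the Into blanket impl is that a function taking impl Into<X> accepts an X directly. The sketch below uses hypothetical stand-in types (Client for LlmWeb, plus ModelName and model_of), none of which come from the crate.

```rust
/// Hypothetical stand-in for `LlmWeb`.
#[derive(Debug, PartialEq)]
struct Client {
    model: String,
}

/// A second hypothetical type that can be converted into a `Client`.
struct ModelName(&'static str);

impl From<ModelName> for Client {
    fn from(name: ModelName) -> Self {
        Client {
            model: name.0.to_owned(),
        }
    }
}

/// Accepts anything convertible into a `Client`, including a `Client` itself
/// (through the reflexive `impl<T> From<T> for T`).
fn model_of(client: impl Into<Client>) -> String {
    client.into().model
}

fn main() {
    let ready = Client {
        model: "gemini-2.0-flash".to_owned(),
    };
    // Identity conversion: `Client` into `Client`.
    println!("{}", model_of(ready));
    // User-defined conversion: `ModelName` into `Client`.
    println!("{}", model_of(ModelName("gemini-1.5-flash")));
}
```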

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> ErasedDestructor for T
where T: 'static,