Struct RemoteMultimodalConfig 

pub struct RemoteMultimodalConfig {
    pub include_html: bool,
    pub html_max_bytes: usize,
    pub include_url: bool,
    pub include_title: bool,
    pub include_screenshot: Option<bool>,
    pub temperature: f32,
    pub max_tokens: u16,
    pub request_json_object: bool,
    pub best_effort_json_extract: bool,
    pub reasoning_effort: Option<ReasoningEffort>,
    pub max_skills_per_round: usize,
    pub max_skill_context_chars: usize,
    pub max_rounds: usize,
    pub retry: RetryPolicy,
    pub capture_profiles: Vec<CaptureProfile>,
    pub model_policy: ModelPolicy,
    pub post_plan_wait_ms: u64,
    pub max_inflight_requests: Option<usize>,
    pub extra_ai_data: bool,
    pub extraction_prompt: Option<String>,
    pub extraction_schema: Option<ExtractionSchema>,
    pub screenshot: bool,
    pub tool_calling_mode: ToolCallingMode,
    pub html_diff_mode: HtmlDiffMode,
    pub planning_mode: Option<PlanningModeConfig>,
    pub synthesis_config: Option<SynthesisConfig>,
    pub confidence_strategy: Option<ConfidenceRetryStrategy>,
    pub self_healing: Option<SelfHealingConfig>,
    pub concurrent_execution: bool,
    pub relevance_gate: bool,
    pub relevance_prompt: Option<String>,
    pub url_prefilter: bool,
    pub url_prefilter_batch_size: usize,
    pub url_prefilter_max_tokens: u16,
}

Runtime configuration for RemoteMultimodalEngine.

This struct controls:

  1. what context is captured (URL/title/HTML),
  2. how chat completion is requested (temperature/max tokens/JSON mode),
  3. how long the engine loops and retries,
  4. capture/model selection policies.

The engine should be able to export this config to users, and it should be safe to merge with user-provided prompts.

Fields§

§include_html: bool

Whether to include cleaned HTML in the model input.

§html_max_bytes: usize

Maximum number of bytes of cleaned HTML to include (global default).

A CaptureProfile may override this with its own html_max_bytes.
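As a rough illustration of how a byte limit like this could be applied, the sketch below truncates HTML on a UTF-8 character boundary and resolves a per-profile override. Both function names are hypothetical, and modelling the profile override as Option<usize> is an assumption; the engine's actual truncation logic is not shown in this reference.

```rust
/// Returns the longest prefix of `html` that fits in `max_bytes`
/// without splitting a multi-byte UTF-8 character.
/// (Hypothetical helper, not part of the documented API.)
pub fn truncate_html(html: &str, max_bytes: usize) -> &str {
    if html.len() <= max_bytes {
        return html;
    }
    // Walk back from the byte limit to the nearest char boundary.
    // Index 0 is always a boundary, so the loop terminates.
    let mut end = max_bytes;
    while !html.is_char_boundary(end) {
        end -= 1;
    }
    &html[..end]
}

/// Picks the per-profile limit when present, else the global default.
/// Assumption: the profile-level override is optional.
pub fn effective_html_limit(global: usize, profile_override: Option<usize>) -> usize {
    profile_override.unwrap_or(global)
}
```

Slicing a &str at an arbitrary byte offset panics if the offset falls inside a character, which is why the sketch backs up to a boundary first.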

§include_url: bool

Whether to include the current URL in the model input.

§include_title: bool

Whether to include the current document title in the model input.

§include_screenshot: Option<bool>

Whether to include screenshots in the LLM request.

When None (the default), the engine auto-detects based on the model name: vision models (gpt-4o, claude-3, etc.) receive screenshots, while text-only models do not.

Set to Some(true) to always include screenshots. Set to Some(false) to never include screenshots.
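The three states can be read as "explicit setting wins, otherwise guess from the model name". A minimal sketch of that resolution follows; the function names and the name-matching heuristic are assumptions for illustration, and the engine's real detection rules may differ.

```rust
/// Hypothetical heuristic: the name fragments below are illustrative only
/// and are not taken from the engine's source.
fn looks_like_vision_model(model: &str) -> bool {
    let name = model.to_ascii_lowercase();
    ["gpt-4o", "claude-3"].iter().any(|frag| name.contains(*frag))
}

/// Resolves the tri-state `include_screenshot` setting for a given model.
pub fn should_send_screenshot(include_screenshot: Option<bool>, model: &str) -> bool {
    match include_screenshot {
        // An explicit Some(true) / Some(false) always wins.
        Some(explicit) => explicit,
        // None falls back to auto-detection from the model name.
        None => looks_like_vision_model(model),
    }
}
```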

§temperature: f32

Sampling temperature used by the remote/local model.

§max_tokens: u16

Maximum tokens the model is allowed to generate for the plan.

§request_json_object: bool

If true, include response_format: {"type":"json_object"} in the request.

Some local servers ignore or reject this; disable if you see 400 errors.

§best_effort_json_extract: bool

If true, apply best-effort JSON extraction to the model output (strip code fences, extract the {...} object).
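One simple way to approximate this kind of extraction is to take everything from the first `{` to the last `}`, which also drops surrounding Markdown fences. The helper below is a hypothetical sketch of that idea, not the engine's implementation, and it does not validate that the result is well-formed JSON.

```rust
/// Returns the substring from the first '{' to the last '}' (inclusive),
/// or None when no such span exists.
/// (Hypothetical helper; the result still needs to be parsed and validated.)
pub fn extract_json_object(raw: &str) -> Option<&str> {
    let start = raw.find('{')?;
    let end = raw.rfind('}')?;
    if end < start {
        return None;
    }
    // '}' is a single byte, so the inclusive range ends on a char boundary.
    Some(&raw[start..=end])
}
```

The caller would still hand the returned slice to a real JSON parser; this step only removes the wrapping that models tend to add around the object.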

§reasoning_effort: Option<ReasoningEffort>

Optional explicit reasoning effort for supported models/endpoints.

When set, outbound requests include reasoning: {"effort":"low|medium|high"}. Leave None to avoid sending provider-specific reasoning controls.

§max_skills_per_round: usize

Maximum number of skills to inject per round (default 3). Only the highest-priority matching skills are included.

§max_skill_context_chars: usize

Maximum characters for skill context injection per round (default 4000). Prevents large skill collections from bloating the system prompt.

§max_rounds: usize

Maximum number of plan/execute/re-capture rounds before giving up.

Each round is:

  1. capture state
  2. ask model for plan
  3. execute steps
  4. optionally wait
  5. re-capture and decide whether complete
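
The round structure above can be sketched as a bounded loop. In this hypothetical outline, the closure stands in for one full capture/plan/execute/re-capture cycle and returns true when the task is judged complete; Outcome and run_rounds are illustrative names, not part of the engine's API.

```rust
/// Result of running the bounded loop. (Illustrative type.)
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Completed { rounds_used: usize },
    GaveUp,
}

/// Runs up to `max_rounds` rounds. `round` receives the zero-based round
/// index and returns true once the task is complete.
pub fn run_rounds<F>(max_rounds: usize, mut round: F) -> Outcome
where
    F: FnMut(usize) -> bool,
{
    for index in 0..max_rounds {
        if round(index) {
            return Outcome::Completed { rounds_used: index + 1 };
        }
    }
    // Budget exhausted without the round reporting completion.
    Outcome::GaveUp
}
```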

§retry: RetryPolicy

Retry policy for model output parsing failures and/or execution failures.

§capture_profiles: Vec<CaptureProfile>

Capture profiles to try across attempts.

If empty, the engine should build a sensible default list.

§model_policy: ModelPolicy

Model selection policy (small/medium/large).

The engine may choose a model size depending on constraints such as latency limits, cost tier, and whether retries are escalating.

§post_plan_wait_ms: u64

Time to wait, in milliseconds, after executing a plan before re-capturing state.

This is useful for pages that animate, load asynchronously, or perform challenge transitions after clicks.

§max_inflight_requests: Option<usize>

Maximum number of concurrent LLM HTTP requests for this engine instance. If None, no throttling is applied.

§extra_ai_data: bool

Enable extraction mode to return structured data from pages.

When enabled, the model is instructed to include an extracted field in its JSON response containing data extracted from the page.

§extraction_prompt: Option<String>

Optional custom extraction prompt appended to the system prompt.

Example: “Extract all product names and prices as a JSON array.”

§extraction_schema: Option<ExtractionSchema>

Optional JSON schema for structured extraction output.

When provided, the model is instructed to return the extracted field conforming to this schema. This enables type-safe extraction.

§screenshot: bool

Take a screenshot after automation completes and include it in results.

§tool_calling_mode: ToolCallingMode

Tool calling mode for structured action output.

  • JsonObject (default): Use JSON object mode
  • ToolCalling: Use OpenAI-compatible tool/function calling
  • Auto: Auto-select based on model capabilities

§html_diff_mode: HtmlDiffMode

HTML diff mode for condensed page state.

When enabled, sends only HTML changes after the first round, potentially reducing tokens by 50-70%.
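To give a feel for why sending only changes saves tokens, here is a deliberately naive line-level delta: it keeps the lines of the current HTML that did not appear in the previous capture. This is a stand-in technique chosen for brevity, with a hypothetical function name; the diff algorithm the engine actually uses is not described in this reference.

```rust
use std::collections::HashSet;

/// Returns the lines of `current` that were not present in `previous`.
/// (Naive illustration: ignores removed lines, ordering, and duplicates.)
pub fn added_lines<'a>(previous: &str, current: &'a str) -> Vec<&'a str> {
    let seen: HashSet<&str> = previous.lines().collect();
    current.lines().filter(|line| !seen.contains(line)).collect()
}
```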

§planning_mode: Option<PlanningModeConfig>

Planning mode configuration.

When enabled, allows the LLM to plan multiple steps upfront, reducing round-trips. Set to None to disable.

§synthesis_config: Option<SynthesisConfig>

Multi-page synthesis configuration.

When configured, enables analyzing multiple pages in a single LLM call. Set to None to disable.

§confidence_strategy: Option<ConfidenceRetryStrategy>

Confidence-based retry strategy.

When configured, uses confidence scores to make smarter retry decisions. Set to None for default retry behavior.

§self_healing: Option<SelfHealingConfig>

Self-healing configuration for automatic selector repair.

When enabled, failed selectors trigger an LLM call to diagnose and suggest alternatives. Set to None to disable.

§concurrent_execution: bool

Enable concurrent execution of independent actions.

When true, actions without dependencies can run in parallel using tokio::JoinSet.

§relevance_gate: bool

Enable relevance gating for crawled pages. When enabled, the LLM returns "relevant": true|false indicating whether the page is relevant to the crawl/extraction goals. Irrelevant pages can have their budget refunded.

§relevance_prompt: Option<String>

Optional custom relevance criteria prompt. When None, defaults to judging against extraction_prompt or general context.

§url_prefilter: bool

Enable URL-level pre-filtering before HTTP fetch. When enabled alongside relevance_gate, URLs are classified by the text model BEFORE fetching. Irrelevant URLs are skipped entirely.

§url_prefilter_batch_size: usize

Batch size for URL classification calls (default 20).
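Batching here presumably means splitting the candidate URLs into groups of at most url_prefilter_batch_size and issuing one classification call per group. A minimal sketch of the grouping step, with a hypothetical helper name:

```rust
/// Splits `urls` into consecutive batches of at most `batch_size` items.
/// A batch size of 0 is clamped to 1, because `chunks(0)` panics.
pub fn url_batches<T>(urls: &[T], batch_size: usize) -> Vec<&[T]> {
    urls.chunks(batch_size.max(1)).collect()
}
```

With the default batch size of 20, 45 URLs would be grouped into batches of 20, 20, and 5.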

§url_prefilter_max_tokens: u16

Max tokens for URL classification response (default 200).

Implementations§

impl RemoteMultimodalConfig

pub fn new() -> RemoteMultimodalConfig

Create a new config with default settings.

pub fn fast() -> RemoteMultimodalConfig

Create a config optimized for maximum speed and efficiency.

Enables all performance-positive features:

  • ToolCallingMode::Auto for reliable action parsing
  • HtmlDiffMode::Auto for 50-70% token reduction
  • ConfidenceRetryStrategy for smarter retries
  • concurrent_execution for parallel action execution

These features have zero or positive performance impact.

pub fn fast_with_planning() -> RemoteMultimodalConfig

Create a config optimized for maximum speed with planning enabled.

Includes all fast() features plus:

  • PlanningModeConfig for multi-step planning (fewer round-trips)
  • SelfHealingConfig for auto-repair of failed selectors

Best for complex multi-step automations.

pub fn is_extraction_only(&self) -> bool

Returns true when the config is set up for pure data extraction (extraction enabled, single round). Used to auto-detect extraction-only mode and optimize prompts / screenshot handling.

pub fn with_html(self, include: bool) -> RemoteMultimodalConfig

Set whether to include HTML.

pub fn with_html_max_bytes(self, bytes: usize) -> RemoteMultimodalConfig

Set maximum HTML bytes.

pub fn with_temperature(self, temp: f32) -> RemoteMultimodalConfig

Set temperature.

pub fn with_max_tokens(self, tokens: u16) -> RemoteMultimodalConfig

Set max tokens.

pub fn with_reasoning_effort(self, effort: Option<ReasoningEffort>) -> RemoteMultimodalConfig

Set explicit reasoning effort for supported models/endpoints.

pub fn with_max_rounds(self, rounds: usize) -> RemoteMultimodalConfig

Set max rounds.

pub fn with_retry(self, retry: RetryPolicy) -> RemoteMultimodalConfig

Set retry policy.

pub fn with_model_policy(self, policy: ModelPolicy) -> RemoteMultimodalConfig

Set model policy.

pub fn with_extraction(self, enabled: bool) -> RemoteMultimodalConfig

Enable extraction mode.

pub fn with_extraction_prompt(self, prompt: impl Into<String>) -> RemoteMultimodalConfig

Set extraction prompt.

pub fn with_extraction_schema(self, schema: ExtractionSchema) -> RemoteMultimodalConfig

Set extraction schema.

pub fn with_screenshot(self, enabled: bool) -> RemoteMultimodalConfig

Enable/disable screenshots.

pub fn with_include_screenshot(self, include: Option<bool>) -> RemoteMultimodalConfig

Set whether to include screenshots in LLM requests.

  • Some(true): Always include screenshots
  • Some(false): Never include screenshots
  • None: Auto-detect based on model name (default)

pub fn add_capture_profile(&mut self, profile: CaptureProfile)

Add a capture profile.
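Note that, unlike the with_* setters, which take self by value and return the config so they can be chained, add_capture_profile takes &mut self and returns nothing, so it needs a mutable binding. The sketch below shows the difference using a hypothetical stand-in type, SketchConfig, which mirrors only three fields and uses a String in place of CaptureProfile; it is not the real struct.

```rust
/// Hypothetical stand-in for RemoteMultimodalConfig (three fields only).
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SketchConfig {
    pub include_html: bool,
    pub max_rounds: usize,
    /// Stand-in for Vec<CaptureProfile>.
    pub capture_profiles: Vec<String>,
}

impl SketchConfig {
    /// Consuming setter: takes `self`, returns the updated value.
    pub fn with_html(mut self, include: bool) -> Self {
        self.include_html = include;
        self
    }

    /// Consuming setter: takes `self`, returns the updated value.
    pub fn with_max_rounds(mut self, rounds: usize) -> Self {
        self.max_rounds = rounds;
        self
    }

    /// In-place mutator: borrows mutably and returns nothing.
    pub fn add_capture_profile(&mut self, profile: String) {
        self.capture_profiles.push(profile);
    }
}

/// Chains the consuming setters first, then mutates in place.
pub fn build_sketch() -> SketchConfig {
    let mut cfg = SketchConfig::default().with_html(true).with_max_rounds(3);
    cfg.add_capture_profile("full-page".to_string());
    cfg
}
```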

pub fn with_tool_calling_mode(self, mode: ToolCallingMode) -> RemoteMultimodalConfig

Set tool calling mode.

pub fn with_html_diff_mode(self, mode: HtmlDiffMode) -> RemoteMultimodalConfig

Set HTML diff mode for condensed page state.

pub fn with_planning_mode(self, config: PlanningModeConfig) -> RemoteMultimodalConfig

Enable planning mode with configuration.

pub fn with_synthesis_config(self, config: SynthesisConfig) -> RemoteMultimodalConfig

Enable multi-page synthesis with configuration.

pub fn with_confidence_strategy(self, strategy: ConfidenceRetryStrategy) -> RemoteMultimodalConfig

Set confidence-based retry strategy.

pub fn with_self_healing(self, config: SelfHealingConfig) -> RemoteMultimodalConfig

Enable self-healing with configuration.

pub fn with_concurrent_execution(self, enabled: bool) -> RemoteMultimodalConfig

Enable/disable concurrent execution of independent actions.

pub fn with_relevance_gate(self, prompt: Option<String>) -> RemoteMultimodalConfig

Enable relevance gating with optional custom criteria prompt.

pub fn with_url_prefilter(self, batch_size: Option<usize>) -> RemoteMultimodalConfig

Enable URL-level pre-filtering before HTTP fetch. Requires relevance_gate to also be enabled.

Trait Implementations§

impl Clone for RemoteMultimodalConfig

fn clone(&self) -> RemoteMultimodalConfig

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for RemoteMultimodalConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl Default for RemoteMultimodalConfig

fn default() -> RemoteMultimodalConfig

Returns the “default value” for a type.

impl<'de> Deserialize<'de> for RemoteMultimodalConfig

fn deserialize<__D>(__deserializer: __D) -> Result<RemoteMultimodalConfig, <__D as Deserializer<'de>>::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl PartialEq for RemoteMultimodalConfig

fn eq(&self, other: &RemoteMultimodalConfig) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl Serialize for RemoteMultimodalConfig

fn serialize<__S>(&self, __serializer: __S) -> Result<<__S as Serializer>::Ok, <__S as Serializer>::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

impl StructuralPartialEq for RemoteMultimodalConfig

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,