pub struct RemoteMultimodalConfig {
pub include_html: bool,
pub html_max_bytes: usize,
pub include_url: bool,
pub include_title: bool,
pub include_screenshot: Option<bool>,
pub temperature: f32,
pub max_tokens: u16,
pub request_json_object: bool,
pub best_effort_json_extract: bool,
pub reasoning_effort: Option<ReasoningEffort>,
pub max_skills_per_round: usize,
pub max_skill_context_chars: usize,
pub max_rounds: usize,
pub retry: RetryPolicy,
pub capture_profiles: Vec<CaptureProfile>,
pub model_policy: ModelPolicy,
pub post_plan_wait_ms: u64,
pub max_inflight_requests: Option<usize>,
pub extra_ai_data: bool,
pub extraction_prompt: Option<String>,
pub extraction_schema: Option<ExtractionSchema>,
pub screenshot: bool,
pub tool_calling_mode: ToolCallingMode,
pub html_diff_mode: HtmlDiffMode,
pub planning_mode: Option<PlanningModeConfig>,
pub synthesis_config: Option<SynthesisConfig>,
pub confidence_strategy: Option<ConfidenceRetryStrategy>,
pub self_healing: Option<SelfHealingConfig>,
pub concurrent_execution: bool,
pub relevance_gate: bool,
pub relevance_prompt: Option<String>,
pub url_prefilter: bool,
pub url_prefilter_batch_size: usize,
pub url_prefilter_max_tokens: u16,
}
Runtime configuration for RemoteMultimodalEngine.
This struct controls:
- what context is captured (URL/title/HTML),
- how chat completion is requested (temperature/max tokens/JSON mode),
- how long the engine loops and retries,
- capture/model selection policies.
The engine should be able to expose this config to users, and merging it with user-provided prompts should be safe.
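For example, a typical setup chains the builder methods documented below. This is a usage sketch only; it assumes RemoteMultimodalConfig implements Default, which is not shown in this listing:

// Sketch: assumes a Default impl; the builder methods are documented below.
let config = RemoteMultimodalConfig::default()
    .with_temperature(0.2)
    .with_max_tokens(1024)
    .with_max_rounds(4)
    .with_html_max_bytes(64 * 1024);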
Fields
include_html: bool
Whether to include cleaned HTML in the model input.
html_max_bytes: usize
Maximum number of bytes of cleaned HTML to include (global default).
A CaptureProfile may override this with its own html_max_bytes.
include_url: bool
Whether to include the current URL in the model input.
include_title: bool
Whether to include the current document title in the model input.
include_screenshot: Option<bool>
Whether to include screenshots in the LLM request.
When None (default), automatically detects based on model name.
Vision models (gpt-4o, claude-3, etc.) will receive screenshots,
while text-only models will not.
Set to Some(true) to always include screenshots.
Set to Some(false) to never include screenshots.
temperature: f32
Sampling temperature used by the remote/local model.
max_tokens: u16
Maximum tokens the model is allowed to generate for the plan.
request_json_object: bool
If true, include response_format: {"type":"json_object"} in the request.
Some local servers ignore or reject this; disable if you see 400 errors.
best_effort_json_extract: bool
Best-effort JSON extraction (strip fences / extract {...}).
reasoning_effort: Option<ReasoningEffort>
Optional explicit reasoning effort for supported models/endpoints.
When set, outbound requests include reasoning: {"effort":"low|medium|high"}.
Leave None to avoid sending provider-specific reasoning controls.
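Taken together, request_json_object and reasoning_effort shape the outbound request body. A rough illustration of what that JSON might look like; the engine's actual serialization is not shown in this listing, and serde_json is used here only for the sketch:

use serde_json::json;

// Illustrative request body only; field values are placeholders.
let body = json!({
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "..."}],
    // Sent when request_json_object is true:
    "response_format": {"type": "json_object"},
    // Sent when reasoning_effort is Some(ReasoningEffort::Low):
    "reasoning": {"effort": "low"}
});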
max_skills_per_round: usize
Maximum number of skills to inject per round (default 3). Only the highest-priority matching skills are included.
max_skill_context_chars: usize
Maximum characters for skill context injection per round (default 4000). Prevents large skill collections from bloating the system prompt.
max_rounds: usize
Maximum number of plan/execute/re-capture rounds before giving up.
Each round is:
- capture state
- ask model for plan
- execute steps
- optionally wait
- re-capture and decide whether complete
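A pseudocode sketch of that loop; capture_state, ask_for_plan, execute_steps, and is_complete are hypothetical stand-ins for the engine's internals:

use std::time::Duration;

// Pseudocode sketch of one engine run; the helper functions are hypothetical.
for _round in 0..config.max_rounds {
    let state = capture_state().await?;
    let plan = ask_for_plan(&state, &config).await?;
    execute_steps(&plan).await?;
    if config.post_plan_wait_ms > 0 {
        tokio::time::sleep(Duration::from_millis(config.post_plan_wait_ms)).await;
    }
    if is_complete(&capture_state().await?) {
        break;
    }
}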
retry: RetryPolicy
Retry policy for model output parsing failures and/or execution failures.
capture_profiles: Vec<CaptureProfile>
Capture profiles to try across attempts.
If empty, the engine should build a sensible default list.
model_policy: ModelPolicy
Model selection policy (small/medium/large).
The engine may choose a model size depending on constraints such as latency limits, cost tier, and whether retries are escalating.
post_plan_wait_ms: u64
Optional wait after executing a plan before re-capturing state (ms).
This is useful for pages that animate, load asynchronously, or perform challenge transitions after clicks.
max_inflight_requests: Option<usize>
Maximum number of concurrent LLM HTTP requests for this engine instance.
If None, no throttling is applied.
extra_ai_data: bool
Enable extraction mode to return structured data from pages.
When enabled, the model is instructed to include an extracted field
in its JSON response containing data extracted from the page.
extraction_prompt: Option<String>
Optional custom extraction prompt appended to the system prompt.
Example: “Extract all product names and prices as a JSON array.”
extraction_schema: Option<ExtractionSchema>
Optional JSON schema for structured extraction output.
When provided, the model is instructed to return the extracted field
conforming to this schema. This enables type-safe extraction.
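A sketch of an extraction-focused config, again assuming a Default impl; the ExtractionSchema constructor is omitted because its API is not shown in this listing:

// Sketch: extraction mode with a custom prompt, single round.
let config = RemoteMultimodalConfig::default()
    .with_extraction(true)
    .with_extraction_prompt("Extract all product names and prices as a JSON array.")
    .with_max_rounds(1);
// .with_extraction_schema(schema) can be added when a schema is available.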
screenshot: bool
Take a screenshot after automation completes and include it in results.
tool_calling_mode: ToolCallingMode
Tool calling mode for structured action output.
- JsonObject (default): Use JSON object mode
- ToolCalling: Use OpenAI-compatible tool/function calling
- Auto: Auto-select based on model capabilities
html_diff_mode: HtmlDiffMode
HTML diff mode for condensed page state.
When enabled, sends only HTML changes after the first round, potentially reducing tokens by 50-70%.
planning_mode: Option<PlanningModeConfig>
Planning mode configuration.
When enabled, allows the LLM to plan multiple steps upfront,
reducing round-trips. Set to None to disable.
synthesis_config: Option<SynthesisConfig>
Multi-page synthesis configuration.
When configured, enables analyzing multiple pages in a single
LLM call. Set to None to disable.
confidence_strategy: Option<ConfidenceRetryStrategy>
Confidence-based retry strategy.
When configured, uses confidence scores to make smarter retry
decisions. Set to None for default retry behavior.
self_healing: Option<SelfHealingConfig>
Self-healing configuration for automatic selector repair.
When enabled, failed selectors trigger an LLM call to diagnose
and suggest alternatives. Set to None to disable.
concurrent_execution: bool
Enable concurrent execution of independent actions.
When true, actions without dependencies can run in parallel
using tokio::JoinSet.
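The doc comment names tokio::JoinSet; a minimal sketch of that pattern, where actions and run_action are hypothetical:

use tokio::task::JoinSet;

// Spawn independent actions concurrently and await them all.
let mut set = JoinSet::new();
for action in actions {
    set.spawn(async move { run_action(action).await });
}
while let Some(joined) = set.join_next().await {
    joined??; // surface both task panics and action errors
}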
relevance_gate: bool
Enable relevance gating for crawled pages.
When enabled, the LLM returns "relevant": true|false indicating
whether the page is relevant to the crawl/extraction goals.
Irrelevant pages can have their budget refunded.
relevance_prompt: Option<String>
Optional custom relevance criteria prompt. When None, defaults to judging against extraction_prompt or general context.
url_prefilter: bool
Enable URL-level pre-filtering before HTTP fetch.
When enabled alongside relevance_gate, URLs are classified by the
text model BEFORE fetching. Irrelevant URLs are skipped entirely.
url_prefilter_batch_size: usize
Batch size for URL classification calls (default 20).
url_prefilter_max_tokens: u16
Max tokens for URL classification response (default 200).
Implementations
impl RemoteMultimodalConfig
pub fn fast() -> Self
Create a config optimized for maximum speed and efficiency.
Enables all performance-positive features:
- ToolCallingMode::Auto for reliable action parsing
- HtmlDiffMode::Auto for 50-70% token reduction
- ConfidenceRetryStrategy for smarter retries
- concurrent_execution for parallel action execution
These features have zero or positive performance impact.
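Typical usage, assuming the preset composes with the builder methods below:

// Start from the speed-optimized preset, then override as needed.
let config = RemoteMultimodalConfig::fast()
    .with_max_rounds(3)
    .with_temperature(0.0);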
pub fn fast_with_planning() -> Self
Create a config optimized for maximum speed with planning enabled.
Includes all fast() features plus:
- PlanningModeConfig for multi-step planning (fewer round-trips)
- SelfHealingConfig for auto-repair of failed selectors
Best for complex multi-step automations.
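For example:

// Planning preset for multi-step flows (e.g. log in, navigate, extract).
let config = RemoteMultimodalConfig::fast_with_planning();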
pub fn is_extraction_only(&self) -> bool
Returns true when the config is set up for pure data extraction
(extraction enabled, single round). Used to auto-detect extraction-only
mode and optimize prompts / screenshot handling.
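A sketch of a config this method would report as extraction-only, per the criteria above (a Default impl is assumed):

let config = RemoteMultimodalConfig::default()
    .with_extraction(true)
    .with_max_rounds(1);
assert!(config.is_extraction_only());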
pub fn with_html_max_bytes(self, bytes: usize) -> Self
Set maximum HTML bytes.
pub fn with_temperature(self, temp: f32) -> Self
Set temperature.
pub fn with_max_tokens(self, tokens: u16) -> Self
Set max tokens.
pub fn with_reasoning_effort(self, effort: Option<ReasoningEffort>) -> Self
Set explicit reasoning effort for supported models/endpoints.
pub fn with_max_rounds(self, rounds: usize) -> Self
Set max rounds.
pub fn with_retry(self, retry: RetryPolicy) -> Self
Set retry policy.
pub fn with_model_policy(self, policy: ModelPolicy) -> Self
Set model policy.
pub fn with_extraction(self, enabled: bool) -> Self
Enable extraction mode.
pub fn with_extraction_prompt(self, prompt: impl Into<String>) -> Self
Set extraction prompt.
pub fn with_extraction_schema(self, schema: ExtractionSchema) -> Self
Set extraction schema.
pub fn with_screenshot(self, enabled: bool) -> Self
Enable/disable screenshots.
pub fn with_include_screenshot(self, include: Option<bool>) -> Self
Set whether to include screenshots in LLM requests.
- Some(true): Always include screenshots
- Some(false): Never include screenshots
- None: Auto-detect based on model name (default)
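For example, to force text-only requests even against a vision-capable model:

let config = RemoteMultimodalConfig::default() // Default impl assumed
    .with_include_screenshot(Some(false));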
pub fn add_capture_profile(&mut self, profile: CaptureProfile)
Add a capture profile.
pub fn with_tool_calling_mode(self, mode: ToolCallingMode) -> Self
Set tool calling mode.
pub fn with_html_diff_mode(self, mode: HtmlDiffMode) -> Self
Set HTML diff mode for condensed page state.
pub fn with_planning_mode(self, config: PlanningModeConfig) -> Self
Enable planning mode with configuration.
pub fn with_synthesis_config(self, config: SynthesisConfig) -> Self
Enable multi-page synthesis with configuration.
pub fn with_confidence_strategy(self, strategy: ConfidenceRetryStrategy) -> Self
Set confidence-based retry strategy.
pub fn with_self_healing(self, config: SelfHealingConfig) -> Self
Enable self-healing with configuration.
pub fn with_concurrent_execution(self, enabled: bool) -> Self
Enable/disable concurrent execution of independent actions.
pub fn with_relevance_gate(self, prompt: Option<String>) -> Self
Enable relevance gating with optional custom criteria prompt.
pub fn with_url_prefilter(self, batch_size: Option<usize>) -> Self
Enable URL-level pre-filtering before HTTP fetch.
Requires relevance_gate to also be enabled.
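A sketch combining the two; treating None as "keep the default batch size" is an assumption:

let config = RemoteMultimodalConfig::default() // Default impl assumed
    .with_relevance_gate(Some("Only pages about pricing.".to_string()))
    .with_url_prefilter(Some(20)); // batch size; None presumably keeps the default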
Trait Implementations
impl Clone for RemoteMultimodalConfig
fn clone(&self) -> RemoteMultimodalConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.