pub struct RemoteMultimodalConfig {
pub include_html: bool,
pub html_max_bytes: usize,
pub include_url: bool,
pub include_title: bool,
pub include_screenshot: Option<bool>,
pub temperature: f32,
pub max_tokens: u16,
pub request_json_object: bool,
pub best_effort_json_extract: bool,
pub reasoning_effort: Option<ReasoningEffort>,
pub max_skills_per_round: usize,
pub max_skill_context_chars: usize,
pub max_rounds: usize,
pub retry: RetryPolicy,
pub capture_profiles: Vec<CaptureProfile>,
pub model_policy: ModelPolicy,
pub post_plan_wait_ms: u64,
pub max_inflight_requests: Option<usize>,
pub extra_ai_data: bool,
pub extraction_prompt: Option<String>,
pub extraction_schema: Option<ExtractionSchema>,
pub screenshot: bool,
pub tool_calling_mode: ToolCallingMode,
pub html_diff_mode: HtmlDiffMode,
pub planning_mode: Option<PlanningModeConfig>,
pub synthesis_config: Option<SynthesisConfig>,
pub confidence_strategy: Option<ConfidenceRetryStrategy>,
pub self_healing: Option<SelfHealingConfig>,
pub concurrent_execution: bool,
pub relevance_gate: bool,
pub relevance_prompt: Option<String>,
pub url_prefilter: bool,
pub url_prefilter_batch_size: usize,
pub url_prefilter_max_tokens: u16,
}
Runtime configuration for RemoteMultimodalEngine.
This struct controls:
- what context is captured (URL/title/HTML),
- how chat completion is requested (temperature/max tokens/JSON mode),
- how long the engine loops and retries,
- capture/model selection policies.
The engine exposes this config to users, and it should be safe to merge with user-provided prompts.
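A typical setup chains the builder methods documented below. A minimal sketch (the crate's import path is omitted, and the chosen values are illustrative, not the crate's defaults):

```rust
// Sketch only: assumes RemoteMultimodalConfig and its builder methods
// (documented below) are in scope. Values are examples, not defaults.
let config = RemoteMultimodalConfig::new()
    .with_html(true)                // include cleaned HTML in model input
    .with_html_max_bytes(64 * 1024) // cap HTML context at 64 KiB
    .with_temperature(0.2)          // low temperature for deterministic plans
    .with_max_tokens(1024)          // budget for the generated plan
    .with_max_rounds(4);            // plan/execute/re-capture at most 4 times
```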
Fields
include_html: bool
Whether to include cleaned HTML in the model input.
html_max_bytes: usize
Maximum number of bytes of cleaned HTML to include (global default).
A CaptureProfile may override this with its own html_max_bytes.
include_url: bool
Whether to include the current URL in the model input.
include_title: bool
Whether to include the current document title in the model input.
include_screenshot: Option<bool>
Whether to include screenshots in the LLM request.
When None (default), the engine auto-detects based on the model name:
vision models (gpt-4o, claude-3, etc.) receive screenshots, while
text-only models do not.
Set to Some(true) to always include screenshots.
Set to Some(false) to never include screenshots.
temperature: f32
Sampling temperature used by the remote/local model.
max_tokens: u16
Maximum tokens the model is allowed to generate for the plan.
request_json_object: bool
If true, include response_format: {"type":"json_object"} in the request.
Some local servers ignore or reject this; disable it if you see 400 errors.
best_effort_json_extract: bool
Best-effort JSON extraction (strip fences / extract {...}).
reasoning_effort: Option<ReasoningEffort>
Optional explicit reasoning effort for supported models/endpoints.
When set, outbound requests include reasoning: {"effort":"low|medium|high"}.
Leave as None to avoid sending provider-specific reasoning controls.
max_skills_per_round: usize
Maximum number of skills to inject per round (default 3). Only the
highest-priority matching skills are included.
max_skill_context_chars: usize
Maximum characters for skill context injection per round (default 4000).
Prevents large skill collections from bloating the system prompt.
max_rounds: usize
Maximum number of plan/execute/re-capture rounds before giving up.
Each round is:
- capture state
- ask model for plan
- execute steps
- optionally wait
- re-capture and decide whether complete
retry: RetryPolicy
Retry policy for model output parsing failures and/or execution failures.
capture_profiles: Vec<CaptureProfile>
Capture profiles to try across attempts.
If empty, the engine should build a sensible default list.
model_policy: ModelPolicy
Model selection policy (small/medium/large).
The engine may choose a model size depending on constraints such as
latency limits, cost tier, and whether retries are escalating.
post_plan_wait_ms: u64
Optional wait after executing a plan before re-capturing state (ms).
Useful for pages that animate, load asynchronously, or perform
challenge transitions after clicks.
max_inflight_requests: Option<usize>
Maximum number of concurrent LLM HTTP requests for this engine instance.
If None, no throttling is applied.
extra_ai_data: bool
Enable extraction mode to return structured data from pages.
When enabled, the model is instructed to include an extracted field
in its JSON response containing data extracted from the page.
extraction_prompt: Option<String>
Optional custom extraction prompt appended to the system prompt.
Example: "Extract all product names and prices as a JSON array."
extraction_schema: Option<ExtractionSchema>
Optional JSON schema for structured extraction output.
When provided, the model is instructed to return the extracted field
conforming to this schema. This enables type-safe extraction.
screenshot: bool
Take a screenshot after automation completes and include it in results.
tool_calling_mode: ToolCallingMode
Tool calling mode for structured action output.
- JsonObject (default): use JSON object mode
- ToolCalling: use OpenAI-compatible tool/function calling
- Auto: auto-select based on model capabilities
html_diff_mode: HtmlDiffMode
HTML diff mode for condensed page state.
When enabled, sends only HTML changes after the first round, potentially
reducing tokens by 50-70%.
planning_mode: Option<PlanningModeConfig>
Planning mode configuration.
When enabled, allows the LLM to plan multiple steps upfront, reducing
round-trips. Set to None to disable.
synthesis_config: Option<SynthesisConfig>
Multi-page synthesis configuration.
When configured, enables analyzing multiple pages in a single LLM call.
Set to None to disable.
confidence_strategy: Option<ConfidenceRetryStrategy>
Confidence-based retry strategy.
When configured, uses confidence scores to make smarter retry decisions.
Set to None for default retry behavior.
self_healing: Option<SelfHealingConfig>
Self-healing configuration for automatic selector repair.
When enabled, failed selectors trigger an LLM call to diagnose and
suggest alternatives. Set to None to disable.
concurrent_execution: bool
Enable concurrent execution of independent actions.
When true, actions without dependencies can run in parallel using
tokio::JoinSet.
relevance_gate: bool
Enable relevance gating for crawled pages.
When enabled, the LLM returns "relevant": true|false indicating whether
the page is relevant to the crawl/extraction goals. Irrelevant pages can
have their budget refunded.
relevance_prompt: Option<String>
Optional custom relevance criteria prompt. When None, defaults to
judging against extraction_prompt or general context.
url_prefilter: bool
Enable URL-level pre-filtering before HTTP fetch.
When enabled alongside relevance_gate, URLs are classified by the text
model BEFORE fetching. Irrelevant URLs are skipped entirely.
url_prefilter_batch_size: usize
Batch size for URL classification calls (default 20).
url_prefilter_max_tokens: u16
Max tokens for URL classification response (default 200).
Implementations
impl RemoteMultimodalConfig
pub fn new() -> RemoteMultimodalConfig
Create a new config with default settings.
pub fn fast() -> RemoteMultimodalConfig
Create a config optimized for maximum speed and efficiency.
Enables all performance-positive features:
- ToolCallingMode::Auto for reliable action parsing
- HtmlDiffMode::Auto for 50-70% token reduction
- ConfidenceRetryStrategy for smarter retries
- concurrent_execution for parallel action execution
These features have zero or positive performance impact.
pub fn fast_with_planning() -> RemoteMultimodalConfig
Create a config optimized for maximum speed with planning enabled.
Includes all fast() features plus:
- PlanningModeConfig for multi-step planning (fewer round-trips)
- SelfHealingConfig for auto-repair of failed selectors
Best for complex multi-step automations.
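In practice the two presets can be used directly (sketch; assumes the type is in scope):

```rust
// Quick single-shot automations: all zero-cost performance features enabled.
let quick = RemoteMultimodalConfig::fast();

// Complex multi-step automations: adds upfront planning and selector self-healing.
let complex = RemoteMultimodalConfig::fast_with_planning();
```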
pub fn is_extraction_only(&self) -> bool
Returns true when the config is set up for pure data extraction
(extraction enabled, single round). Used to auto-detect extraction-only
mode and optimize prompts / screenshot handling.
pub fn with_html(self, include: bool) -> RemoteMultimodalConfig
Set whether to include HTML.
pub fn with_html_max_bytes(self, bytes: usize) -> RemoteMultimodalConfig
Set maximum HTML bytes.
pub fn with_temperature(self, temp: f32) -> RemoteMultimodalConfig
Set temperature.
pub fn with_max_tokens(self, tokens: u16) -> RemoteMultimodalConfig
Set max tokens.
pub fn with_reasoning_effort(self, effort: Option<ReasoningEffort>) -> RemoteMultimodalConfig
Set explicit reasoning effort for supported models/endpoints.
pub fn with_max_rounds(self, rounds: usize) -> RemoteMultimodalConfig
Set max rounds.
pub fn with_retry(self, retry: RetryPolicy) -> RemoteMultimodalConfig
Set retry policy.
pub fn with_model_policy(self, policy: ModelPolicy) -> RemoteMultimodalConfig
Set model policy.
pub fn with_extraction(self, enabled: bool) -> RemoteMultimodalConfig
Enable extraction mode.
pub fn with_extraction_prompt(self, prompt: impl Into<String>) -> RemoteMultimodalConfig
Set extraction prompt.
pub fn with_extraction_schema(self, schema: ExtractionSchema) -> RemoteMultimodalConfig
Set extraction schema.
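An extraction-only configuration can be assembled from these builders. A sketch (the prompt text is the example given in the field docs above):

```rust
// Extraction enabled with a single round; per is_extraction_only(),
// this combination is detected as extraction-only mode.
let config = RemoteMultimodalConfig::new()
    .with_extraction(true)
    .with_extraction_prompt("Extract all product names and prices as a JSON array.")
    .with_max_rounds(1);
// config.is_extraction_only() should now return true.
```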
pub fn with_screenshot(self, enabled: bool) -> RemoteMultimodalConfig
Enable/disable screenshots.
pub fn with_include_screenshot(self, include: Option<bool>) -> RemoteMultimodalConfig
Set whether to include screenshots in LLM requests.
- Some(true): always include screenshots
- Some(false): never include screenshots
- None: auto-detect based on model name (default)
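For example, to force text-only requests even when the model would be auto-detected as vision-capable (sketch):

```rust
// Never attach screenshots, regardless of model name.
let cfg = RemoteMultimodalConfig::new().with_include_screenshot(Some(false));

// Leaving the field as None (the default) keeps model-name auto-detection.
```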
pub fn add_capture_profile(&mut self, profile: CaptureProfile)
Add a capture profile.
pub fn with_tool_calling_mode(self, mode: ToolCallingMode) -> RemoteMultimodalConfig
Set tool calling mode.
pub fn with_html_diff_mode(self, mode: HtmlDiffMode) -> RemoteMultimodalConfig
Set HTML diff mode for condensed page state.
pub fn with_planning_mode(self, config: PlanningModeConfig) -> RemoteMultimodalConfig
Enable planning mode with configuration.
pub fn with_synthesis_config(self, config: SynthesisConfig) -> RemoteMultimodalConfig
Enable multi-page synthesis with configuration.
pub fn with_confidence_strategy(self, strategy: ConfidenceRetryStrategy) -> RemoteMultimodalConfig
Set confidence-based retry strategy.
pub fn with_self_healing(self, config: SelfHealingConfig) -> RemoteMultimodalConfig
Enable self-healing with configuration.
pub fn with_concurrent_execution(self, enabled: bool) -> RemoteMultimodalConfig
Enable/disable concurrent execution of independent actions.
pub fn with_relevance_gate(self, prompt: Option<String>) -> RemoteMultimodalConfig
Enable relevance gating with optional custom criteria prompt.
pub fn with_url_prefilter(self, batch_size: Option<usize>) -> RemoteMultimodalConfig
Enable URL-level pre-filtering before HTTP fetch.
Requires relevance_gate to also be enabled.
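Combining the two gates might look like this (sketch; the criteria string and batch size are illustrative):

```rust
// URL pre-filtering requires the relevance gate to be enabled as well.
let cfg = RemoteMultimodalConfig::new()
    .with_relevance_gate(Some("Pages about pricing or product specs".to_string()))
    .with_url_prefilter(Some(20)); // classify URLs in batches of 20 before fetching
```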
Trait Implementations
impl Clone for RemoteMultimodalConfig
fn clone(&self) -> RemoteMultimodalConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.