pub struct RemoteMultimodalConfig {
pub include_html: bool,
pub html_max_bytes: usize,
pub include_url: bool,
pub include_title: bool,
pub include_screenshot: Option<bool>,
pub temperature: f32,
pub max_tokens: u16,
pub request_json_object: bool,
pub best_effort_json_extract: bool,
pub reasoning_effort: Option<ReasoningEffort>,
pub max_skills_per_round: usize,
pub max_skill_context_chars: usize,
pub max_rounds: usize,
pub retry: RetryPolicy,
pub capture_profiles: Vec<CaptureProfile>,
pub model_policy: ModelPolicy,
pub post_plan_wait_ms: u64,
pub max_inflight_requests: Option<usize>,
pub extra_ai_data: bool,
pub extraction_prompt: Option<String>,
pub extraction_schema: Option<ExtractionSchema>,
pub screenshot: bool,
pub tool_calling_mode: ToolCallingMode,
pub html_diff_mode: HtmlDiffMode,
pub planning_mode: Option<PlanningModeConfig>,
pub synthesis_config: Option<SynthesisConfig>,
pub confidence_strategy: Option<ConfidenceRetryStrategy>,
pub self_healing: Option<SelfHealingConfig>,
pub concurrent_execution: bool,
pub relevance_gate: bool,
pub relevance_prompt: Option<String>,
pub url_prefilter: bool,
pub url_prefilter_batch_size: usize,
pub url_prefilter_max_tokens: u16,
}
Runtime configuration for RemoteMultimodalEngine.
This struct controls:
- what context is captured (URL/title/HTML),
- how chat completion is requested (temperature/max tokens/JSON mode),
- how long the engine loops and retries,
- capture/model selection policies.
The engine should be able to expose this config to users, and merging it with user-provided prompts should be safe.
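For example, a typical setup chains the builder methods documented below. This is a usage sketch only; it assumes RemoteMultimodalConfig implements Default, which is not shown in this listing:

// Sketch: assumes a Default impl; the builder methods are documented below.
let config = RemoteMultimodalConfig::default()
    .with_temperature(0.2)
    .with_max_tokens(1024)
    .with_max_rounds(4)
    .with_html_max_bytes(64 * 1024);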
Fields
include_html: bool
Whether to include cleaned HTML in the model input.
html_max_bytes: usize
Maximum number of bytes of cleaned HTML to include (global default).
A CaptureProfile may override this with its own html_max_bytes.
include_url: bool
Whether to include the current URL in the model input.
include_title: bool
Whether to include the current document title in the model input.
include_screenshot: Option<bool>
Whether to include screenshots in the LLM request.
When None (default), automatically detects based on model name.
Vision models (gpt-4o, claude-3, etc.) will receive screenshots,
while text-only models will not.
Set to Some(true) to always include screenshots.
Set to Some(false) to never include screenshots.
temperature: f32
Sampling temperature used by the remote/local model.
max_tokens: u16
Maximum tokens the model is allowed to generate for the plan.
request_json_object: bool
If true, include response_format: {"type":"json_object"} in the request.
Some local servers ignore or reject this; disable if you see 400 errors.
best_effort_json_extract: bool
Best-effort JSON extraction (strip fences / extract {...}).
reasoning_effort: Option<ReasoningEffort>
Optional explicit reasoning effort for supported models/endpoints.
When set, outbound requests include reasoning: {"effort":"low|medium|high"}.
Leave None to avoid sending provider-specific reasoning controls.
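Taken together, request_json_object and reasoning_effort shape the outbound request body. A rough illustration of what that JSON might look like; the engine's actual serialization is not shown in this listing, and serde_json is used here only for the sketch:

use serde_json::json;

// Illustrative request body only; field values are placeholders.
let body = json!({
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "..."}],
    // Sent when request_json_object is true:
    "response_format": {"type": "json_object"},
    // Sent when reasoning_effort is Some(ReasoningEffort::Low):
    "reasoning": {"effort": "low"}
});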
max_skills_per_round: usize
Maximum number of skills to inject per round (default 3). Only the highest-priority matching skills are included.
max_skill_context_chars: usize
Maximum characters for skill context injection per round (default 4000). Prevents large skill collections from bloating the system prompt.
max_rounds: usize
Maximum number of plan/execute/re-capture rounds before giving up.
Each round is:
- capture state
- ask model for plan
- execute steps
- optionally wait
- re-capture and decide whether complete
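A pseudocode sketch of that loop; capture_state, ask_for_plan, execute_steps, and is_complete are hypothetical stand-ins for the engine's internals:

use std::time::Duration;

// Pseudocode sketch of one engine run; the helper functions are hypothetical.
for _round in 0..config.max_rounds {
    let state = capture_state().await?;
    let plan = ask_for_plan(&state, &config).await?;
    execute_steps(&plan).await?;
    if config.post_plan_wait_ms > 0 {
        tokio::time::sleep(Duration::from_millis(config.post_plan_wait_ms)).await;
    }
    if is_complete(&capture_state().await?) {
        break;
    }
}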
retry: RetryPolicy
Retry policy for model output parsing failures and/or execution failures.
capture_profiles: Vec<CaptureProfile>
Capture profiles to try across attempts.
If empty, the engine should build a sensible default list.
model_policy: ModelPolicy
Model selection policy (small/medium/large).
The engine may choose a model size depending on constraints such as latency limits, cost tier, and whether retries are escalating.
post_plan_wait_ms: u64
Optional wait after executing a plan before re-capturing state (ms).
This is useful for pages that animate, load asynchronously, or perform challenge transitions after clicks.
max_inflight_requests: Option<usize>
Maximum number of concurrent LLM HTTP requests for this engine instance.
If None, no throttling is applied.
extra_ai_data: bool
Enable extraction mode to return structured data from pages.
When enabled, the model is instructed to include an extracted field
in its JSON response containing data extracted from the page.
extraction_prompt: Option<String>
Optional custom extraction prompt appended to the system prompt.
Example: “Extract all product names and prices as a JSON array.”
extraction_schema: Option<ExtractionSchema>
Optional JSON schema for structured extraction output.
When provided, the model is instructed to return the extracted field
conforming to this schema. This enables type-safe extraction.
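A sketch of an extraction-focused config, again assuming a Default impl; the ExtractionSchema constructor is omitted because its API is not shown in this listing:

// Sketch: extraction mode with a custom prompt, single round.
let config = RemoteMultimodalConfig::default()
    .with_extraction(true)
    .with_extraction_prompt("Extract all product names and prices as a JSON array.")
    .with_max_rounds(1);
// .with_extraction_schema(schema) can be added when a schema is available.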
screenshot: bool
Take a screenshot after automation completes and include it in results.
tool_calling_mode: ToolCallingMode
Tool calling mode for structured action output.
- JsonObject (default): Use JSON object mode
- ToolCalling: Use OpenAI-compatible tool/function calling
- Auto: Auto-select based on model capabilities
html_diff_mode: HtmlDiffMode
HTML diff mode for condensed page state.
When enabled, sends only HTML changes after the first round, potentially reducing tokens by 50-70%.
planning_mode: Option<PlanningModeConfig>
Planning mode configuration.
When enabled, allows the LLM to plan multiple steps upfront,
reducing round-trips. Set to None to disable.
synthesis_config: Option<SynthesisConfig>
Multi-page synthesis configuration.
When configured, enables analyzing multiple pages in a single
LLM call. Set to None to disable.
confidence_strategy: Option<ConfidenceRetryStrategy>
Confidence-based retry strategy.
When configured, uses confidence scores to make smarter retry
decisions. Set to None for default retry behavior.
self_healing: Option<SelfHealingConfig>
Self-healing configuration for automatic selector repair.
When enabled, failed selectors trigger an LLM call to diagnose
and suggest alternatives. Set to None to disable.
concurrent_execution: bool
Enable concurrent execution of independent actions.
When true, actions without dependencies can run in parallel
using tokio::JoinSet.
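The doc comment names tokio::JoinSet; a minimal sketch of that pattern, where actions and run_action are hypothetical:

use tokio::task::JoinSet;

// Spawn independent actions concurrently and await them all.
let mut set = JoinSet::new();
for action in actions {
    set.spawn(async move { run_action(action).await });
}
while let Some(joined) = set.join_next().await {
    joined??; // surface both task panics and action errors
}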
relevance_gate: bool
Enable relevance gating for crawled pages.
When enabled, the LLM returns "relevant": true|false indicating
whether the page is relevant to the crawl/extraction goals.
Irrelevant pages can have their budget refunded.
relevance_prompt: Option<String>
Optional custom relevance criteria prompt. When None, defaults to judging against extraction_prompt or general context.
url_prefilter: bool
Enable URL-level pre-filtering before HTTP fetch.
When enabled alongside relevance_gate, URLs are classified by the
text model BEFORE fetching. Irrelevant URLs are skipped entirely.
url_prefilter_batch_size: usize
Batch size for URL classification calls (default 20).
url_prefilter_max_tokens: u16
Max tokens for URL classification response (default 200).
Implementations
impl RemoteMultimodalConfig
pub fn fast() -> Self
Create a config optimized for maximum speed and efficiency.
Enables all performance-positive features:
- ToolCallingMode::Auto for reliable action parsing
- HtmlDiffMode::Auto for 50-70% token reduction
- ConfidenceRetryStrategy for smarter retries
- concurrent_execution for parallel action execution
These features have zero or positive performance impact.
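Typical usage, assuming the preset composes with the builder methods below:

// Start from the speed-optimized preset, then override as needed.
let config = RemoteMultimodalConfig::fast()
    .with_max_rounds(3)
    .with_temperature(0.0);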
pub fn fast_with_planning() -> Self
Create a config optimized for maximum speed with planning enabled.
Includes all fast() features plus:
- PlanningModeConfig for multi-step planning (fewer round-trips)
- SelfHealingConfig for auto-repair of failed selectors
Best for complex multi-step automations.
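For example:

// Planning preset for multi-step flows (e.g. log in, navigate, extract).
let config = RemoteMultimodalConfig::fast_with_planning();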
pub fn is_extraction_only(&self) -> bool
Returns true when the config is set up for pure data extraction
(extraction enabled, single round). Used to auto-detect extraction-only
mode and optimize prompts / screenshot handling.
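A sketch of a config this method would report as extraction-only, per the criteria above (a Default impl is assumed):

let config = RemoteMultimodalConfig::default()
    .with_extraction(true)
    .with_max_rounds(1);
assert!(config.is_extraction_only());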
pub fn with_html_max_bytes(self, bytes: usize) -> Self
Set maximum HTML bytes.
pub fn with_temperature(self, temp: f32) -> Self
Set temperature.
pub fn with_max_tokens(self, tokens: u16) -> Self
Set max tokens.
pub fn with_reasoning_effort(self, effort: Option<ReasoningEffort>) -> Self
Set explicit reasoning effort for supported models/endpoints.
pub fn with_max_rounds(self, rounds: usize) -> Self
Set max rounds.
pub fn with_retry(self, retry: RetryPolicy) -> Self
Set retry policy.
pub fn with_model_policy(self, policy: ModelPolicy) -> Self
Set model policy.
pub fn with_extraction(self, enabled: bool) -> Self
Enable extraction mode.
pub fn with_extraction_prompt(self, prompt: impl Into<String>) -> Self
Set extraction prompt.
pub fn with_extraction_schema(self, schema: ExtractionSchema) -> Self
Set extraction schema.
pub fn with_screenshot(self, enabled: bool) -> Self
Enable/disable screenshots.
pub fn with_include_screenshot(self, include: Option<bool>) -> Self
Set whether to include screenshots in LLM requests.
- Some(true): Always include screenshots
- Some(false): Never include screenshots
- None: Auto-detect based on model name (default)
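For example, to force text-only requests even against a vision-capable model:

let config = RemoteMultimodalConfig::default() // Default impl assumed
    .with_include_screenshot(Some(false));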
pub fn add_capture_profile(&mut self, profile: CaptureProfile)
Add a capture profile.
pub fn with_tool_calling_mode(self, mode: ToolCallingMode) -> Self
Set tool calling mode.
pub fn with_html_diff_mode(self, mode: HtmlDiffMode) -> Self
Set HTML diff mode for condensed page state.
pub fn with_planning_mode(self, config: PlanningModeConfig) -> Self
Enable planning mode with configuration.
pub fn with_synthesis_config(self, config: SynthesisConfig) -> Self
Enable multi-page synthesis with configuration.
pub fn with_confidence_strategy(self, strategy: ConfidenceRetryStrategy) -> Self
Set confidence-based retry strategy.
pub fn with_self_healing(self, config: SelfHealingConfig) -> Self
Enable self-healing with configuration.
pub fn with_concurrent_execution(self, enabled: bool) -> Self
Enable/disable concurrent execution of independent actions.
pub fn with_relevance_gate(self, prompt: Option<String>) -> Self
Enable relevance gating with optional custom criteria prompt.
pub fn with_url_prefilter(self, batch_size: Option<usize>) -> Self
Enable URL-level pre-filtering before HTTP fetch.
Requires relevance_gate to also be enabled.
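A sketch combining the two; treating None as "keep the default batch size" is an assumption:

let config = RemoteMultimodalConfig::default() // Default impl assumed
    .with_relevance_gate(Some("Only pages about pricing.".to_string()))
    .with_url_prefilter(Some(20)); // batch size; None presumably keeps the default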
Trait Implementations
impl Clone for RemoteMultimodalConfig
fn clone(&self) -> RemoteMultimodalConfig
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.