RequestParams

Struct RequestParams 

pub struct RequestParams {
    pub url: Option<String>,
    pub request: Option<RequestType>,
    pub limit: Option<u32>,
    pub return_format: Option<ReturnFormatHandling>,
    pub country_code: Option<String>,
    pub tld: Option<bool>,
    pub depth: Option<u32>,
    pub cache: Option<bool>,
    pub scroll: Option<u32>,
    pub budget: Option<HashMap<String, u32>>,
    pub blacklist: Option<Vec<String>>,
    pub link_rewrite: Option<LinkRewriteRule>,
    pub whitelist: Option<Vec<String>>,
    pub locale: Option<String>,
    pub cookies: Option<String>,
    pub stealth: Option<bool>,
    pub headers: Option<HashMap<String, String>>,
    pub webhooks: Option<WebhookSettings>,
    pub metadata: Option<bool>,
    pub viewport: Option<Viewport>,
    pub encoding: Option<String>,
    pub subdomains: Option<bool>,
    pub user_agent: Option<String>,
    pub fingerprint: Option<bool>,
    pub storageless: Option<bool>,
    pub readability: Option<bool>,
    pub proxy_enabled: Option<bool>,
    pub respect_robots: Option<bool>,
    pub root_selector: Option<String>,
    pub full_resources: Option<bool>,
    pub text: Option<String>,
    pub sitemap: Option<bool>,
    pub external_domains: Option<Vec<String>>,
    pub return_embeddings: Option<bool>,
    pub return_headers: Option<bool>,
    pub return_page_links: Option<bool>,
    pub return_cookies: Option<bool>,
    pub request_timeout: Option<u8>,
    pub run_in_background: Option<bool>,
    pub skip_config_checks: Option<bool>,
    pub css_extraction_map: Option<CSSExtractionMap>,
    pub chunking_alg: Option<ChunkingAlgDict>,
    pub disable_intercept: Option<bool>,
    pub disable_hints: Option<bool>,
    pub wait_for: Option<WaitFor>,
    pub execution_scripts: Option<ExecutionScriptsMap>,
    pub automation_scripts: Option<WebAutomationMap>,
    pub redirect_policy: Option<RedirectPolicy>,
    pub event_tracker: Option<EventTracker>,
    pub crawl_timeout: Option<Timeout>,
    pub evaluate_on_new_document: Option<Box<String>>,
    pub lite_mode: Option<bool>,
    pub proxy: Option<ProxyType>,
    pub remote_proxy: Option<String>,
    pub max_credits_per_page: Option<f64>,
}

Structure representing request parameters.
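Since every field is an Option, a value can usually be built by setting only the fields of interest. The following is a minimal sketch, assuming the Default implementation leaves every field as None (the all-Option layout suggests this, but this page does not state it). The struct is a hypothetical stand-in carrying four of the 55 fields so the snippet compiles on its own; shallow_crawl and the chosen values are illustrative only.

```rust
// Hypothetical stand-in for a small subset of RequestParams.
// Field names and types mirror the listing above; the real struct has 55 fields.
#[derive(Debug, Clone, Default, PartialEq)]
struct RequestParams {
    url: Option<String>,
    limit: Option<u32>,
    depth: Option<u32>,
    cache: Option<bool>,
}

// Build parameters for a shallow crawl, leaving everything else unset.
fn shallow_crawl(url: &str) -> RequestParams {
    RequestParams {
        url: Some(url.to_string()),
        limit: Some(10),
        depth: Some(2),
        // Struct update syntax fills the remaining fields from Default (None here).
        ..Default::default()
    }
}

fn main() {
    let params = shallow_crawl("https://example.com");
    println!("{:?}", params);
}
```

Struct update syntax keeps call sites short and means newly added optional fields do not break existing construction code.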

Fields§

§url: Option<String>

The URL to be crawled.

§request: Option<RequestType>

The type of request to be made.

§limit: Option<u32>

The maximum number of pages the crawler should visit.

§return_format: Option<ReturnFormatHandling>

The format in which the result should be returned.

§country_code: Option<String>

The country code to use for the request.

§tld: Option<bool>

Specifies whether to only visit the top-level domain.

§depth: Option<u32>

The depth of the crawl.

§cache: Option<bool>

Specifies whether the request should be cached.

§scroll: Option<u32>

Perform an infinite scroll on the page as new content arises. The request param also needs to be set to ‘chrome’ or ‘smart’.
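The dependency between scroll and the request type can be checked before a request is sent. This is a sketch under stated assumptions: the enum and struct are hypothetical stand-ins (the real RequestType variant names may differ), scroll_is_usable is not part of the crate, and an unset request type is treated as not browser-backed, which may not match the service default.

```rust
// Hypothetical stand-in; the real RequestType variants may be named differently.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RequestType {
    Http,
    Chrome,
    Smart,
}

// Hypothetical stand-in with only the two fields involved in the rule.
#[derive(Debug, Default)]
struct RequestParams {
    request: Option<RequestType>,
    scroll: Option<u32>,
}

// Returns true when scroll is unset, or when it is set together with a
// browser-backed request type (chrome or smart).
// Assumption: an unset request type does not count as browser-backed.
fn scroll_is_usable(p: &RequestParams) -> bool {
    match p.scroll {
        None => true,
        Some(_) => matches!(
            p.request,
            Some(RequestType::Chrome) | Some(RequestType::Smart)
        ),
    }
}

fn main() {
    let params = RequestParams {
        request: Some(RequestType::Chrome),
        // Illustrative value; the unit of scroll is not stated on this page.
        scroll: Some(2000),
    };
    println!("scroll usable: {}", scroll_is_usable(&params));
}
```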

§budget: Option<HashMap<String, u32>>

The budget for various resources.
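The budget value is a plain HashMap<String, u32>, so it can be assembled with the standard library alone. The sketch below assumes the keys are path patterns (with "*" meaning every path) and the values are page limits; this page does not spell out the key format, so treat both as assumptions. build_budget is a hypothetical helper.

```rust
use std::collections::HashMap;

// Build a budget map of the type the field expects.
// Assumption: keys are path patterns and values are per-pattern page limits.
fn build_budget() -> HashMap<String, u32> {
    let mut budget = HashMap::new();
    budget.insert("*".to_string(), 100); // assumed: overall cap
    budget.insert("/blog".to_string(), 20); // assumed: cap for one path
    budget
}

fn main() {
    // Wrap in Some(...) when assigning to the Option-typed field.
    let budget: Option<HashMap<String, u32>> = Some(build_budget());
    println!("{:?}", budget);
}
```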

§blacklist: Option<Vec<String>>

The routes to exclude from the crawl. Each entry can be a regex string pattern.

§link_rewrite: Option<LinkRewriteRule>

URL rewrite rule applied to every discovered link.

§whitelist: Option<Vec<String>>

The only routes to crawl. Each entry can be a regex string pattern, and the list can be used together with blacklist.

§locale: Option<String>

The locale to be used during the crawl.

§cookies: Option<String>

The cookies to be set for the request, formatted as a single string.

§stealth: Option<bool>

Specifies whether to use stealth techniques to avoid detection.

§headers: Option<HashMap<String, String>>

The headers to be used for the request.
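The cookies field takes one string while headers takes a map, so the two are prepared differently. A minimal sketch, assuming the cookie string follows the usual "name=value; name2=value2" form of the HTTP Cookie header (this page only says "a single string"). cookie_string and default_headers are hypothetical helpers, and the header shown is only an example.

```rust
use std::collections::HashMap;

// Join name/value pairs into one "name=value; name2=value2" string.
// Assumption: the service accepts the standard Cookie header layout.
fn cookie_string(pairs: &[(&str, &str)]) -> String {
    pairs
        .iter()
        .map(|(name, value)| format!("{}={}", name, value))
        .collect::<Vec<_>>()
        .join("; ")
}

// Build a header map of the type the headers field expects.
fn default_headers() -> HashMap<String, String> {
    HashMap::from([("Accept-Language".to_string(), "en-US".to_string())])
}

fn main() {
    let cookies: Option<String> = Some(cookie_string(&[("session", "abc"), ("theme", "dark")]));
    let headers: Option<HashMap<String, String>> = Some(default_headers());
    println!("{:?}", cookies);
    println!("{:?}", headers);
}
```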

§webhooks: Option<WebhookSettings>

The webhook settings used to send data via webhooks.

§metadata: Option<bool>

Specifies whether to include metadata in the response.

§viewport: Option<Viewport>

The dimensions of the viewport.

§encoding: Option<String>

The encoding to be used for the request.

§subdomains: Option<bool>

Specifies whether to include subdomains in the crawl.

§user_agent: Option<String>

The user agent string to be used for the request.

§fingerprint: Option<bool>

Specifies whether to use fingerprinting protection.

§storageless: Option<bool>

Specifies whether to perform the request without using storage.

§readability: Option<bool>

Specifies whether readability optimizations should be applied.

§proxy_enabled: Option<bool>

Specifies whether to use a proxy for the request. [Deprecated]: use the ‘proxy’ param instead.

§respect_robots: Option<bool>

Specifies whether to respect the site’s robots.txt file.

§root_selector: Option<String>

CSS selector to be used to filter the content.

§full_resources: Option<bool>

Specifies whether to load all resources of the crawl target.

§text: Option<String>

The text string to extract data from.

§sitemap: Option<bool>

Specifies whether to use the sitemap links.

§external_domains: Option<Vec<String>>

External domains to include in the crawl.

§return_embeddings: Option<bool>

Returns the OpenAI embeddings for the title and description. Other values, such as keywords, may also be included. Requires the metadata parameter to be set to true.
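Because return_embeddings depends on metadata, it can help to set both in one place. The sketch below uses a hypothetical stand-in struct with only the two fields involved; with_embeddings is an illustrative helper, not part of the crate.

```rust
// Hypothetical stand-in with only the two related fields.
#[derive(Debug, Clone, Default, PartialEq)]
struct RequestParams {
    metadata: Option<bool>,
    return_embeddings: Option<bool>,
}

// Enable embeddings together with the metadata flag they require.
fn with_embeddings(mut params: RequestParams) -> RequestParams {
    params.metadata = Some(true);
    params.return_embeddings = Some(true);
    params
}

fn main() {
    let params = with_embeddings(RequestParams::default());
    println!("{:?}", params);
}
```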

§return_headers: Option<bool>

Returns the HTTP response headers.

§return_page_links: Option<bool>

Returns the link(s) found on the page that match the crawler query.

§return_cookies: Option<bool>

Returns the HTTP response cookies.

§request_timeout: Option<u8>

The timeout for the request, in seconds.
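The u8 type caps this value at 255 seconds. A minimal sketch of converting a std::time::Duration into the field's type, saturating at that ceiling rather than wrapping; timeout_secs is a hypothetical helper.

```rust
use std::time::Duration;

// Convert a Duration into whole seconds that fit in the u8 field.
// Sub-second parts are dropped and anything above 255 s saturates to 255.
fn timeout_secs(d: Duration) -> Option<u8> {
    let secs = d.as_secs().min(u64::from(u8::MAX));
    Some(secs as u8)
}

fn main() {
    let request_timeout = timeout_secs(Duration::from_secs(30));
    println!("{:?}", request_timeout);
    let clamped = timeout_secs(Duration::from_secs(600));
    println!("{:?}", clamped);
}
```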

§run_in_background: Option<bool>

Specifies whether to run the request in the background.

§skip_config_checks: Option<bool>

Specifies whether to skip configuration checks.

§css_extraction_map: Option<CSSExtractionMap>

Use CSS query selectors to scrape contents from the web page. Set the paths and the CSS extraction object map to perform extractions per path or page.

§chunking_alg: Option<ChunkingAlgDict>

The chunking algorithm to use.

§disable_intercept: Option<bool>

Disable request interception when running ‘request’ as ‘chrome’ or ‘smart’. This can help when the page uses third-party or external scripts to load content.

§disable_hints: Option<bool>

Disables service-provided hints that add request optimizations to improve crawl outcomes, such as network blacklists, request-type selection, geo handling, and more.

§wait_for: Option<WaitFor>

The events to wait for on the page. Requires the request type to be chrome or smart.

§execution_scripts: Option<ExecutionScriptsMap>

Perform custom JavaScript tasks on a URL or URL path. Requires the request type to be chrome or smart.

§automation_scripts: Option<WebAutomationMap>

Perform automated web tasks on a URL or URL path. Requires the request type to be chrome or smart.

§redirect_policy: Option<RedirectPolicy>

The redirect policy for HTTP requests. Set the value to Loose to allow all redirects.

§event_tracker: Option<EventTracker>

Track the requests sent and responses received when the request type is chrome or smart. Responses record the bytes used, and requests record the monotonic time at which they were sent.

§crawl_timeout: Option<Timeout>

The timeout to stop the crawl.

§evaluate_on_new_document: Option<Box<String>>

Evaluates the given script in every frame upon creation (before loading the frame’s scripts).

§lite_mode: Option<bool>

Runs the request in lite mode. Lite mode reduces data transfer costs by 50%, with trade-offs in speed, accuracy, geo-targeting, and reliability. It’s best suited for non-urgent data collection or when targeting websites with minimal anti-bot protections.

§proxy: Option<ProxyType>

The proxy to use for the request.

§remote_proxy: Option<String>

Use a remote proxy at ~50% reduced cost for file downloads. This requires a user-supplied static IP proxy endpoint.

§max_credits_per_page: Option<f64>

Set the maximum number of credits to use per page. Credits are measured in decimal units, where 10,000 credits equal one dollar (100 credits per penny). Credit limiting only applies to requests that are JavaScript-rendered, using smart_mode or chrome for the ‘request’ type.
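The credit arithmetic can be made concrete with two small conversions. This is a sketch based only on the rate stated above (10,000 credits per dollar); credits_to_usd and usd_to_credits are hypothetical helpers, not part of the crate.

```rust
// Conversion rate stated in the field description: 10,000 credits per dollar.
const CREDITS_PER_DOLLAR: f64 = 10_000.0;

// Convert a credit amount into US dollars.
fn credits_to_usd(credits: f64) -> f64 {
    credits / CREDITS_PER_DOLLAR
}

// Convert a dollar amount into credits, e.g. to derive a per-page cap.
fn usd_to_credits(usd: f64) -> f64 {
    usd * CREDITS_PER_DOLLAR
}

fn main() {
    // Cap each page at half a cent: 0.005 dollars is 50 credits.
    let max_credits_per_page: Option<f64> = Some(usd_to_credits(0.005));
    println!("{:?}", max_credits_per_page);
    println!("100 credits = ${}", credits_to_usd(100.0));
}
```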

Trait Implementations§

impl Clone for RequestParams

fn clone(&self) -> RequestParams

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
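Clone makes it practical to keep one base configuration and derive per-target variants from it. A minimal sketch using a hypothetical three-field stand-in for the struct; per_site is an illustrative helper, not part of the crate.

```rust
// Hypothetical stand-in for a small subset of RequestParams.
#[derive(Debug, Clone, Default, PartialEq)]
struct RequestParams {
    url: Option<String>,
    limit: Option<u32>,
    stealth: Option<bool>,
}

// Derive a per-site configuration from a shared base without mutating the base.
fn per_site(base: &RequestParams, url: &str) -> RequestParams {
    let mut params = base.clone();
    params.url = Some(url.to_string());
    params
}

fn main() {
    let base = RequestParams {
        limit: Some(50),
        stealth: Some(true),
        ..Default::default()
    };
    let first = per_site(&base, "https://a.example");
    let second = per_site(&base, "https://b.example");
    println!("{:?}\n{:?}", first, second);
}
```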

impl Debug for RequestParams

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for RequestParams

fn default() -> RequestParams

Returns the “default value” for a type.

impl<'de> Deserialize<'de> for RequestParams

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for RequestParams

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> PolicyExt for T
where T: ?Sized,

fn and<P, B, E>(self, other: P) -> And<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow only if self and other return Action::Follow.

fn or<P, B, E>(self, other: P) -> Or<T, P>
where T: Policy<B, E>, P: Policy<B, E>,

Create a new Policy that returns Action::Follow if either self or other returns Action::Follow.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,