Struct spider_client::RequestParams
pub struct RequestParams {
pub url: Option<String>,
pub request: Option<RequestType>,
pub limit: Option<u32>,
pub return_format: Option<ReturnFormat>,
pub tld: Option<bool>,
pub depth: Option<u32>,
pub cache: Option<bool>,
pub budget: Option<HashMap<String, u32>>,
pub blacklist: Option<Vec<String>>,
pub whitelist: Option<Vec<String>>,
pub locale: Option<String>,
pub cookies: Option<String>,
pub stealth: Option<bool>,
pub headers: Option<HashMap<String, String>>,
pub anti_bot: Option<bool>,
pub metadata: Option<bool>,
pub viewport: Option<Viewport>,
pub encoding: Option<String>,
pub subdomains: Option<bool>,
pub user_agent: Option<String>,
pub store_data: Option<bool>,
pub gpt_config: Option<HashMap<String, String>>,
pub fingerprint: Option<bool>,
pub storageless: Option<bool>,
pub readability: Option<bool>,
pub proxy_enabled: Option<bool>,
pub respect_robots: Option<bool>,
pub root_selector: Option<String>,
pub full_resources: Option<bool>,
pub website_limit: Option<u32>,
pub text: Option<String>,
pub sitemap: Option<bool>,
pub page_insights: Option<bool>,
pub return_embeddings: Option<bool>,
pub request_timeout: Option<u8>,
pub run_in_background: Option<bool>,
pub skip_config_checks: Option<bool>,
pub chunking_alg: Option<ChunkingAlgDict>,
pub clean: Option<bool>,
pub clean_full: Option<bool>,
pub disable_intercept: Option<bool>,
pub wait_for: Option<WaitFor>,
}
Structure representing request parameters.
Fields
url: Option<String>
The URL to be crawled.
request: Option<RequestType>
The type of request to be made.
limit: Option<u32>
The maximum number of pages the crawler should visit.
return_format: Option<ReturnFormat>
The format in which the result should be returned.
tld: Option<bool>
Specifies whether to only visit the top-level domain.
depth: Option<u32>
The depth of the crawl.
cache: Option<bool>
Specifies whether the request should be cached.
budget: Option<HashMap<String, u32>>
The budget for various resources.
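The `budget` field is a map from string keys to `u32` limits. A minimal sketch of building such a map with the standard library follows; the key names (`"*"` as an overall cap, `"/docs"` as a path prefix) are assumptions made for illustration and are not stated on this page.

```rust
use std::collections::HashMap;

/// Builds a value with the same type as the `budget` field.
/// The key names are illustrative assumptions, not documented here.
fn example_budget() -> Option<HashMap<String, u32>> {
    let mut budget = HashMap::new();
    budget.insert("*".to_string(), 100); // assumed: overall cap
    budget.insert("/docs".to_string(), 10); // assumed: cap for one path prefix
    Some(budget)
}
```

The returned value could then be assigned to `budget` when constructing the parameters.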
blacklist: Option<Vec<String>>
The routes to ignore. Each entry can be a regex pattern string.
whitelist: Option<Vec<String>>
The only routes to crawl. Each entry can be a regex pattern string, and the list can be combined with blacklist.
locale: Option<String>
The locale to be used during the crawl.
cookies: Option<String>
The cookies to be set for the request, formatted as a single string.
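Since the cookies are passed as one string, a small helper can assemble it from name/value pairs. This is a sketch only: the `name=value; name=value` layout is an assumption borrowed from the usual HTTP `Cookie` header form, and `cookie_string` is a hypothetical helper, not part of the crate.

```rust
/// Joins name/value pairs into one string for the `cookies` field.
/// The "name=value; name=value" layout is an assumption.
fn cookie_string(pairs: &[(&str, &str)]) -> Option<String> {
    if pairs.is_empty() {
        return None; // leave the field unset when there is nothing to send
    }
    let joined = pairs
        .iter()
        .map(|(name, value)| format!("{}={}", name, value))
        .collect::<Vec<_>>()
        .join("; ");
    Some(joined)
}
```

Calling `cookie_string(&[("session", "abc"), ("theme", "dark")])` yields `Some("session=abc; theme=dark")`.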
stealth: Option<bool>
Specifies whether to use stealth techniques to avoid detection.
headers: Option<HashMap<String, String>>
The headers to be used for the request.
anti_bot: Option<bool>
Specifies whether anti-bot measures should be used.
metadata: Option<bool>
Specifies whether to include metadata in the response.
viewport: Option<Viewport>
The dimensions of the viewport.
encoding: Option<String>
The encoding to be used for the request.
subdomains: Option<bool>
Specifies whether to include subdomains in the crawl.
user_agent: Option<String>
The user agent string to be used for the request.
store_data: Option<bool>
Specifies whether the response data should be stored.
gpt_config: Option<HashMap<String, String>>
Configuration settings for GPT.
fingerprint: Option<bool>
Specifies whether to use fingerprinting protection.
storageless: Option<bool>
Specifies whether to perform the request without using storage.
readability: Option<bool>
Specifies whether readability optimizations should be applied.
proxy_enabled: Option<bool>
Specifies whether to use a proxy for the request.
respect_robots: Option<bool>
Specifies whether to respect the site’s robots.txt file.
root_selector: Option<String>
CSS selector to be used to filter the content.
full_resources: Option<bool>
Specifies whether to load all resources of the crawl target.
website_limit: Option<u32>
The maximum number of websites to process when a list is sent via text or as comma-separated URLs. This helps the system configure itself automatically.
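As a rough illustration of the comma-separated case, the sketch below joins several URLs into one string and derives a matching limit. `url_list` is a hypothetical helper, and setting the limit to the number of URLs in the list is an assumption about how the two values relate.

```rust
/// Returns a comma-separated `url` value and a matching `website_limit`.
/// Treating the limit as "number of URLs in the list" is an assumption.
fn url_list(urls: &[&str]) -> (Option<String>, Option<u32>) {
    // Drop surrounding whitespace and skip empty entries.
    let cleaned: Vec<&str> = urls
        .iter()
        .map(|u| u.trim())
        .filter(|u| !u.is_empty())
        .collect();
    if cleaned.is_empty() {
        return (None, None);
    }
    // Saturate rather than wrap if the list is absurdly long.
    let limit = cleaned.len().min(u32::MAX as usize) as u32;
    (Some(cleaned.join(",")), Some(limit))
}
```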
text: Option<String>
The text string to extract data from.
sitemap: Option<bool>
Specifies whether to use the sitemap links.
page_insights: Option<bool>
Get page insights to determine information like request duration, accessibility, and other web vitals. Requires the metadata parameter to be set to true.
return_embeddings: Option<bool>
Returns the OpenAI embeddings for the title and description. Other values, such as keywords, may also be included. Requires the metadata parameter to be set to true.
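Both `page_insights` and `return_embeddings` depend on `metadata` being `true`, so a client-side check before sending can catch a misconfiguration early. The sketch below uses a hypothetical stand-in struct, `MetadataFlags`, that holds only the three flags involved; it is not the crate's type, and the crate itself may or may not perform such a check.

```rust
/// Hypothetical stand-in holding only the three flags discussed above.
#[derive(Debug, Default, Clone)]
struct MetadataFlags {
    metadata: Option<bool>,
    page_insights: Option<bool>,
    return_embeddings: Option<bool>,
}

/// Returns the names of flags that are enabled without `metadata: Some(true)`.
fn missing_metadata(flags: &MetadataFlags) -> Vec<&'static str> {
    let metadata_on = flags.metadata == Some(true);
    let mut offenders = Vec::new();
    if flags.page_insights == Some(true) && !metadata_on {
        offenders.push("page_insights");
    }
    if flags.return_embeddings == Some(true) && !metadata_on {
        offenders.push("return_embeddings");
    }
    offenders
}
```

An empty result means the combination is consistent; otherwise the returned names identify which flags need `metadata` enabled.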
request_timeout: Option<u8>
The timeout for the request, in seconds.
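Because the field is a `u8`, it can hold at most 255. When the timeout comes from a wider integer (a config file, for instance), a saturating conversion avoids silent wrap-around. `to_request_timeout` below is a hypothetical helper sketched for illustration.

```rust
/// Converts a wider integer into the `u8` range used by `request_timeout`,
/// saturating at `u8::MAX` (255) instead of wrapping.
fn to_request_timeout(value: u64) -> Option<u8> {
    Some(value.min(u8::MAX as u64) as u8)
}
```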
run_in_background: Option<bool>
Specifies whether to run the request in the background.
skip_config_checks: Option<bool>
Specifies whether to skip configuration checks.
chunking_alg: Option<ChunkingAlgDict>
The chunking algorithm to use.
clean: Option<bool>
Clean the markdown or text for AI.
clean_full: Option<bool>
Clean the markdown or text for AI, removing footers, navigation, and more.
disable_intercept: Option<bool>
Disable request interception when running ‘request’ as ‘chrome’ or ‘smart’. This can help when the page uses third-party or external scripts to load content.
wait_for: Option<WaitFor>
The events to wait for on the page. Requires ‘request’ to be ‘chrome’ or ‘smart’.
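The last two fields only matter for the ‘chrome’ and ‘smart’ request types. A sketch of guarding on that condition follows. `Mode` is a hypothetical mirror of the request types: only the chrome and smart values are named on this page, so the `Http` variant and the treatment of an unset value are assumptions.

```rust
/// Hypothetical mirror of the request modes; only "chrome" and "smart"
/// are named on this page.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Mode {
    Http, // assumed plain-HTTP mode
    Chrome,
    Smart,
}

/// True when browser-dependent options (`wait_for`, `disable_intercept`)
/// can take effect. An unset mode is treated as "does not apply" (assumption).
fn browser_options_apply(request: Option<Mode>) -> bool {
    matches!(request, Some(Mode::Chrome) | Some(Mode::Smart))
}
```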
Trait Implementations
impl Clone for RequestParams
fn clone(&self) -> RequestParams
1.0.0 · fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for RequestParams
impl Default for RequestParams
fn default() -> RequestParams
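With every field an `Option` and both `Default` and `Clone` implemented, a common pattern is to set only the fields of interest and leave the rest unset through struct update syntax. The sketch below uses `Params`, a hypothetical cut-down mirror with four of the fields listed above, so that it compiles without the crate; the same pattern should carry over to the real struct, though the values chosen here are purely illustrative.

```rust
/// Hypothetical cut-down mirror of `RequestParams` with four of its fields.
#[derive(Debug, Default, Clone, PartialEq)]
struct Params {
    url: Option<String>,
    limit: Option<u32>,
    depth: Option<u32>,
    respect_robots: Option<bool>,
}

/// Shared settings; every field not listed stays `None` via `Default`.
fn base_params() -> Params {
    Params {
        limit: Some(25),
        respect_robots: Some(true),
        ..Default::default()
    }
}

/// Clones the shared settings and fills in the per-request URL.
fn params_for(base: &Params, url: &str) -> Params {
    Params {
        url: Some(url.to_string()),
        ..base.clone()
    }
}
```

Keeping one base value and cloning it per request leaves the shared settings untouched while each request gets its own `url`.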
impl<'de> Deserialize<'de> for RequestParams
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations
impl Freeze for RequestParams
impl RefUnwindSafe for RequestParams
impl Send for RequestParams
impl Sync for RequestParams
impl Unpin for RequestParams
impl UnwindSafe for RequestParams
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
default unsafe fn clone_to_uninit(&self, dst: *mut T)
This is a nightly-only experimental API (clone_to_uninit).