Struct CrawlStats 

pub struct CrawlStats {
    pub requests_count: u64,
    pub concurrent_requests: u32,
    pub concurrent_requests_per_domain: u32,
    pub failed_requests_count: u64,
    pub offsite_requests_count: u64,
    pub robots_disallowed_count: u64,
    pub cache_hits: u64,
    pub cache_misses: u64,
    pub response_bytes: u64,
    pub items_scraped: u64,
    pub items_dropped: u64,
    pub start_time: f64,
    pub end_time: f64,
    pub download_delay: f64,
    pub blocked_requests_count: u64,
    pub custom_stats: HashMap<String, Value>,
    pub response_status_count: HashMap<String, u64>,
    pub domains_response_bytes: HashMap<String, u64>,
    pub sessions_requests_count: HashMap<String, u64>,
    pub proxies: Vec<String>,
    pub log_levels_counter: HashMap<String, u64>,
}
Aggregate statistics collected during a crawl run.

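The crawler engine populates this struct as it processes requests. After the crawl finishes, you can inspect it via CrawlerEngine::stats or from the returned CrawlResult. All counters start at zero and are incremented during the crawl loop. The counter fields are plain u64 values updated through the &mut self methods listed under Implementations, not atomic types, so any sharing across threads relies on synchronization around the struct.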

Fields§

§requests_count: u64

Total number of requests dispatched.

§concurrent_requests: u32

Maximum number of concurrent requests allowed.

§concurrent_requests_per_domain: u32

Maximum number of concurrent requests per domain.

§failed_requests_count: u64

Number of requests that failed with an error.

§offsite_requests_count: u64

Number of requests rejected because their domain was not allowed.

§robots_disallowed_count: u64

Number of requests blocked by robots.txt rules.

§cache_hits: u64

Number of responses served from the cache.

§cache_misses: u64

Number of responses that were not found in the cache.

§response_bytes: u64

Total bytes received across all responses.

§items_scraped: u64

Number of items successfully scraped.

§items_dropped: u64

Number of items dropped by the item pipeline.

§start_time: f64

Unix timestamp when the crawl started.

§end_time: f64

Unix timestamp when the crawl ended.

§download_delay: f64

Configured delay in seconds between consecutive requests.

§blocked_requests_count: u64

Number of requests that received a blocked status code.

§custom_stats: HashMap<String, Value>

User-defined custom statistics.

§response_status_count: HashMap<String, u64>

Count of responses grouped by HTTP status code.

§domains_response_bytes: HashMap<String, u64>

Total bytes received grouped by domain.

§sessions_requests_count: HashMap<String, u64>

Number of requests dispatched per session.

§proxies: Vec<String>

List of proxy addresses used during the crawl.

§log_levels_counter: HashMap<String, u64>

Count of log messages grouped by level.
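
The counters above can be combined into simple ratios when reviewing a finished crawl. The following is a minimal sketch, not part of the crate: CrawlStatsSketch is a local stand-in holding only four of the documented fields, and cache_hit_ratio and failure_rate are hypothetical helpers. It assumes that cache_hits plus cache_misses covers every cache lookup and that failed requests are also counted in requests_count.

```rust
/// Local stand-in carrying only the counters used below. The real
/// `CrawlStats` has 21 fields; this is not the crate's implementation.
pub struct CrawlStatsSketch {
    pub requests_count: u64,
    pub failed_requests_count: u64,
    pub cache_hits: u64,
    pub cache_misses: u64,
}

/// Hypothetical helper: fraction of cache lookups that were hits.
/// Returns 0.0 when there were no lookups, to avoid dividing by zero.
pub fn cache_hit_ratio(stats: &CrawlStatsSketch) -> f64 {
    let lookups = stats.cache_hits + stats.cache_misses;
    if lookups == 0 {
        0.0
    } else {
        stats.cache_hits as f64 / lookups as f64
    }
}

/// Hypothetical helper: fraction of dispatched requests that failed.
/// Assumes failed requests are included in `requests_count`.
pub fn failure_rate(stats: &CrawlStatsSketch) -> f64 {
    if stats.requests_count == 0 {
        0.0
    } else {
        stats.failed_requests_count as f64 / stats.requests_count as f64
    }
}
```

Both helpers guard against an empty crawl, in the same spirit as the zero-duration guard documented for requests_per_second.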

Implementations§

impl CrawlStats

pub fn elapsed_seconds(&self) -> f64

Returns the wall-clock duration of the crawl in seconds, computed as end_time - start_time. Both timestamps are Unix epoch seconds recorded at the start and end of CrawlerEngine::crawl.
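
A minimal sketch of the documented formula, using a local stand-in struct rather than the crate's type. CrawlStatsSketch and sample_elapsed are illustrative names, and the timestamps are made-up values.

```rust
/// Local stand-in for the two timestamp fields of `CrawlStats`.
/// Not the crate's implementation.
pub struct CrawlStatsSketch {
    pub start_time: f64, // Unix epoch seconds
    pub end_time: f64,   // Unix epoch seconds
}

impl CrawlStatsSketch {
    /// Mirrors the documented formula: end_time - start_time.
    pub fn elapsed_seconds(&self) -> f64 {
        self.end_time - self.start_time
    }
}

/// Hypothetical helper: duration of a crawl with made-up timestamps
/// that are 90.5 seconds apart.
pub fn sample_elapsed() -> f64 {
    let stats = CrawlStatsSketch {
        start_time: 1_700_000_000.0,
        end_time: 1_700_000_090.5,
    };
    stats.elapsed_seconds()
}
```

Under this formula, a value whose end_time were still 0.0 while start_time is already set would yield a negative duration, so the result is only meaningful once both timestamps have been recorded.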

pub fn requests_per_second(&self) -> f64

Returns the average number of requests completed per second over the entire crawl. Returns 0.0 if the crawl duration was zero (e.g., an instant crawl with no network calls).
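
The zero-duration guard can be sketched as follows. This uses a local stand-in struct, not the crate's type, and it assumes the rate is requests_count divided by elapsed_seconds. Treating negative durations like zero is an extra assumption of this sketch; the documentation only mentions a duration of exactly zero.

```rust
/// Local stand-in with only the fields the rate depends on.
/// Not the crate's implementation.
pub struct CrawlStatsSketch {
    pub requests_count: u64,
    pub start_time: f64, // Unix epoch seconds
    pub end_time: f64,   // Unix epoch seconds
}

impl CrawlStatsSketch {
    pub fn elapsed_seconds(&self) -> f64 {
        self.end_time - self.start_time
    }

    /// Average requests per second, or 0.0 when no time has elapsed.
    /// The `<=` comparison (rather than `==`) is an assumption made here
    /// so that a negative duration cannot produce a negative rate.
    pub fn requests_per_second(&self) -> f64 {
        let elapsed = self.elapsed_seconds();
        if elapsed <= 0.0 {
            0.0
        } else {
            self.requests_count as f64 / elapsed
        }
    }
}
```

Returning 0.0 instead of dividing avoids an infinite or NaN rate, which would otherwise propagate into any report built from the value.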

pub fn increment_status(&mut self, status: u16)

Increments the counter for the given HTTP status code. Status codes are stored under keys like "status_200" or "status_404" in the response_status_count map, making it easy to spot error patterns.
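
A minimal sketch of the key scheme described above, written against a local stand-in struct. The "status_" prefix comes from the documentation; the method body and the error_responses helper are assumptions for illustration, not the crate's code.

```rust
use std::collections::HashMap;

/// Local stand-in holding only the status map.
/// Not the crate's implementation.
#[derive(Default)]
pub struct CrawlStatsSketch {
    pub response_status_count: HashMap<String, u64>,
}

impl CrawlStatsSketch {
    /// Bumps the counter stored under "status_<code>", creating it at
    /// zero the first time a code is seen.
    pub fn increment_status(&mut self, status: u16) {
        let key = format!("status_{}", status);
        *self.response_status_count.entry(key).or_insert(0) += 1;
    }
}

/// Hypothetical helper: total number of responses with a 4xx or 5xx
/// code, recovered by parsing the numeric suffix of each key.
pub fn error_responses(stats: &CrawlStatsSketch) -> u64 {
    stats
        .response_status_count
        .iter()
        .filter_map(|(key, count)| {
            let code: u16 = key.strip_prefix("status_")?.parse().ok()?;
            if code >= 400 {
                Some(*count)
            } else {
                None
            }
        })
        .sum()
}
```

Because the keys are strings rather than integers, any numeric filtering has to parse the suffix back out, as error_responses does here.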

pub fn increment_response_bytes(&mut self, domain: &str, count: u64)

Adds count bytes to both the global response_bytes total and the per-domain counter in domains_response_bytes. This is called by the engine after every successful fetch so you can identify bandwidth-heavy domains.
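
The double bookkeeping (global total plus per-domain total) can be sketched like this. The struct is a local stand-in, the method body is an assumption consistent with the description, and heaviest_domain is a hypothetical helper showing how the per-domain map might be used to find a bandwidth-heavy domain.

```rust
use std::collections::HashMap;

/// Local stand-in holding only the byte counters.
/// Not the crate's implementation.
#[derive(Default)]
pub struct CrawlStatsSketch {
    pub response_bytes: u64,
    pub domains_response_bytes: HashMap<String, u64>,
}

impl CrawlStatsSketch {
    /// Adds `count` to the global total and to the entry for `domain`.
    pub fn increment_response_bytes(&mut self, domain: &str, count: u64) {
        self.response_bytes += count;
        *self
            .domains_response_bytes
            .entry(domain.to_string())
            .or_insert(0) += count;
    }
}

/// Hypothetical helper: the domain that received the most bytes,
/// or `None` if nothing has been recorded yet.
pub fn heaviest_domain(stats: &CrawlStatsSketch) -> Option<(&str, u64)> {
    stats
        .domains_response_bytes
        .iter()
        .max_by_key(|(_, bytes)| **bytes)
        .map(|(domain, bytes)| (domain.as_str(), *bytes))
}
```

In this sketch the global total always equals the sum of the per-domain entries, because both are updated in the same call.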

pub fn increment_requests_count(&mut self, sid: &str)

Increments the total requests_count and the per-session counter in sessions_requests_count. The engine calls this before every fetch so you can see how load is distributed across sessions.
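
A minimal sketch of the per-session counting, again on a local stand-in struct. The method body is an assumption that matches the description above, and session_share is a hypothetical helper for checking how evenly load is spread across sessions.

```rust
use std::collections::HashMap;

/// Local stand-in holding only the request counters.
/// Not the crate's implementation.
#[derive(Default)]
pub struct CrawlStatsSketch {
    pub requests_count: u64,
    pub sessions_requests_count: HashMap<String, u64>,
}

impl CrawlStatsSketch {
    /// Bumps the global request total and the counter for session `sid`.
    pub fn increment_requests_count(&mut self, sid: &str) {
        self.requests_count += 1;
        *self
            .sessions_requests_count
            .entry(sid.to_string())
            .or_insert(0) += 1;
    }
}

/// Hypothetical helper: fraction of all requests sent through `sid`.
/// Returns 0.0 for an unknown session or when no requests were made.
pub fn session_share(stats: &CrawlStatsSketch, sid: &str) -> f64 {
    if stats.requests_count == 0 {
        return 0.0;
    }
    let count = stats.sessions_requests_count.get(sid).copied().unwrap_or(0);
    count as f64 / stats.requests_count as f64
}
```

Since the counter is incremented before the fetch, the totals include requests that later fail, so they reflect dispatched load rather than successful responses.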

Trait Implementations§

impl Clone for CrawlStats

fn clone(&self) -> CrawlStats

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for CrawlStats

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for CrawlStats

fn default() -> CrawlStats

Returns the “default value” for a type.

impl<'de> Deserialize<'de> for CrawlStats

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Serialize for CrawlStats

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬 This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T> Instrument for T

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper.

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Same for T

type Output = T

Should always be Self

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> WithSubscriber for T

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper.

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,