Struct RuntimeMetrics

pub struct RuntimeMetrics {
    pub active_sessions: AtomicUsize,
    pub total_sessions: AtomicU64,
    pub total_steps: AtomicU64,
    pub total_tool_calls: AtomicU64,
    pub failed_tool_calls: AtomicU64,
    pub backpressure_shed_count: AtomicU64,
    pub memory_recall_count: AtomicU64,
    pub checkpoint_errors: AtomicU64,
    pub step_latency: LatencyHistogram,
    /* private fields */
}

Shared runtime metrics. new() returns the instance inside an Arc; clone that Arc to share the same counters across threads.

Fields

active_sessions: AtomicUsize

Number of agent sessions currently in progress.

total_sessions: AtomicU64

Total number of sessions started since the runtime was created.

total_steps: AtomicU64

Total number of ReAct steps executed across all sessions.

total_tool_calls: AtomicU64

Total number of tool calls dispatched (across all tool names).

failed_tool_calls: AtomicU64

Total number of tool calls that returned an error observation.

backpressure_shed_count: AtomicU64

Total number of requests shed due to backpressure.

memory_recall_count: AtomicU64

Total number of memory recall operations.

checkpoint_errors: AtomicU64

Total number of checkpoint failures encountered during run_agent.

step_latency: LatencyHistogram

Per-step latency histogram.

Implementations

impl RuntimeMetrics

pub fn new() -> Arc<Self>

Allocate a new RuntimeMetrics instance wrapped in an Arc.
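The Arc-returning constructor is what makes cross-thread sharing cheap. The following is a minimal sketch of that pattern, not this crate's code: DemoMetrics, run_workers, and the choice of Ordering::Relaxed are all hypothetical stand-ins, and only one counter is modelled.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for `RuntimeMetrics`; only one counter is modelled.
#[derive(Default)]
struct DemoMetrics {
    total_steps: AtomicU64,
}

impl DemoMetrics {
    /// Mirrors the documented constructor shape: the value is created inside an `Arc`.
    fn new() -> Arc<Self> {
        Arc::new(Self::default())
    }
}

/// Spawn `workers` threads that each record `steps_each` steps on one shared instance,
/// then return the final counter value.
fn run_workers(workers: usize, steps_each: u64) -> u64 {
    let metrics = DemoMetrics::new();
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the `Arc` copies a pointer; every thread updates the same counter.
            let m = Arc::clone(&metrics);
            thread::spawn(move || {
                for _ in 0..steps_each {
                    // Relaxed ordering is an assumption made for this sketch.
                    m.total_steps.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().expect("worker thread panicked");
    }
    metrics.total_steps.load(Ordering::Relaxed)
}
```

Because the fields are atomics, no &mut access or external lock is needed; each thread only holds its own Arc clone.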

pub fn active_sessions(&self) -> usize

Return the number of agent sessions currently in progress.

pub fn total_sessions(&self) -> u64

Return the total number of sessions started since the runtime was created.

pub fn avg_tool_calls_per_session(&self) -> f64

Return the average number of tool calls per completed session.

Returns 0.0 when no sessions have been recorded.
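The zero-session case matters because dividing by zero in f64 would yield NaN rather than 0.0. A minimal sketch of such a guarded average follows. SessionCounters is a hypothetical stand-in, and the sketch assumes the denominator is total_sessions, which the documentation above does not confirm.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in holding only the two counters the average needs.
#[derive(Default)]
struct SessionCounters {
    total_sessions: AtomicU64,
    total_tool_calls: AtomicU64,
}

impl SessionCounters {
    /// Average tool calls per session, returning 0.0 instead of NaN when
    /// no sessions have been recorded.
    fn avg_tool_calls_per_session(&self) -> f64 {
        let sessions = self.total_sessions.load(Ordering::Relaxed);
        if sessions == 0 {
            return 0.0;
        }
        let calls = self.total_tool_calls.load(Ordering::Relaxed);
        calls as f64 / sessions as f64
    }
}
```

The same guard would serve any per-session ratio; only the numerator changes.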

pub fn total_steps(&self) -> u64

Return the total number of ReAct steps executed across all sessions.

pub fn avg_steps_per_session(&self) -> f64

Return the average number of ReAct steps per completed session.

Returns 0.0 when no sessions have been recorded.

pub fn total_tool_calls(&self) -> u64

Return the total number of tool calls dispatched.

pub fn failed_tool_calls(&self) -> u64

Return the total number of tool calls that returned an error observation.

pub fn tool_success_rate(&self) -> f64

Return the fraction of tool calls that succeeded (i.e. did not fail).

Returns 1.0 if no tool calls have been recorded yet (vacuously all succeeded) and a value in [0.0, 1.0] once calls have been made.

pub fn backpressure_shed_count(&self) -> u64

Return the total number of requests shed due to backpressure.

pub fn memory_recall_count(&self) -> u64

Return the total number of memory recall operations performed.

pub fn checkpoint_errors(&self) -> u64

Return the total number of checkpoint failures encountered during run_agent.

pub fn checkpoint_error_rate(&self) -> f64

Return the ratio of checkpoint errors to total completed sessions.

Returns 0.0 when no sessions have been recorded.

pub fn p50_latency_ms(&self) -> u64

Return the median (50th-percentile) step latency in milliseconds.

Convenience shorthand for self.step_latency.p50(). Returns 0 when no step latencies have been recorded.

pub fn record_tool_call(&self, tool_name: &str)

Increment the call counter for tool_name by 1.

Called automatically by the agent loop when with_metrics is configured.

pub fn record_tool_failure(&self, tool_name: &str)

Increment the failure counter for tool_name by 1.

Called automatically by the agent loop when a tool returns an error.

pub fn per_tool_calls_snapshot(&self) -> HashMap<String, u64>

Return a snapshot of per-tool call counts as a HashMap<tool_name, count>.

pub fn per_tool_failures_snapshot(&self) -> HashMap<String, u64>

Return a snapshot of per-tool failure counts as a HashMap<tool_name, count>.
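A snapshot is a copy: later recordings do not change a map that was already returned. The sketch below illustrates that behaviour under one assumed storage layout, a Mutex around a HashMap. The real per-tool storage is private and may differ, and PerToolCounters is a hypothetical name.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for the private per-tool storage.
#[derive(Default)]
struct PerToolCounters {
    calls: Mutex<HashMap<String, u64>>,
}

impl PerToolCounters {
    /// Increment the call counter for `tool_name` by 1, creating it on first use.
    fn record_tool_call(&self, tool_name: &str) {
        let mut calls = self.calls.lock().expect("counter lock poisoned");
        *calls.entry(tool_name.to_owned()).or_insert(0) += 1;
    }

    /// Return an owned copy of the current counts; the lock is released on return.
    fn per_tool_calls_snapshot(&self) -> HashMap<String, u64> {
        self.calls.lock().expect("counter lock poisoned").clone()
    }
}
```

Returning an owned HashMap keeps callers from holding the lock while they iterate or serialize the counts.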

pub fn record_agent_tool_call(&self, agent_id: &str, tool_name: &str)

Increment the call counter for the (agent_id, tool_name) pair by 1.

pub fn record_agent_tool_failure(&self, agent_id: &str, tool_name: &str)

Increment the failure counter for the (agent_id, tool_name) pair by 1.

pub fn per_agent_tool_calls_snapshot(&self) -> HashMap<String, HashMap<String, u64>>

Return a snapshot of per-agent, per-tool call counts.

pub fn per_agent_tool_failures_snapshot(&self) -> HashMap<String, HashMap<String, u64>>

Return a snapshot of per-agent, per-tool failure counts.
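The nested return type can be read as agent_id -> tool_name -> count. The sketch below shows how such a map could be built and queried, assuming the outer key is the agent id, which the signature suggests but does not state. AgentToolCounts, record, and calls_for are hypothetical helpers, not part of this API.

```rust
use std::collections::HashMap;

/// Same shape as the documented return type (assumed: outer key is the agent id).
type AgentToolCounts = HashMap<String, HashMap<String, u64>>;

/// Increment the counter for the (agent_id, tool_name) pair, creating both levels on demand.
fn record(counts: &mut AgentToolCounts, agent_id: &str, tool_name: &str) {
    *counts
        .entry(agent_id.to_owned())
        .or_default()
        .entry(tool_name.to_owned())
        .or_insert(0) += 1;
}

/// Look up one pair in a snapshot, treating a missing agent or tool as zero.
fn calls_for(counts: &AgentToolCounts, agent_id: &str, tool_name: &str) -> u64 {
    counts
        .get(agent_id)
        .and_then(|tools| tools.get(tool_name))
        .copied()
        .unwrap_or(0)
}
```

Treating absent keys as zero avoids indexing panics for an agent or tool that has never been recorded.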

pub fn snapshot(&self) -> MetricsSnapshot

Capture a complete snapshot of all counters, including per-tool breakdowns.

This is the preferred alternative to the deprecated to_snapshot: it returns a named MetricsSnapshot struct instead of an anonymous tuple.

pub fn record_step_latency(&self, ms: u64)

Record a step latency sample.

pub fn reset(&self)

Reset all counters to zero.

Intended for testing. In production the cumulative counters are never reset, so they only increase.

pub fn failure_rate(&self) -> f64

Return the fraction of tool calls that failed: failed / total.

Returns 0.0 if no tool calls have been recorded.

pub fn success_rate(&self) -> f64

Return the fraction of tool calls that succeeded: 1.0 - failure_rate().

Returns 1.0 if no tool calls have been recorded (vacuously all succeeded).
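The two rates are complements, and the empty case falls out of the guard on the failure side. A minimal sketch of the arithmetic, written as free functions over plain integers (hypothetical helpers, not the methods themselves):

```rust
/// Fraction of tool calls that failed; 0.0 when nothing has been recorded.
fn failure_rate(failed: u64, total: u64) -> f64 {
    if total == 0 {
        0.0
    } else {
        failed as f64 / total as f64
    }
}

/// Fraction of tool calls that succeeded; 1.0 when nothing has been recorded,
/// because the guarded failure rate is 0.0 in that case.
fn success_rate(failed: u64, total: u64) -> f64 {
    1.0 - failure_rate(failed, total)
}
```

With 1 failure out of 4 calls this gives a failure rate of 0.25 and a success rate of 0.75.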

pub fn is_active(&self) -> bool

Return true if there is at least one active (in-progress) session.

pub fn step_latency_p50(&self) -> u64

Return the 50th-percentile (median) step latency in milliseconds.

Delegates to LatencyHistogram::p50 on the histogram tracked by this RuntimeMetrics instance. Returns 0 if no steps have been recorded.

pub fn step_latency_p99(&self) -> u64

Return the 99th-percentile step latency in milliseconds.

Delegates to LatencyHistogram::p99. Returns 0 if no steps have been recorded.
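To make the percentile semantics concrete, here is a nearest-rank percentile over raw samples that returns 0 for an empty input. This is only an illustration: LatencyHistogram's internals are not shown on this page, and a bucketed histogram would typically return a bucket boundary rather than an exact sample. percentile_ms is a hypothetical helper.

```rust
/// Nearest-rank percentile over raw millisecond samples.
/// Returns 0 when `samples` is empty, matching the documented empty-case value.
fn percentile_ms(samples: &[u64], pct: f64) -> u64 {
    if samples.is_empty() {
        return 0;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let n = sorted.len();
    // Nearest-rank: rank = ceil(pct / 100 * n), clamped to the range 1..=n.
    let rank = ((pct / 100.0) * n as f64).ceil() as usize;
    sorted[rank.clamp(1, n) - 1]
}
```

For the samples [10, 20, 30, 40, 100] the 50th percentile is 30 and the 99th is 100, which shows how a single slow step dominates the tail while leaving the median untouched.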

pub fn top_tools_by_calls(&self, n: usize) -> Vec<(String, u64)>

Return the top n tools by total call count, sorted descending.

Returns fewer than n entries if fewer tools have been called.

pub fn top_tools_by_failures(&self, n: usize) -> Vec<(String, u64)>

Return the top n tools by total failure count, sorted descending.

Analogous to top_tools_by_calls; returns fewer than n entries if fewer tools have recorded failures.
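A top-n ranking of this kind can be derived from any per-tool count map. The sketch below sorts descending by count and truncates to n. The tie-break by tool name is an assumption added so the output is deterministic; the documentation above does not say how ties are ordered. top_n is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Return up to `n` (tool_name, count) pairs, highest count first.
fn top_n(counts: &HashMap<String, u64>, n: usize) -> Vec<(String, u64)> {
    let mut entries: Vec<(String, u64)> = counts
        .iter()
        .map(|(name, count)| (name.clone(), *count))
        .collect();
    // Descending by count; ties broken by name (an assumption of this sketch).
    entries.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    // `truncate` is a no-op when fewer than `n` tools exist, so the result may be shorter.
    entries.truncate(n);
    entries
}
```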

pub fn total_step_latency_ms(&self) -> u64

Return the sum of all recorded step latencies in milliseconds.

pub fn to_snapshot(&self) -> (usize, u64, u64, u64, u64, u64, u64)

Deprecated since 1.0.3: use snapshot(), which returns the named MetricsSnapshot struct.

Capture a snapshot of global counters as plain integers.

Returns (active_sessions, total_sessions, total_steps, total_tool_calls, failed_tool_calls, backpressure_shed_count, memory_recall_count). For per-tool breakdowns use per_tool_calls_snapshot and per_tool_failures_snapshot.

Deprecation

Prefer snapshot which returns the named MetricsSnapshot struct and includes per-tool, per-agent, and histogram data. This method returns an anonymous tuple whose field order is easy to misread.
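The misreading risk is easy to see in code: six of the seven tuple positions share the type u64, so swapping two of them still compiles. For callers that cannot migrate yet, one option is to name the positions once, at a single site. NamedCounters and name_tuple below are hypothetical. The field order follows the tuple order documented above; the fields of the real MetricsSnapshot are not shown on this page.

```rust
/// Hypothetical named view of the seven tuple positions.
#[derive(Debug, PartialEq)]
struct NamedCounters {
    active_sessions: usize,
    total_sessions: u64,
    total_steps: u64,
    total_tool_calls: u64,
    failed_tool_calls: u64,
    backpressure_shed_count: u64,
    memory_recall_count: u64,
}

/// Destructure the tuple in its documented order and attach names to each position.
fn name_tuple(t: (usize, u64, u64, u64, u64, u64, u64)) -> NamedCounters {
    let (
        active_sessions,
        total_sessions,
        total_steps,
        total_tool_calls,
        failed_tool_calls,
        backpressure_shed_count,
        memory_recall_count,
    ) = t;
    NamedCounters {
        active_sessions,
        total_sessions,
        total_steps,
        total_tool_calls,
        failed_tool_calls,
        backpressure_shed_count,
        memory_recall_count,
    }
}
```

Confining the positional destructuring to one function means the order has to be checked against the documentation only once.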