Module web

Evaluating web stacks for agentic AI use.

Agents do not browse the web the way a human does; they talk to other services over whatever wire format an LLM-native call graph rewards. That workload has its own five axes, distinct from the vms axes (which score where code runs) and from the language/framework axes (which score what agents build). This module scores the wire protocols and service contracts an agent actually has to speak:

  • streaming — does the protocol carry LLM-shaped output (token streams, latents, mid-stream tool calls) as first-class frames, or is streaming a bolt-on on top of a document-oriented base?
  • tool-discoverability — can an agent introspect the available capabilities (tool list, schemas, types) from the protocol itself, or must it read prose?
  • encoding-efficiency — wire compactness for the LLM/tool-call workload (binary framing + content-typed payloads vs. JSON-over-HTTP/1.1 baseline).
  • interop — does the agent ecosystem actually speak this? Network effect: the protocol every SDK already knows is worth more than the “objectively cleaner” one no one targets.
  • security-primitives — does the protocol carry auth, distributed tracing, content integrity, and per-message identity natively, or are they someone-else’s-problem?

Profiles are curated 0.0–1.0 static judgments with evidence, like the languages / frameworks / vms profiles — deterministic, serializable, comparable. Scores reflect each stack’s design center for agent-to-service traffic; a great document-delivery protocol (HTTP+JSON, GraphQL) can rank low for LLM-token streaming and high on interop, and that is the point.

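use agentic_eval::web::{profile, rank_web_stacks, WebStack};
// Every curated profile carries its supporting evidence.
let spine = profile(WebStack::Spine);
assert!(spine.evidence.len() >= 3);
// Ranking is best-first, so the first entry is at least as fit as the last.
let ranked = rank_web_stacks();
assert!(ranked[0].fitness() >= ranked[ranked.len() - 1].fitness());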
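
To make the five-axis scoring concrete, here is a minimal, self-contained sketch. It is not this crate's code: `AxisScores`, `is_valid`, `example_document_stack`, and the numbers are hypothetical, and the unweighted mean used for `fitness` is an assumption (the real `WebStackProfile::fitness` may weight axes differently).

```rust
/// Hypothetical stand-in for a five-axis profile; NOT the crate's `WebStackProfile`.
#[derive(Debug, Clone, PartialEq)]
pub struct AxisScores {
    pub name: &'static str,
    pub streaming: f64,
    pub tool_discoverability: f64,
    pub encoding_efficiency: f64,
    pub interop: f64,
    pub security_primitives: f64,
}

impl AxisScores {
    /// The five axis scores in a fixed, deterministic order.
    pub fn axes(&self) -> [f64; 5] {
        [
            self.streaming,
            self.tool_discoverability,
            self.encoding_efficiency,
            self.interop,
            self.security_primitives,
        ]
    }

    /// Assumption: fitness is the unweighted mean of the five axes.
    pub fn fitness(&self) -> f64 {
        self.axes().iter().sum::<f64>() / 5.0
    }

    /// Every score must lie in the documented 0.0..=1.0 range.
    pub fn is_valid(&self) -> bool {
        self.axes().iter().all(|s| (0.0..=1.0).contains(s))
    }
}

/// Illustrative values only: a document-oriented stack that scores
/// low on streaming but high on interop.
pub fn example_document_stack() -> AxisScores {
    AxisScores {
        name: "document-oriented example",
        streaming: 0.2,
        tool_discoverability: 0.4,
        encoding_efficiency: 0.3,
        interop: 0.9,
        security_primitives: 0.7,
    }
}
```

In this sketch the example stack's interop score exceeds its streaming score, which mirrors the point made above about document-delivery protocols.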

Structs§

WebStackComparison
Compare two stacks a and b: positive deltas mean a fits agentic use better than b.
WebStackProfile
A curated agentic profile of a web stack / wire protocol across the five agent-native axes, with evidence.

Enums§

WebStack
Web stacks / wire protocols with curated agentic profiles.

Functions§

compare_web_stacks
Compare stack a against baseline b across all five axes.
profile
The curated profile for stack (static, documented judgments — see module docs).
profiles
Profiles for all stacks, in WebStack::all order (deterministic).
rank_web_stacks
All profiles ranked best-first by WebStackProfile::fitness (stable order on ties).
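
The comparison and ranking behavior described above (per-axis deltas of a against a baseline b, and a best-first ranking that is stable on ties) can be sketched as follows. This is an illustration under stated assumptions, not the crate's implementation: `Scored`, `axis_deltas`, and `rank_best_first` are hypothetical names, and fitness is assumed to be an unweighted mean.

```rust
use std::cmp::Ordering;

/// Hypothetical scored stack: a name plus five axis scores in 0.0..=1.0.
#[derive(Debug, Clone, PartialEq)]
pub struct Scored {
    pub name: &'static str,
    pub axes: [f64; 5],
}

impl Scored {
    /// Assumption: fitness is the unweighted mean of the axis scores.
    pub fn fitness(&self) -> f64 {
        self.axes.iter().sum::<f64>() / self.axes.len() as f64
    }
}

/// Per-axis deltas of `a` against baseline `b`; a positive entry
/// means `a` scores higher on that axis.
pub fn axis_deltas(a: &Scored, b: &Scored) -> [f64; 5] {
    let mut out = [0.0; 5];
    for i in 0..5 {
        out[i] = a.axes[i] - b.axes[i];
    }
    out
}

/// Rank best-first. `slice::sort_by` is a stable sort, so stacks with
/// equal fitness keep their input order.
pub fn rank_best_first(mut stacks: Vec<Scored>) -> Vec<Scored> {
    stacks.sort_by(|x, y| {
        y.fitness()
            .partial_cmp(&x.fitness())
            .unwrap_or(Ordering::Equal)
    });
    stacks
}
```

A stable sort matters here because it keeps the output deterministic: two stacks with identical fitness always appear in the same relative order as in the input.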