Evaluating web stacks for agentic AI use.
Agents do not browse the web the way a human does; they talk to other services over
whatever wire format an LLM-native call graph rewards. That workload has its
own five axes, distinct from the vms axes (which score where code
runs) and from the language/framework axes (which score what
agents build). This module scores the wire protocols and service
contracts an agent actually has to speak:
- streaming — does the protocol carry LLM-shaped output (token streams, latents, mid-stream tool calls) as first-class frames, or is streaming a bolt-on on top of a document-oriented base?
- tool-discoverability — can an agent introspect the available capabilities (tool list, schemas, types) from the protocol itself, or must it read prose?
- encoding-efficiency — wire compactness for the LLM/tool-call workload (binary framing + content-typed payloads vs. JSON-over-HTTP/1.1 baseline).
- interop — does the agent ecosystem actually speak this? Network effect: the protocol every SDK already knows is worth more than the “objectively cleaner” one no one targets.
- security-primitives — does the protocol carry auth, distributed tracing, content integrity, and per-message identity natively, or are they someone-else’s-problem?
Profiles are curated 0.0–1.0 static judgments with evidence, like the
languages, frameworks, and vms profiles: deterministic, serializable,
and comparable.
Scores reflect each stack’s design center for agent-to-service traffic;
a great document-delivery protocol (HTTP+JSON, GraphQL) can rank low for
LLM-token streaming and high on interop, and that is the point.
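To make the five-axis shape concrete, here is a minimal sketch of what a profile's scores might look like. `AxisScores` and its methods are hypothetical stand-ins, not this crate's types, and the unweighted mean used for `fitness` is an assumption: the module docs do not say how `WebStackProfile::fitness` combines the axes.

```rust
/// Hypothetical stand-in for a five-axis profile (not the crate's type).
/// Field names mirror the axes listed in the module docs.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AxisScores {
    pub streaming: f64,
    pub tool_discoverability: f64,
    pub encoding_efficiency: f64,
    pub interop: f64,
    pub security_primitives: f64,
}

impl AxisScores {
    /// Returns the scores as an array in a fixed axis order.
    pub fn as_array(&self) -> [f64; 5] {
        [
            self.streaming,
            self.tool_discoverability,
            self.encoding_efficiency,
            self.interop,
            self.security_primitives,
        ]
    }

    /// True when every score lies in the documented 0.0..=1.0 range.
    /// NaN fails the check because it is not contained in any range.
    pub fn is_valid(&self) -> bool {
        self.as_array().iter().all(|s| (0.0..=1.0).contains(s))
    }

    /// ASSUMPTION: fitness as the unweighted mean of the five axes.
    /// The real aggregation may weight axes differently.
    pub fn fitness(&self) -> f64 {
        self.as_array().iter().sum::<f64>() / 5.0
    }
}
```

Under this assumed aggregation, a stack that scores low on streaming but high on interop can still land mid-table, which matches the trade-off described above.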
use agentic_eval::web::{profile, rank_web_stacks, WebStack};
let spine = profile(WebStack::Spine);
assert!(spine.evidence.len() >= 3);
let ranked = rank_web_stacks();
assert!(ranked[0].fitness() >= ranked[ranked.len() - 1].fitness());
Structs§
- WebStackComparison
- Compare two stacks: positive deltas mean `a` fits agentic use better.
- WebStackProfile
- A curated agentic profile of a web stack / wire protocol across the five agent-native axes, with evidence.
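The sign convention for a comparison (positive deltas favor `a`) can be sketched as a per-axis subtraction. This is an illustration only: `AXES`, `axis_deltas`, and `axes_favoring_a` are hypothetical names, and the real `WebStackComparison` may store its deltas differently.

```rust
/// Axis names in a fixed order (hypothetical constant, taken from the module docs).
pub const AXES: [&str; 5] = [
    "streaming",
    "tool-discoverability",
    "encoding-efficiency",
    "interop",
    "security-primitives",
];

/// Per-axis deltas computed as `a - b`, so a positive value means
/// `a` scores higher than the baseline `b` on that axis.
pub fn axis_deltas(a: &[f64; 5], b: &[f64; 5]) -> [f64; 5] {
    let mut out = [0.0; 5];
    for i in 0..5 {
        out[i] = a[i] - b[i];
    }
    out
}

/// Names of the axes on which `a` scores strictly higher than `b`.
pub fn axes_favoring_a(a: &[f64; 5], b: &[f64; 5]) -> Vec<&'static str> {
    let deltas = axis_deltas(a, b);
    AXES.iter()
        .zip(deltas.iter())
        .filter(|(_, d)| **d > 0.0)
        .map(|(name, _)| *name)
        .collect()
}
```

Keeping the baseline as the second argument means a reader can scan for negative deltas to see where the candidate regresses.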
Enums§
- WebStack
- Web stacks / wire protocols with curated agentic profiles.
Functions§
- compare_web_stacks
- Compare stack `a` against baseline `b` across all five axes.
- profile
- The curated profile for `stack` (static, documented judgments; see module docs).
- profiles
- Profiles for all stacks, in `WebStack::all` order (deterministic).
- rank_web_stacks
- All profiles ranked best-first by `WebStackProfile::fitness` (stable order on ties).
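A best-first ranking with stable ties, as described for `rank_web_stacks`, can be sketched with a stable descending sort. `Ranked` and `rank_best_first` are hypothetical names for illustration; the crate's actual implementation is not shown in these docs and may differ.

```rust
/// Hypothetical (name, fitness) pair standing in for a full profile.
#[derive(Debug, Clone, PartialEq)]
pub struct Ranked {
    pub name: &'static str,
    pub fitness: f64,
}

/// Sorts best-first by fitness. `sort_by` on a Vec is a stable sort, so
/// entries with equal fitness keep their input order, which keeps the
/// ranking deterministic for a deterministic input order.
pub fn rank_best_first(mut items: Vec<Ranked>) -> Vec<Ranked> {
    // Comparing `b` to `a` (rather than `a` to `b`) yields descending order.
    // `total_cmp` gives a total order over f64, so NaN cannot cause a panic.
    items.sort_by(|a, b| b.fitness.total_cmp(&a.fitness));
    items
}
```

Because ties preserve input order, feeding profiles in a fixed order such as the one `profiles` documents would make the ranked output reproducible across runs.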