agentic-eval 0.14.2

Evaluate programs, CLI commands, programming languages, AI frameworks, and VM/sandbox systems for agentic AI use across four axes — token efficiency, determinism, reliability, and safety — under popular tokenizers (OpenAI GPT-4/GPT-4o, Anthropic Claude). Includes a CLI effect classifier, curated language/framework/VM profiles, and a self-describing ontology.

agentic-eval's sandbox limits

All builds on docs.rs run inside a sandbox with limited resources. The limits for this crate are:

Available RAM: 6.44 GB
Maximum rustdoc execution time: 15 minutes
Maximum size of a build log: 102.4 kB
Network access: blocked
Maximum number of build targets: 10

If a build fails because it hit one of these limits, please open an issue to have the limit increased.