cortex-runtime 1.0.0

Every AI web agent is flying blind.

Your agent needs to find a product under $200 on Amazon. With a browser tool, it launches Chromium, loads the home page, types a search query, waits for results, clicks a product, reads the price, goes back, clicks the next one... 10-30 LLM calls. 20-120 seconds. One site.

The current tools don't work. Browser automation treats every page as a blank canvas — no structure, no context, no memory of where anything is. Scraping frameworks extract text but can't navigate. APIs exist for ~3% of websites. Your agent spends 90% of its tokens on finding things, not using them.

Cortex maps entire websites into navigable binary graphs in seconds — via HTTP, no browser needed. Your agent gets a complete picture: every page classified, every product priced, every link mapped. Query in microseconds. Navigate by graph, not by pixel.

cortex map amazon.com                  # Map the entire site (3-15 seconds, HTTP-only)
cortex compile amazon.com              # Generate typed Python/TS/GraphQL/OpenAPI clients
cortex wql "SELECT name, price FROM Product WHERE price < 200 ORDER BY price ASC LIMIT 10"

Three commands. Zero browser launches. Zero LLM calls. Structured data from any website.

Benchmarks

Rust core. HTTP-first acquisition. Binary graph format. Real numbers from cargo test --release:

Operation	Time	Scale
Filter by page type	<1 µs	10,008 nodes
Filter + feature range	1 µs	10,008 nodes
WQL full pipeline	86 µs	10,008 nodes
Add node	1.7 µs	O(1) amortized
Add edge	3.2 µs	O(1) amortized
Pathfind (Dijkstra)	884 µs	10,008 nodes
Schema inference	96.9 ms	10,008 nodes
Serialize 10K nodes	134.9 ms	5.9 MB output
Deserialize 10K nodes	212 ms	5.9 MB input
Similarity search (top-10)	42.7 ms	10,008 nodes, 128-dim cosine

All benchmarks on Apple M4 Pro, 64 GB, macOS 26.2, Rust 1.90.0 --release. Binary format: ~600 bytes/node, 2.7x compression vs JSON, CRC32 integrity checks.

	Playwright	Puppeteer	Selenium	Browserbase	Cortex
Browser required	Always	Always	Always	Cloud	Never*
Package size	~280 MB	~300 MB	~350 MB	~0 (cloud)	17 MB
Site-level graph	No	No	No	No	Yes
Map a site	20-120s	20-120s	20-120s	20-120s	3-15s
Query latency	N/A	N/A	N/A	N/A	<1 µs
LLM calls to navigate	10-30	10-30	10-30	10-30	0
Structured data	No	No	No	No	Primary
Idle memory	~500 MB	~500 MB	~500 MB	0 (cloud)	~20 MB

*93% of sites mapped via HTTP only. Browser fallback available for the remaining ~5%.

Cortex was tested against 100 real production websites across 10 categories:

Category	Avg Score	Sites
Documentation	94.2	10
Financial	92.2	5
SPA / JS-heavy	91.0	10
Government	90.6	10
News / media	87.0	10
Social	80.9	10
E-commerce	77.5	15
Travel	76.7	10
Overall	85.3	100

80 of 100 sites score 80+. Primary failure modes: bot detection (10 sites), client-rendered SPAs with no structured data (4 sites).

Why Cortex

Web agents shouldn't need a browser. 93% of websites serve structured data (JSON-LD, OpenGraph, Schema.org, platform APIs) alongside their HTML. Cortex extracts it directly via HTTP. No rendering. No screenshots. No pixel coordinates.

Map once, query forever. A SiteMap is a binary graph — every page typed, every feature extracted, every link mapped. Once mapped, queries resolve in microseconds. Filter by page type, price range, rating. Pathfind between any two pages. Find similar products by vector similarity.

Any website becomes a typed API. The Web Compiler infers schemas from the binary map and generates client libraries in Python, TypeScript, OpenAPI, GraphQL, and MCP formats. One command turns amazon.com into AmazonClient.products.filter(price_lt=200).

SQL for websites. WQL (Web Query Language) brings familiar SQL syntax to web data: SELECT name, price FROM Product WHERE price < 200 ORDER BY rating DESC. Works across domains. No learning curve.

Works with every AI agent. One command (cortex plug) auto-discovers Claude Desktop, Claude Code, Cursor, Windsurf, Continue, and Cline — and injects 9 Cortex tools via MCP. Your agent gains web cartography without configuration.

How It Works

Map — Cortex maps a domain via layered HTTP acquisition: sitemap/robots.txt discovery, HTTP GET with JSON-LD/OpenGraph extraction, CSS-selector pattern engine, and platform API discovery. Each page is encoded into a 128-dimension feature vector. No browser needed — browser is a last-resort fallback for the ~5% of pages with no structured data.
Compile — The Web Compiler infers typed schemas from the binary map and generates client libraries in 5 formats (Python, TypeScript, OpenAPI, GraphQL, MCP). One command turns any website into a typed API.
Query — Use WQL (Web Query Language) — SQL for websites — to filter, sort, and compare data across domains. Or query the map directly by page type, feature ranges, or vector similarity. Pathfind between any two nodes.
Track — Push maps to the collective registry and track changes over time. Temporal intelligence detects trends, predicts future values, and sends alerts when conditions are met.
Act — Execute actions on live pages. Many actions (add-to-cart, search, form submission) work via HTTP POST. Complex interactions fall back to browser sessions.

The binary .ctx format uses fixed-size node records (~600 bytes/node), 128-dimensional feature vectors, indexed edge tables, and CRC32 integrity checks. 2.7x smaller than JSON. O(1) node access.

Range	Category	Key Features
0-15	Page identity	page_type, confidence, depth, load_time
16-47	Content metrics	text_density, heading_count, image_count, link_density
48-63	Commerce	price (USD), discount, availability, rating, review_count
64-79	Navigation	outbound_links, pagination, breadcrumb_depth, filters, sort_options
80-95	Trust/safety	TLS, domain_age, PII_exposure, tracker_count, authority_score
96-111	Actions	action_count, safe/cautious/destructive ratios, form_completeness
112-127	Session	login_state, session_duration, cart_value (zeroed for privacy sharing)

Layer	Method	Coverage	Cost
0	Discovery (robots.txt, sitemap.xml)	~90% URLs	3-5 HTTP requests
1	Structured data (JSON-LD, OpenGraph)	93% sites	1 GET per page
1.5	Pattern engine (CSS selectors)	93% sites	0 (in-memory)
2	API discovery (REST endpoints)	31% sites	1-3 probes
3	Browser fallback (Chromium)	<5% pages	2-5s per page

Install

Rust runtime (required):

cargo install cortex-runtime

One-liner (macOS / Linux):

curl -fsSL https://cortex.dev/install | bash

Python client:

pip install cortex-client

TypeScript client:

npm install @cortex-ai/client

Auto-connect all AI agents (Claude, Cursor, Windsurf, Continue, Cline):

cortex plug

Docker (no local install):

docker run -p 7700:7700 cortex-ai/cortex:lite    # 25 MB, HTTP-only
docker run -p 7700:7700 cortex-ai/cortex:full    # 350 MB, includes Chromium

No manual setup needed. cortex map handles Chromium installation and daemon lifecycle automatically on first run. Run cortex doctor to diagnose issues. See INSTALL.md for detailed instructions, platform-specific notes, and troubleshooting.

Quickstart

Map, compile, and query — 3 commands

cortex map amazon.com
cortex compile amazon.com
cortex wql "SELECT name, price FROM Product WHERE price < 200 ORDER BY price ASC LIMIT 10"

Python client

import cortex_client

# Map a website into a navigable graph
site = cortex_client.map("amazon.com")
print(f"Mapped {len(site.nodes)} pages, {len(site.edges)} links")

# Find products under $300
products = site.filter(page_type=0x04, features={48: {"lt": 300}}, limit=10)
for p in products:
    print(f"  {p.url}  price={p.features[48]}")

# Pathfind between pages
path = site.pathfind(0, products[0].index)
print(f"Path: {' → '.join(str(n) for n in path.nodes)}")

TypeScript client

import { map } from "@cortex-ai/client";

const site = await map("amazon.com");
console.log(`Mapped ${site.nodeCount} pages, ${site.edgeCount} links`);

const products = await site.filter({
  pageType: 4,
  features: { 48: { lt: 300 } },
  limit: 10,
});

MCP (Claude, Cursor, Windsurf)

cortex plug    # Auto-discovers agents, injects 9 tools via MCP

Then ask your agent: "Map amazon.com and find laptops under $500" — it uses the cortex_map and cortex_query tools automatically.

Compare across sites

result = cortex_client.compare(["amazon.com", "ebay.com", "walmart.com"])
print(result.common_types)   # Shared page types
print(result.unique_types)   # Per-site unique types

See docs/quickstart.md for a full 5-minute walkthrough.

CLI Commands

Command	Description
`cortex map <domain>`	Map a website into a binary graph
`cortex compile <domain>`	Generate typed clients (Python, TS, OpenAPI, GraphQL, MCP)
`cortex wql "<query>"`	Execute a WQL query (SQL for websites)
`cortex query <domain>`	Search a mapped site by type/features
`cortex pathfind <domain>`	Find shortest path between nodes
`cortex perceive <url>`	Analyze a single live page
`cortex history <domain> <url>`	Query temporal price/feature history
`cortex patterns <domain> <url>`	Detect trends, anomalies, periodicity
`cortex registry list\|stats\|gc`	Manage the collective map registry
`cortex plug`	Auto-discover AI agents and inject Cortex tools
`cortex doctor`	Check environment and diagnose issues
`cortex start` / `stop` / `status`	Manage the background daemon
`cortex install`	Download Chromium for Testing
`cortex cache clear`	Clear cached maps
`cortex completions <shell>`	Generate shell completions (bash/zsh/fish)

Global flags: --json --quiet --verbose --no-color

Ecosystem

Component	Package	Install
Runtime	`cortex-runtime`	`cargo install cortex-runtime`
Python client	`cortex-client`	`pip install cortex-client`
TypeScript client	`@cortex-ai/client`	`npm install @cortex-ai/client`
MCP server	`@cortex/mcp-server`	`cortex plug` (auto)
LangChain	`cortex-langchain`	`pip install cortex-langchain`
CrewAI	`cortex-crewai`	`pip install cortex-crewai`
AutoGen	`cortex-autogen`	`pip install cortex-autogen`
Semantic Kernel	`cortex-semantic-kernel`	`pip install cortex-semantic-kernel`
OpenClaw	`cortex-openclaw`	`pip install cortex-openclaw`
Docker (lite)	`cortex-ai/cortex:lite`	25 MB, HTTP-only
Docker (full)	`cortex-ai/cortex:full`	350 MB, includes Chromium

Validation

Suite	Tests	Notes
Rust core engine	375	Unit + integration + benchmarks
Conformance tests	10	JSON-driven cross-platform validation
Gateway integration	12	MCP, REST, client, adapter tests
Total	397	All passing

Gateway integration score: 90/100 — MCP server 100%, REST API 90%, Python client 88%, framework adapters 73%.

Research paper: 10 pages, 8 figures, 13 tables, 22 references — all real benchmark data

Project Structure

cortex/
├── runtime/              # Rust runtime (the core engine)
│   └── src/
│       ├── map/              # SiteMap types, builder, serializer, reader
│       ├── acquisition/      # HTTP-first data acquisition (JSON-LD, patterns, actions)
│       ├── cartography/      # Classification, feature encoding, mapping
│       ├── navigation/       # Query, pathfind, similarity, clustering
│       ├── compiler/         # Web Compiler (schema inference, 5-target codegen)
│       ├── wql/              # WQL parser, planner, executor
│       ├── collective/       # Collective graph, delta sync, registry
│       ├── temporal/         # History store, pattern detection, predictions
│       ├── live/             # Perceive, refresh, act, watch, sessions
│       ├── intelligence/     # Caching, progressive refinement
│       ├── trust/            # Credentials, PII, sanitization
│       ├── audit/            # JSONL logging, remote sync
│       └── cli/              # CLI commands
├── clients/
│   ├── python/           # Python thin client (cortex-client)
│   └── typescript/       # TypeScript thin client (@cortex-ai/client)
├── integrations/
│   ├── mcp-server/       # MCP server for Claude, Cursor, Windsurf, Continue, Cline
│   ├── langchain/        # LangChain adapter
│   ├── crewai/           # CrewAI adapter
│   ├── autogen/          # AutoGen adapter
│   ├── semantic-kernel/  # Semantic Kernel plugin
│   └── openclaw/         # OpenClaw skills
├── examples/             # 11 runnable example scripts
├── publication/          # Research paper (LaTeX + PDF)
├── assets/               # SVG diagrams and visuals
└── docs/                 # Guides, cookbooks, and reference

Guides

Guide	Description
Quickstart	5-minute onboarding — map, compile, query
Concepts	SiteMap format, feature vectors, acquisition layers
API Reference	Python, TypeScript, CLI, REST, and MCP APIs
Benchmarks	Detailed benchmark methodology and data
Integration Guide	MCP setup, framework adapters, Docker
Web Compiler	Turn any website into a typed API
WQL	SQL-like query language for web data
Temporal Intelligence	Track changes, detect trends, predict values
Collective Graph	Share and sync maps across agents
FAQ	Common questions and comparisons
Limitations	Honest list of v1.0 constraints

Known Limitations

Cortex v1.0 has 17 documented limitations including: ~5% of pages require browser fallback, ~10-15% of actions need browser execution, CAPTCHA solving not supported, currency not normalized, and WQL doesn't support JOINs or subqueries yet. We believe transparency about limitations builds more trust than hiding them.

Contributing

See CONTRIBUTING.md. The fastest ways to help:

Try it and file issues
Add a platform pattern — contribute CSS selectors or API patterns
Write an example — show a real use case with your framework
Improve docs — every clarification helps someone