Core runtime for AI agent execution with integrated safety and cost controls.
Provides agent lifecycle management, Python bindings for LangChain/CrewAI integration, and a local LLM proxy for request interception. Orchestrates all Iron Runtime subsystems (budget, PII detection, analytics, circuit breakers).
§Purpose
This crate is the execution engine for Iron Runtime:
- Agent lifecycle management (spawn, monitor, stop agents)
- Python-Rust bridge via PyO3 for seamless Python integration
- LLM Router: Local proxy intercepting OpenAI/Anthropic API calls
- Integrated safety controls (PII detection, budget enforcement)
- Real-time metrics and state management
- Dashboard integration via REST API and WebSocket
§Architecture
Iron Runtime uses a modular architecture with a clear separation of concerns:
§Core Components
- Agent Runtime: Manages agent processes and lifecycle
- PyO3 Bridge: Exposes Rust runtime to Python as the iron_cage module
- LLM Router: Transparent proxy for LLM API requests
- State Manager: Persists agent state and metrics
- Telemetry: Structured logging for all operations
§Integration Layer
The runtime coordinates the following modules:
- iron_cost: Budget validation before LLM requests
- iron_safety: PII scanning on LLM responses
- iron_runtime_analytics: Event tracking for dashboard
- iron_reliability: Circuit breakers for provider failures
- iron_runtime_state: Agent state persistence
§Execution Flow
Python Agent Script
↓
PyO3 Bridge (iron_cage module)
↓
Agent Runtime (spawn/monitor)
↓
LLM Router (intercept API calls)
↓
Safety Pipeline:
1. Budget check (iron_cost)
2. Circuit breaker check (iron_reliability)
3. Forward to LLM provider
4. PII detection on response (iron_safety)
5. Record analytics (iron_runtime_analytics)
6. Return to agent
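A minimal sketch of this ordering, using hypothetical stand-in types rather than the real subsystem APIs:

// Illustrative pipeline ordering only; every type and method below is a
// hypothetical stand-in, not the real subsystem API.
struct Budget { limit: f64, spent: f64 }
struct Breaker { open: bool }
struct Request { estimated_cost: f64, body: String }
struct Response { body: String }

enum RouteError { BudgetExceeded, ProviderUnavailable }

fn route(req: &Request, budget: &mut Budget, breaker: &Breaker) -> Result<Response, RouteError> {
    // 1. Budget check (iron_cost): block before any money is spent
    if budget.spent + req.estimated_cost > budget.limit {
        return Err(RouteError::BudgetExceeded);
    }
    // 2. Circuit breaker check (iron_reliability): fast-fail on bad providers
    if breaker.open {
        return Err(RouteError::ProviderUnavailable);
    }
    // 3. Forward to the LLM provider (stubbed here)
    let mut resp = Response { body: format!("reply to: {}", req.body) };
    budget.spent += req.estimated_cost;
    // 4. PII detection on the response (iron_safety, stubbed here)
    resp.body = resp.body.replace("123-45-6789", "[REDACTED]");
    // 5. Record analytics (iron_runtime_analytics) would happen here
    // 6. Return to the agent
    Ok(resp)
}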
§Key Types
- AgentRuntime - Main runtime managing agent lifecycle
- RuntimeConfig - Runtime configuration (budget, verbosity)
- AgentHandle - Handle to running agent for control
- pyo3_bridge::Runtime - Python-exposed runtime class
- llm_router::LlmRouter - Local LLM proxy server
§Public API
§Rust API
use iron_runtime::{AgentRuntime, RuntimeConfig};
use std::path::Path;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Configure runtime
    let config = RuntimeConfig {
        budget: 100.0, // $100 budget
        verbose: true,
    };

    // Create runtime
    let runtime = AgentRuntime::new(config);

    // Start agent from Python script
    let handle = runtime.start_agent(Path::new("agent.py")).await?;
    println!("Agent started: {}", handle.agent_id.as_str());

    // Monitor metrics
    if let Some(metrics) = runtime.get_metrics(handle.agent_id.as_str()) {
        println!("Budget spent: ${}", metrics.budget_spent);
        println!("PII detections: {}", metrics.pii_detections);
    }

    // Stop agent
    runtime.stop_agent(handle.agent_id.as_str()).await?;
    Ok(())
}
§Python API
Python agents import the iron_cage module for integrated controls:
from iron_cage import Runtime, LlmRouter
from langchain.agents import AgentExecutor
from langchain_openai import ChatOpenAI
# Create runtime with budget
runtime = Runtime(budget=100.0, verbose=True)
# Start LLM router (intercepts API calls)
router = LlmRouter(port=8000)
router.start()
# Point LangChain to local router instead of OpenAI directly
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-key"  # Forwarded to real provider
)
# All LLM calls now go through Iron Runtime safety pipeline
agent = AgentExecutor(llm=llm, ...)
result = agent.run("Process this data...")
# Get metrics (assumes agent_id was obtained when the agent was started)
metrics = runtime.get_metrics(agent_id)
print(f"Budget spent: ${metrics['budget_spent']}")
print(f"PII detections: {metrics['pii_detections']}")
# Stop when done
runtime.stop_agent(agent_id)
router.stop()
§LLM Router Usage
The LLM Router acts as a transparent proxy:
from iron_cage import LlmRouter
# Start router on port 8000
router = LlmRouter(port=8000)
router.start()
# Now any HTTP client can use it
# Point your LLM library to: http://localhost:8000/v1
# Router supports:
# - OpenAI API format (/v1/chat/completions)
# - Anthropic API format (/v1/messages)
# - Streaming responses
# - Budget enforcement
# - PII detection
# - Request tracing
§Python Integration
§PyO3 Module
Iron Runtime compiles to a Python extension module iron_cage.so:
# Build Python module
maturin develop --release
# Import in Python
import iron_cage
runtime = iron_cage.Runtime(budget=100.0)
§LangChain Integration
Seamless integration with LangChain agents:
from langchain.agents import initialize_agent, Tool
from langchain_openai import ChatOpenAI
from iron_cage import Runtime, LlmRouter
# Setup Iron Runtime
runtime = Runtime(budget=50.0)
router = LlmRouter(port=8000)
router.start()
# Configure LangChain to use local router
llm = ChatOpenAI(base_url="http://localhost:8000/v1")
# Create agent with Iron Runtime controls
tools = [Tool(name="search", func=search_function, ...)]
agent = initialize_agent(tools, llm, agent="zero-shot-react-description")
# All LLM calls automatically protected by Iron Runtime
result = agent.run("Research topic and generate report")
§CrewAI Integration
Works with multi-agent frameworks such as CrewAI:
from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI
from iron_cage import Runtime, LlmRouter
runtime = Runtime(budget=100.0)
router = LlmRouter(port=8000)
router.start()
llm = ChatOpenAI(base_url="http://localhost:8000/v1")
# Create crew with protected LLM
agent = Agent(role="Researcher", llm=llm, ...)
task = Task(description="...", agent=agent)
crew = Crew(agents=[agent], tasks=[task])
# Execute with Iron Runtime protection
result = crew.kickoff()
§Safety Controls
Runtime enforces multiple safety layers:
§Budget Enforcement
- Pre-request budget validation
- Request blocked if budget exceeded
- Real-time cost tracking
- Budget alerts at configurable thresholds
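A minimal sketch of that logic, assuming a hypothetical tracker type rather than the real iron_cost API:

// Hypothetical budget tracker; the real iron_cost types may differ.
struct BudgetTracker {
    limit: f64,
    spent: f64,
    alert_thresholds: Vec<f64>, // e.g. [0.5, 0.9]
}

impl BudgetTracker {
    // Returns true if the request may proceed; fires alerts as thresholds pass.
    fn try_spend(&mut self, cost: f64) -> bool {
        if self.spent + cost > self.limit {
            return false; // request blocked: budget exceeded
        }
        let before = self.spent / self.limit;
        self.spent += cost;
        let after = self.spent / self.limit;
        for &t in &self.alert_thresholds {
            if before < t && after >= t {
                eprintln!("budget alert: {:.0}% of ${} used", t * 100.0, self.limit);
            }
        }
        true
    }
}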
§PII Detection
- Scans all LLM responses for PII
- Automatic redaction of sensitive data
- Compliance audit logging
- Configurable detection patterns
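As an illustration, a single regex-based redaction pass might look like the following (the regex crate and the SSN pattern are assumptions; iron_safety's detector and pattern set may differ):

use regex::Regex;

// Hypothetical redaction pass; real detectors apply many configurable patterns.
fn redact(text: &str) -> String {
    let ssn = Regex::new(r"\b\d{3}-\d{2}-\d{4}\b").unwrap();
    ssn.replace_all(text, "[REDACTED-SSN]").into_owned()
}

fn main() {
    let resp = "Customer SSN is 123-45-6789.";
    println!("{}", redact(resp)); // Customer SSN is [REDACTED-SSN].
}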
§Circuit Breakers
- Detects failing LLM providers
- Fast-fail on known-bad endpoints
- Automatic recovery after timeout
- Per-provider state isolation
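A minimal sketch of that open/closed state machine, with hypothetical types rather than the iron_reliability API:

use std::time::{Duration, Instant};

// Hypothetical per-provider breaker; iron_reliability's real design may differ.
enum State {
    Closed { failures: u32 },
    Open { since: Instant },
}

struct Breaker {
    state: State,
    threshold: u32,     // consecutive failures before opening
    recovery: Duration, // how long to stay open
}

impl Breaker {
    // Fast-fail while open; automatically close again after the timeout.
    fn allow(&mut self) -> bool {
        if let State::Open { since } = self.state {
            if since.elapsed() >= self.recovery {
                self.state = State::Closed { failures: 0 };
                true
            } else {
                false
            }
        } else {
            true
        }
    }

    fn record_failure(&mut self) {
        if let State::Closed { failures } = &mut self.state {
            *failures += 1;
            if *failures >= self.threshold {
                self.state = State::Open { since: Instant::now() };
            }
        }
    }
}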
§Feature Flags
- enabled - Enable full runtime (disabled for library-only builds)
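A sketch of how such a flag is typically consumed; the gated function and version number are illustrative:

// Cargo.toml of a consumer (version illustrative):
//   iron_runtime = { version = "0.1", features = ["enabled"] }

#[cfg(feature = "enabled")]
fn describe_build() -> &'static str {
    "full runtime: agent spawning, LLM router, safety pipeline"
}

#[cfg(not(feature = "enabled"))]
fn describe_build() -> &'static str {
    "library-only build: types available, runtime disabled"
}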
§Performance
Runtime overhead on LLM requests:
- Budget check: <1ms
- PII detection: <5ms per KB
- Circuit breaker check: <0.1ms
- Analytics recording: <0.5ms
- Total proxy overhead: <10ms per request
Streaming responses have near-zero buffering latency.
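One way to achieve this, sketched with hypothetical channel plumbing: forward each chunk as it arrives and scan incrementally, so nothing waits on the full response (the per-chunk scan is a stand-in for the real pipeline):

use tokio::sync::mpsc;

// Hypothetical streaming passthrough; the real router's internals may differ.
async fn forward_stream(
    mut upstream: mpsc::Receiver<String>, // chunks from the provider
    downstream: mpsc::Sender<String>,     // chunks to the agent
) {
    while let Some(chunk) = upstream.recv().await {
        // Scan each chunk as it arrives instead of buffering the response.
        let safe = chunk.replace("123-45-6789", "[REDACTED]"); // stand-in scan
        if downstream.send(safe).await.is_err() {
            break; // agent disconnected
        }
    }
}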
§Development Status
Current implementation status:
- ✓ Agent lifecycle management
- ✓ PyO3 module structure
- ✓ State management
- ✓ Telemetry integration
- ⏳ LLM Router implementation (in progress)
- ⏳ Async PyO3 bridge (planned)
- ⏳ Full safety pipeline integration (planned)
Modules§
- llm_router - LLM Router - Local proxy for LLM API requests
- pyo3_bridge
Structs§
- AgentHandle - Agent runtime handle
- AgentRuntime - Main agent runtime
- RuntimeConfig - Runtime configuration