Crate iron_runtime

Core runtime for AI agent execution with integrated safety and cost controls.

Provides agent lifecycle management, Python bindings for LangChain/CrewAI integration, and a local LLM proxy for request interception. Orchestrates all Iron Runtime subsystems (budget, PII detection, analytics, circuit breakers).

§Purpose

This crate is the execution engine for Iron Runtime:

  • Agent lifecycle management (spawn, monitor, stop agents)
  • Python-Rust bridge via PyO3 for seamless Python integration
  • LLM Router: Local proxy intercepting OpenAI/Anthropic API calls
  • Integrated safety controls (PII detection, budget enforcement)
  • Real-time metrics and state management
  • Dashboard integration via REST API and WebSocket

§Architecture

Iron Runtime uses a modular architecture with a clear separation of concerns:

§Core Components

  1. Agent Runtime: Manages agent processes and lifecycle
  2. PyO3 Bridge: Exposes Rust runtime to Python as iron_cage module
  3. LLM Router: Transparent proxy for LLM API requests
  4. State Manager: Persists agent state and metrics
  5. Telemetry: Structured logging for all operations

§Integration Layer

The runtime coordinates the following sibling crates:

  • iron_cost: Budget validation before LLM requests
  • iron_safety: PII scanning on LLM responses
  • iron_runtime_analytics: Event tracking for dashboard
  • iron_reliability: Circuit breakers for provider failures
  • iron_runtime_state: Agent state persistence

§Execution Flow

Python Agent Script
       ↓
PyO3 Bridge (iron_cage module)
       ↓
Agent Runtime (spawn/monitor)
       ↓
LLM Router (intercept API calls)
       ↓
Safety Pipeline:
  1. Budget check (iron_cost)
  2. Circuit breaker check (iron_reliability)
  3. Forward to LLM provider
  4. PII detection on response (iron_safety)
  5. Record analytics (iron_runtime_analytics)
  6. Return to agent
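
The same ordering, expressed as a short Rust sketch. Every type and helper below is a hypothetical stand-in for the subsystems named above, not the crate's internal API:

struct LlmRequest {
    provider: String,
    body: String,
}

struct LlmResponse {
    body: String,
}

#[derive(Debug)]
struct ProxyError(String);

// Stand-ins for the integrated subsystems (all no-ops here):
fn check_budget(_req: &LlmRequest) -> Result<(), ProxyError> { Ok(()) }        // iron_cost
fn breaker_allows(_provider: &str) -> Result<(), ProxyError> { Ok(()) }        // iron_reliability
fn scan_pii(resp: LlmResponse) -> Result<LlmResponse, ProxyError> { Ok(resp) } // iron_safety
fn record_analytics(_resp: &LlmResponse) {}                                    // iron_runtime_analytics

async fn forward_to_provider(req: LlmRequest) -> Result<LlmResponse, ProxyError> {
    Ok(LlmResponse { body: req.body }) // stand-in for the real provider call
}

async fn handle_llm_request(req: LlmRequest) -> Result<LlmResponse, ProxyError> {
    check_budget(&req)?;                        // 1. block if the budget is exceeded
    breaker_allows(&req.provider)?;             // 2. fast-fail known-bad providers
    let resp = forward_to_provider(req).await?; // 3. forward to the LLM provider
    let resp = scan_pii(resp)?;                 // 4. detect/redact PII in the response
    record_analytics(&resp);                    // 5. record analytics for the dashboard
    Ok(resp)                                    // 6. return the response to the agent
}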

§Key Types

  • AgentRuntime: Main agent runtime
  • AgentHandle: Handle to a running agent
  • RuntimeConfig: Runtime configuration (budget, verbosity)

§Public API

§Rust API

use iron_runtime::{AgentRuntime, RuntimeConfig};
use std::path::Path;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Configure runtime
    let config = RuntimeConfig {
        budget: 100.0, // $100 budget
        verbose: true,
    };

    // Create runtime
    let runtime = AgentRuntime::new(config);

    // Start agent from Python script
    let handle = runtime.start_agent(Path::new("agent.py")).await?;
    println!("Agent started: {}", handle.agent_id.as_str());

    // Monitor metrics
    if let Some(metrics) = runtime.get_metrics(handle.agent_id.as_str()) {
        println!("Budget spent: ${}", metrics.budget_spent);
        println!("PII detections: {}", metrics.pii_detections);
    }

    // Stop agent
    runtime.stop_agent(handle.agent_id.as_str()).await?;
    Ok(())
}

§Python API

Python agents import the iron_cage module for integrated controls:

from iron_cage import Runtime, LlmRouter
from langchain.agents import AgentExecutor
from langchain_openai import ChatOpenAI

# Create runtime with budget
runtime = Runtime(budget=100.0, verbose=True)

# Start LLM router (intercepts API calls)
router = LlmRouter(port=8000)
router.start()

# Point LangChain to local router instead of OpenAI directly
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-key"  # Forwarded to real provider
)

# All LLM calls now go through Iron Runtime safety pipeline
agent = AgentExecutor(llm=llm, ...)
result = agent.run("Process this data...")

# Get metrics (agent_id identifies the running agent)
metrics = runtime.get_metrics(agent_id)
print(f"Budget spent: ${metrics['budget_spent']}")
print(f"PII detections: {metrics['pii_detections']}")

# Stop when done
runtime.stop_agent(agent_id)
router.stop()

§LLM Router Usage

The LLM Router acts as a transparent proxy:

from iron_cage import LlmRouter

# Start router on port 8000
router = LlmRouter(port=8000)
router.start()

# Now any HTTP client can use it
# Point your LLM library to: http://localhost:8000/v1
# Router supports:
# - OpenAI API format (/v1/chat/completions)
# - Anthropic API format (/v1/messages)
# - Streaming responses
# - Budget enforcement
# - PII detection
# - Request tracing
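
Because the router speaks the standard provider wire formats, any HTTP client can call it directly. A minimal Rust sketch using reqwest (with the json feature), tokio, and serde_json; the model name and API key are illustrative, and it assumes the router above is already listening on port 8000:

use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    // Send a chat completion to the local router instead of the provider.
    let client = reqwest::Client::new();
    let resp = client
        .post("http://localhost:8000/v1/chat/completions")
        .bearer_auth("your-key") // forwarded to the real provider
        .json(&json!({
            "model": "gpt-4o-mini", // illustrative model name
            "messages": [{ "role": "user", "content": "Hello" }]
        }))
        .send()
        .await?;

    println!("status: {}", resp.status());
    println!("body: {}", resp.text().await?);
    Ok(())
}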

§Python Integration

§PyO3 Module

Iron Runtime compiles to a Python extension module iron_cage.so:

# Build Python module
maturin develop --release

# Import in Python
import iron_cage
runtime = iron_cage.Runtime(budget=100.0)

§LangChain Integration

Seamless integration with LangChain agents:

from langchain.agents import initialize_agent, Tool
from langchain_openai import ChatOpenAI
from iron_cage import Runtime, LlmRouter

# Setup Iron Runtime
runtime = Runtime(budget=50.0)
router = LlmRouter(port=8000)
router.start()

# Configure LangChain to use local router
llm = ChatOpenAI(base_url="http://localhost:8000/v1")

# Create agent with Iron Runtime controls
tools = [Tool(name="search", func=search_function, ...)]
agent = initialize_agent(tools, llm, agent="zero-shot-react-description")

# All LLM calls automatically protected by Iron Runtime
result = agent.run("Research topic and generate report")

§CrewAI Integration

Works with the CrewAI multi-agent framework:

from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI
from iron_cage import Runtime, LlmRouter

runtime = Runtime(budget=100.0)
router = LlmRouter(port=8000)
router.start()

llm = ChatOpenAI(base_url="http://localhost:8000/v1")

# Create crew with protected LLM
agent = Agent(role="Researcher", llm=llm, ...)
task = Task(description="...", agent=agent)
crew = Crew(agents=[agent], tasks=[task])

# Execute with Iron Runtime protection
result = crew.kickoff()

§Safety Controls

Runtime enforces multiple safety layers:

§Budget Enforcement

  • Pre-request budget validation
  • Request blocked if budget exceeded
  • Real-time cost tracking
  • Budget alerts at configurable thresholds
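
A minimal sketch of the pre-request gate described above. BudgetGuard and its fields are hypothetical illustrations of the behaviour, not the iron_cost API:

struct BudgetGuard {
    limit_usd: f64,
    spent_usd: f64,
    alert_threshold: f64, // e.g. 0.8 = alert at 80% of the budget
}

impl BudgetGuard {
    /// Called before the request is forwarded; an error blocks the request.
    fn check(&self, estimated_cost_usd: f64) -> Result<(), String> {
        let projected = self.spent_usd + estimated_cost_usd;
        if projected > self.limit_usd {
            return Err(format!("budget exceeded: ${projected:.2} > ${:.2}", self.limit_usd));
        }
        if projected > self.limit_usd * self.alert_threshold {
            eprintln!("budget alert: ${projected:.2} of ${:.2}", self.limit_usd);
        }
        Ok(())
    }

    /// Called after the provider responds, with the actual cost.
    fn record(&mut self, actual_cost_usd: f64) {
        self.spent_usd += actual_cost_usd;
    }
}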

§PII Detection

  • Scans all LLM responses for PII
  • Automatic redaction of sensitive data
  • Compliance audit logging
  • Configurable detection patterns
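
A minimal sketch of response-side scanning and redaction as described above, using the regex crate. The patterns and the redact_pii function are hypothetical examples, not the iron_safety API:

use regex::Regex;

/// Returns the redacted text and the number of detections.
fn redact_pii(response_text: &str) -> (String, usize) {
    // Example "configurable patterns": SSN-like and email-like strings.
    let patterns = [
        Regex::new(r"\b\d{3}-\d{2}-\d{4}\b").unwrap(),
        Regex::new(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b").unwrap(),
    ];

    let mut redacted = response_text.to_string();
    let mut detections = 0;
    for pattern in &patterns {
        detections += pattern.find_iter(&redacted).count();
        let replaced = pattern.replace_all(&redacted, "[REDACTED]").into_owned();
        redacted = replaced;
    }
    // A real pipeline would also write a compliance audit log entry here.
    (redacted, detections)
}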

§Circuit Breakers

  • Detects failing LLM providers
  • Fast-fail on known-bad endpoints
  • Automatic recovery after timeout
  • Per-provider state isolation
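
A minimal sketch of the per-provider breaker behaviour described above (one breaker instance would be kept per provider). The state machine and thresholds are hypothetical, not the iron_reliability API:

use std::time::{Duration, Instant};

enum BreakerState {
    Closed { consecutive_failures: u32 },
    Open { since: Instant },
}

struct CircuitBreaker {
    state: BreakerState,
    failure_threshold: u32,
    recovery_timeout: Duration,
}

impl CircuitBreaker {
    /// Fast-fail while the provider is known bad and the timeout has not elapsed.
    fn allow_request(&mut self) -> bool {
        match self.state {
            BreakerState::Closed { .. } => true,
            BreakerState::Open { since } => {
                if since.elapsed() >= self.recovery_timeout {
                    // Probe recovery by letting traffic through again.
                    self.state = BreakerState::Closed { consecutive_failures: 0 };
                    true
                } else {
                    false
                }
            }
        }
    }

    /// Trip the breaker after too many consecutive failures.
    fn record_failure(&mut self) {
        if let BreakerState::Closed { consecutive_failures } = &mut self.state {
            *consecutive_failures += 1;
            if *consecutive_failures >= self.failure_threshold {
                self.state = BreakerState::Open { since: Instant::now() };
            }
        }
    }
}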

§Feature Flags

  • enabled - Enable full runtime (disabled for library-only builds)
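
For example, a downstream crate opts in through Cargo features and the crate gates full-runtime items at compile time. A sketch of that pattern (the dependency line and the gated function are made-up illustrations; only the feature name comes from this crate):

// In the downstream Cargo.toml:
//
//     [dependencies]
//     iron_runtime = { version = "*", features = ["enabled"] }

// Inside the crate, full-runtime items are gated like this:
#[cfg(feature = "enabled")]
pub fn start_full_runtime() {
    // spawn the LLM router, telemetry, and state manager here
}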

§Performance

Runtime overhead on LLM requests:

  • Budget check: <1ms
  • PII detection: <5ms per KB
  • Circuit breaker check: <0.1ms
  • Analytics recording: <0.5ms
  • Total proxy overhead: <10ms per request

Streaming responses are passed through without full-response buffering, adding near-zero latency.

§Development Status

Current implementation status:

  • ✓ Agent lifecycle management
  • ✓ PyO3 module structure
  • ✓ State management
  • ✓ Telemetry integration
  • ⏳ LLM Router implementation (in progress)
  • ⏳ Async PyO3 bridge (planned)
  • ⏳ Full safety pipeline integration (planned)

Modules§

llm_router
LLM Router - Local proxy for LLM API requests
pyo3_bridge
PyO3 bridge exposing the Rust runtime to Python as the iron_cage module

Structs§

AgentHandle
Agent runtime handle
AgentRuntime
Main agent runtime
RuntimeConfig
Runtime configuration