Core runtime for AI agent execution with integrated safety and cost controls.
Provides agent lifecycle management, Python bindings for LangChain/CrewAI integration, and a local LLM proxy for request interception. Orchestrates all Iron Runtime subsystems (budget, PII detection, analytics, circuit breakers).
§Purpose
This crate is the execution engine for Iron Runtime:
- Agent lifecycle management (spawn, monitor, stop agents)
- Python-Rust bridge via PyO3 for seamless Python integration
- LLM Router: Local proxy intercepting OpenAI/Anthropic API calls
- Integrated safety controls (PII detection, budget enforcement)
- Real-time metrics and state management
- Dashboard integration via REST API and WebSocket
§Architecture
Iron Runtime uses a modular architecture with a clear separation of concerns:
§Core Components
- Agent Runtime: Manages agent processes and lifecycle
- PyO3 Bridge: Exposes Rust runtime to Python as the iron_cage module
- LLM Router: Transparent proxy for LLM API requests
- State Manager: Persists agent state and metrics
- Telemetry: Structured logging for all operations
§Integration Layer
The runtime coordinates the following modules:
- iron_cost: Budget validation before LLM requests
- iron_safety: PII scanning on LLM responses
- iron_runtime_analytics: Event tracking for dashboard
- iron_reliability: Circuit breakers for provider failures
- iron_runtime_state: Agent state persistence
§Execution Flow
Python Agent Script
↓
PyO3 Bridge (iron_cage module)
↓
Agent Runtime (spawn/monitor)
↓
LLM Router (intercept API calls)
↓
Safety Pipeline:
1. Budget check (iron_cost)
2. Circuit breaker check (iron_reliability)
3. Forward to LLM provider
4. PII detection on response (iron_safety)
5. Record analytics (iron_runtime_analytics)
6. Return to agent
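A minimal sketch of this ordering, using hypothetical stand-in types rather than the real subsystem APIs:

// Illustrative pipeline ordering only; every type and method below is a
// hypothetical stand-in, not the real subsystem API.
struct Budget { limit: f64, spent: f64 }
struct Breaker { open: bool }
struct Request { estimated_cost: f64, body: String }
struct Response { body: String }

enum RouteError { BudgetExceeded, ProviderUnavailable }

fn route(req: &Request, budget: &mut Budget, breaker: &Breaker) -> Result<Response, RouteError> {
    // 1. Budget check (iron_cost): block before any money is spent
    if budget.spent + req.estimated_cost > budget.limit {
        return Err(RouteError::BudgetExceeded);
    }
    // 2. Circuit breaker check (iron_reliability): fast-fail on bad providers
    if breaker.open {
        return Err(RouteError::ProviderUnavailable);
    }
    // 3. Forward to the LLM provider (stubbed here)
    let mut resp = Response { body: format!("reply to: {}", req.body) };
    budget.spent += req.estimated_cost;
    // 4. PII detection on the response (iron_safety, stubbed here)
    resp.body = resp.body.replace("123-45-6789", "[REDACTED]");
    // 5. Record analytics (iron_runtime_analytics) would happen here
    // 6. Return to the agent
    Ok(resp)
}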
§Key Types
- AgentRuntime - Main runtime managing agent lifecycle
- RuntimeConfig - Runtime configuration (budget, verbosity)
- AgentHandle - Handle to running agent for control
- pyo3_bridge::Runtime - Python-exposed runtime class
- llm_router::LlmRouter - Local LLM proxy server
§Public API
§Rust API
use iron_runtime::{AgentRuntime, RuntimeConfig};
use std::path::Path;

#[tokio::main]
async fn main() -> Result<(), anyhow::Error> {
    // Configure runtime
    let config = RuntimeConfig {
        budget: 100.0, // $100 budget
        verbose: true,
    };

    // Create runtime
    let runtime = AgentRuntime::new(config);

    // Start agent from Python script
    let handle = runtime.start_agent(Path::new("agent.py")).await?;
    println!("Agent started: {}", handle.agent_id.as_str());

    // Monitor metrics
    if let Some(metrics) = runtime.get_metrics(handle.agent_id.as_str()) {
        println!("Budget spent: ${}", metrics.budget_spent);
        println!("PII detections: {}", metrics.pii_detections);
    }

    // Stop agent
    runtime.stop_agent(handle.agent_id.as_str()).await?;
    Ok(())
}
§Python API
Python agents import the iron_cage module for integrated controls:
from iron_cage import Runtime, LlmRouter
from langchain.agents import AgentExecutor
from langchain_openai import ChatOpenAI
# Create runtime with budget
runtime = Runtime(budget=100.0, verbose=True)
# Start LLM router (intercepts API calls)
router = LlmRouter(port=8000)
router.start()
# Point LangChain to local router instead of OpenAI directly
llm = ChatOpenAI(
    base_url="http://localhost:8000/v1",
    api_key="your-key"  # Forwarded to real provider
)
# All LLM calls now go through Iron Runtime safety pipeline
agent = AgentExecutor(llm=llm, ...)
result = agent.run("Process this data...")
# Get metrics (assumes agent_id was obtained when the agent was started)
metrics = runtime.get_metrics(agent_id)
print(f"Budget spent: ${metrics['budget_spent']}")
print(f"PII detections: {metrics['pii_detections']}")
# Stop when done
runtime.stop_agent(agent_id)
router.stop()
§LLM Router Usage
The LLM Router acts as a transparent proxy:
from iron_cage import LlmRouter
# Start router on port 8000
router = LlmRouter(port=8000)
router.start()
# Now any HTTP client can use it
# Point your LLM library to: http://localhost:8000/v1
# Router supports:
# - OpenAI API format (/v1/chat/completions)
# - Anthropic API format (/v1/messages)
# - Streaming responses
# - Budget enforcement
# - PII detection
# - Request tracing
§Python Integration
§PyO3 Module
Iron Runtime compiles to a Python extension module iron_cage.so:
# Build Python module
maturin develop --release
# Import in Python
import iron_cage
runtime = iron_cage.Runtime(budget=100.0)
§LangChain Integration
Seamless integration with LangChain agents:
from langchain.agents import initialize_agent, Tool
from langchain_openai import ChatOpenAI
from iron_cage import Runtime, LlmRouter
# Setup Iron Runtime
runtime = Runtime(budget=50.0)
router = LlmRouter(port=8000)
router.start()
# Configure LangChain to use local router
llm = ChatOpenAI(base_url="http://localhost:8000/v1")
# Create agent with Iron Runtime controls
tools = [Tool(name="search", func=search_function, ...)]
agent = initialize_agent(tools, llm, agent="zero-shot-react-description")
# All LLM calls automatically protected by Iron Runtime
result = agent.run("Research topic and generate report")
§CrewAI Integration
Works with multi-agent frameworks such as CrewAI:
from crewai import Agent, Task, Crew
from langchain_openai import ChatOpenAI
from iron_cage import Runtime, LlmRouter
runtime = Runtime(budget=100.0)
router = LlmRouter(port=8000)
router.start()
llm = ChatOpenAI(base_url="http://localhost:8000/v1")
# Create crew with protected LLM
agent = Agent(role="Researcher", llm=llm, ...)
task = Task(description="...", agent=agent)
crew = Crew(agents=[agent], tasks=[task])
# Execute with Iron Runtime protection
result = crew.kickoff()
§Safety Controls
Runtime enforces multiple safety layers:
§Budget Enforcement
- Pre-request budget validation
- Request blocked if budget exceeded
- Real-time cost tracking
- Budget alerts at configurable thresholds
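A minimal sketch of that logic, assuming a hypothetical tracker type rather than the real iron_cost API:

// Hypothetical budget tracker; the real iron_cost types may differ.
struct BudgetTracker {
    limit: f64,
    spent: f64,
    alert_thresholds: Vec<f64>, // e.g. [0.5, 0.9]
}

impl BudgetTracker {
    // Returns true if the request may proceed; fires alerts as thresholds pass.
    fn try_spend(&mut self, cost: f64) -> bool {
        if self.spent + cost > self.limit {
            return false; // request blocked: budget exceeded
        }
        let before = self.spent / self.limit;
        self.spent += cost;
        let after = self.spent / self.limit;
        for &t in &self.alert_thresholds {
            if before < t && after >= t {
                eprintln!("budget alert: {:.0}% of ${} used", t * 100.0, self.limit);
            }
        }
        true
    }
}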
§PII Detection
- Scans all LLM responses for PII
- Automatic redaction of sensitive data
- Compliance audit logging
- Configurable detection patterns
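As an illustration, a single regex-based redaction pass might look like the following (the regex crate and the SSN pattern are assumptions; iron_safety's detector and pattern set may differ):

use regex::Regex;

// Hypothetical redaction pass; real detectors apply many configurable patterns.
fn redact(text: &str) -> String {
    let ssn = Regex::new(r"\b\d{3}-\d{2}-\d{4}\b").unwrap();
    ssn.replace_all(text, "[REDACTED-SSN]").into_owned()
}

fn main() {
    let resp = "Customer SSN is 123-45-6789.";
    println!("{}", redact(resp)); // Customer SSN is [REDACTED-SSN].
}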
§Circuit Breakers
- Detects failing LLM providers
- Fast-fail on known-bad endpoints
- Automatic recovery after timeout
- Per-provider state isolation
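A minimal sketch of that open/closed state machine, with hypothetical types rather than the iron_reliability API:

use std::time::{Duration, Instant};

// Hypothetical per-provider breaker; iron_reliability's real design may differ.
enum State {
    Closed { failures: u32 },
    Open { since: Instant },
}

struct Breaker {
    state: State,
    threshold: u32,     // consecutive failures before opening
    recovery: Duration, // how long to stay open
}

impl Breaker {
    // Fast-fail while open; automatically close again after the timeout.
    fn allow(&mut self) -> bool {
        if let State::Open { since } = self.state {
            if since.elapsed() >= self.recovery {
                self.state = State::Closed { failures: 0 };
                true
            } else {
                false
            }
        } else {
            true
        }
    }

    fn record_failure(&mut self) {
        if let State::Closed { failures } = &mut self.state {
            *failures += 1;
            if *failures >= self.threshold {
                self.state = State::Open { since: Instant::now() };
            }
        }
    }
}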
§Feature Flags
- enabled - Enable full runtime (disabled for library-only builds)
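A sketch of how such a flag is typically consumed; the gated function and version number are illustrative:

// Cargo.toml of a consumer (version illustrative):
//   iron_runtime = { version = "0.1", features = ["enabled"] }

#[cfg(feature = "enabled")]
fn describe_build() -> &'static str {
    "full runtime: agent spawning, LLM router, safety pipeline"
}

#[cfg(not(feature = "enabled"))]
fn describe_build() -> &'static str {
    "library-only build: types available, runtime disabled"
}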
§Performance
Runtime overhead on LLM requests:
- Budget check: <1ms
- PII detection: <5ms per KB
- Circuit breaker check: <0.1ms
- Analytics recording: <0.5ms
- Total proxy overhead: <10ms per request
Streaming responses have near-zero buffering latency.
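One way to achieve this, sketched with hypothetical channel plumbing: forward each chunk as it arrives and scan incrementally, so nothing waits on the full response (the per-chunk scan is a stand-in for the real pipeline):

use tokio::sync::mpsc;

// Hypothetical streaming passthrough; the real router's internals may differ.
async fn forward_stream(
    mut upstream: mpsc::Receiver<String>, // chunks from the provider
    downstream: mpsc::Sender<String>,     // chunks to the agent
) {
    while let Some(chunk) = upstream.recv().await {
        // Scan each chunk as it arrives instead of buffering the response.
        let safe = chunk.replace("123-45-6789", "[REDACTED]"); // stand-in scan
        if downstream.send(safe).await.is_err() {
            break; // agent disconnected
        }
    }
}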
§Development Status
Current implementation status:
- ✓ Agent lifecycle management
- ✓ PyO3 module structure
- ✓ State management
- ✓ Telemetry integration
- ⏳ LLM Router implementation (in progress)
- ⏳ Async PyO3 bridge (planned)
- ⏳ Full safety pipeline integration (planned)
Modules§
- llm_router - LLM Router - Local proxy for LLM API requests
- pyo3_bridge
Structs§
- AgentHandle - Agent runtime handle
- AgentRuntime - Main agent runtime
- RuntimeConfig - Runtime configuration