MXP Agents Runtime SDK
Production-grade Rust SDK for building autonomous AI agents that communicate over the MXP protocol.
Part of the MXP (Mesh eXchange Protocol) ecosystem, this SDK provides the runtime infrastructure for building, deploying, and operating AI agents that speak MXP natively. While the mxp crate handles wire protocol encoding/decoding and secure UDP transport, this SDK provides:
- Agent lifecycle management with deterministic state machines
- MXP message handling for Call, Response, Event, and Stream messages
- Registry integration for mesh discovery and heartbeats
- LLM adapters for OpenAI, Anthropic, Gemini, and Ollama
- Enterprise features including resilience, observability, and security
Table of Contents
- Quick Start
- Enterprise-Grade Capabilities
- Why MXP Agents Runtime
- Scope
- Production Readiness
- Documentation
- Examples
- Getting Started
- Requirements
- Contributing
- License
Quick Start
Install via the bundled facade crate (the `mxp-agents` crate name below matches the protocol-relationship section of this README):

```bash
cargo add mxp-agents
```
Basic LLM Usage
The snippet below is an illustrative sketch; module paths and adapter names are assumptions, not the exact published API (see docs/usage.md):

```rust
// Illustrative sketch — adapter and module paths are assumptions.
use mxp_agents::adapters::OllamaAdapter;
use mxp_agents::inference::InferenceRequest;
use futures::StreamExt;

async fn chat() -> anyhow::Result<()> {
    let adapter = OllamaAdapter::new("http://localhost:11434")?;
    let request = InferenceRequest::new("Explain MXP in one sentence")?;
    let mut stream = adapter.stream(request).await?;
    while let Some(chunk) = stream.next().await {
        print!("{}", chunk?);
    }
    Ok(())
}
```
MXP Agent Setup
Agents communicate over the MXP protocol. Here's how to create an agent that handles MXP messages:
Sketch only — the `AgentMessageHandler` trait name is from the docs below, but the method names and message types shown are assumptions:

```rust
// Illustrative sketch — exact type and method names may differ.
use mxp_agents::{Agent, AgentMessageHandler};
use mxp::{Call, Response};
use async_trait::async_trait;
use std::sync::Arc;

// Define your agent's message handler
struct EchoHandler;

#[async_trait]
impl AgentMessageHandler for EchoHandler {
    async fn handle_call(&self, call: Call) -> Response {
        Response::new(call.id, call.payload) // echo the payload back
    }
}
```
Production Setup with Resilience & Observability
Sketch only — `PrometheusExporter` and `CircuitBreakerConfig` appear elsewhere in this README, but the builder methods shown are assumptions:

```rust
// Illustrative sketch — builder method names are assumptions.
use mxp_agents::{Agent, resilience::CircuitBreakerConfig};
use mxp_agents::observability::PrometheusExporter;
use std::time::Duration;

async fn run() -> anyhow::Result<()> {
    let agent = Agent::builder()
        .circuit_breaker(CircuitBreakerConfig::default())
        .request_timeout(Duration::from_secs(30))
        .metrics(PrometheusExporter::default())
        .build()?;
    agent.run().await
}
```
See examples/ for more complete examples including policy enforcement, memory integration, and graceful shutdown.
Enterprise-Grade Capabilities
The SDK is production-hardened with features required for mission-critical deployments:
Resilience & Reliability
- Circuit breaker pattern prevents cascading failures
- Exponential backoff retry policies with jitter
- Request timeout enforcement with per-request overrides
- Automatic recovery from transient failures
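The retry schedule above can be sketched in plain Rust. This is an illustration of exponential backoff with full jitter, not the SDK's actual retry type; the jitter factor is passed in explicitly so the schedule stays deterministic and testable:

```rust
use std::time::Duration;

/// Exponential backoff with full jitter: delay_n = jitter * min(base * 2^n, max),
/// where `jitter` is a caller-supplied value in [0.0, 1.0] (e.g. drawn from a RNG).
fn backoff_delay(base: Duration, max: Duration, attempt: u32, jitter: f64) -> Duration {
    let exp = base.saturating_mul(2u32.saturating_pow(attempt)).min(max);
    exp.mul_f64(jitter.clamp(0.0, 1.0))
}
```

Full jitter spreads concurrent retries across the whole interval, which avoids the synchronized "thundering herd" that plain exponential backoff produces.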
Observability & Monitoring
- Prometheus-compatible metrics (latency, throughput, error rates, queue depth)
- Health checks for Kubernetes readiness/liveness probes
- OpenTelemetry trace propagation for distributed tracing
- Structured logging with correlation IDs
Security & Compliance
- Secrets management with redacted Debug output
- Per-agent rate limiting with token bucket algorithm
- Input validation with configurable size limits
- Audit events for policy enforcement and compliance
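The token bucket behind the rate limiter can be pictured as follows. This is a minimal sketch of the algorithm, not the SDK's actual limiter; elapsed time is passed in explicitly to keep it testable:

```rust
/// Minimal token bucket: `capacity` bounds bursts, `refill_per_sec` sets the sustained rate.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec }
    }

    /// Refill based on seconds elapsed since the last call, then try to take one token.
    fn try_acquire(&mut self, elapsed_secs: f64) -> bool {
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```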
Operations & Configuration
- Layered configuration from defaults, files, environment, and runtime
- Hot reload for non-disruptive configuration updates
- Configuration digest for drift detection
- Graceful shutdown with in-flight work draining
- State recovery and checkpoint persistence
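Drift detection needs only a stable digest of the effective configuration. A sketch using the standard library's hasher (the SDK may well use a stronger hash such as SHA-256); the key point is that entries must be hashed in a stable order:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Digest the effective configuration. Entries are sorted first so that two
/// identical configs always produce the same digest regardless of load order.
fn config_digest(entries: &[(String, String)]) -> u64 {
    let mut sorted: Vec<_> = entries.to_vec();
    sorted.sort();
    let mut hasher = DefaultHasher::new();
    sorted.hash(&mut hasher);
    hasher.finish()
}
```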
Why MXP Agents Runtime
- Provide a unified runtime that wraps LLMs, tools, memory, and governance without depending on QUIC or third-party transports.
- Ensure every agent built for MXP Nexus speaks MXP natively and adheres to platform security, observability, and performance rules.
- Offer a developer-friendly path to compose agents locally, then promote them into the MXP Nexus platform when ready.
- Enable production deployments with enterprise-grade resilience, observability, and security out of the box.
Scope
Core Runtime
- Agent lifecycle management with deterministic state machine
- LLM connectors (OpenAI, Anthropic, Gemini, Ollama, MXP-hosted)
- Tool registration with capability-based access control
- Policy hooks for governance and compliance
- MXP message handling and protocol integration
- Memory integration (volatile cache, file journal, vector store interfaces)
Enterprise Features
- Resilience patterns (circuit breaker, retry, timeout)
- Observability (Prometheus metrics, health checks, distributed tracing)
- Security (secrets management, rate limiting, input validation)
- Configuration management (layered config, hot reload, validation)
- Graceful lifecycle (shutdown coordination, state recovery)
Out of scope: MXP Nexus deployment tooling, mesh scheduling, or any "deep agents" research-oriented SDK—handled by separate projects.
Supported LLM stacks
- OpenAI, Anthropic, Gemini, Ollama, and future MXP-hosted models via a shared `ModelAdapter` trait.
Production Readiness
The SDK is designed for production deployments with:
- Zero-allocation hot paths in call execution and scheduler loops
- Comprehensive error handling with exhaustive error types
- Property-based testing for correctness verification
- Kubernetes integration with health checks and graceful shutdown
- Observability with structured logging, metrics, and distributed tracing
- Security with secrets management, rate limiting, and input validation
All code passes cargo fmt, cargo clippy --all-targets --all-features, and cargo test --all-features gates.
MXP integration
This SDK is part of the MXP protocol ecosystem. The mxp crate provides the transport primitives, while this SDK provides the agent runtime that speaks MXP natively.
Protocol Relationship
- `mxp` crate: wire protocol, message encoding/decoding, UDP transport with ChaCha20-Poly1305 encryption
- `mxp-agents` crate: agent runtime, lifecycle management, LLM adapters, tools, policy enforcement
MXP Message Types
Agents handle these MXP message types through the AgentMessageHandler trait:
- `AgentRegister`/`AgentHeartbeat` — mesh registration and health
- `Call`/`Response` — request-response communication
- `Event` — fire-and-forget notifications
- `StreamOpen`/`StreamChunk`/`StreamClose` — streaming data
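The message taxonomy above can be pictured as a single enum. This is a simplification for orientation only — the real `mxp` wire types carry headers, payloads, and stream identifiers that the sketch omits:

```rust
/// Simplified view of the MXP message families an agent handles.
/// Field layouts here are illustrative, not the real wire types.
enum MxpMessage {
    AgentRegister { agent_id: String },
    AgentHeartbeat { agent_id: String },
    Call { id: u64, payload: Vec<u8> },
    Response { id: u64, payload: Vec<u8> },
    Event { payload: Vec<u8> },
    StreamOpen { stream_id: u64 },
    StreamChunk { stream_id: u64, payload: Vec<u8> },
    StreamClose { stream_id: u64 },
}

/// Only Call expects a Response; every other message is one-way from the sender's view.
fn expects_response(msg: &MxpMessage) -> bool {
    matches!(msg, MxpMessage::Call { .. })
}
```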
Registry Integration Example
Sketch only — the registry client name is an assumption; the message types it sends (`AgentRegister`, `AgentHeartbeat`) are from the list above:

```rust
// Illustrative sketch — RegistryClient and its methods are assumptions.
use mxp_agents::registry::RegistryClient;
use std::net::SocketAddr;
use std::sync::Arc;

async fn register() -> anyhow::Result<()> {
    let registry: SocketAddr = "127.0.0.1:4000".parse()?;
    let client = Arc::new(RegistryClient::connect(registry).await?);
    client.register("agent-1").await?;  // sends AgentRegister
    client.start_heartbeat().await?;    // sends periodic AgentHeartbeat
    Ok(())
}
```
Key concepts
- Tools are pure Rust functions annotated with `#[tool]`; the SDK converts them into schemas consumable by LLMs and enforces capability scopes at runtime.
- Agents can share external state (memory bus, MXP Vector Store) or remain fully isolated.
- Governance and policy enforcement are first-class: hooks exist for allow/deny decisions and human-in-the-loop steps.
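A policy hook ultimately reduces to a three-way decision. A minimal sketch follows; the SDK's real hook signature is richer (carrying call context and audit metadata), and the tool names used are hypothetical:

```rust
/// Outcome of a governance hook for a proposed tool call.
#[derive(Debug, PartialEq)]
enum PolicyDecision {
    Allow,
    Deny,
    /// Route to a human-in-the-loop step before proceeding.
    Escalate,
}

/// Example policy: deny tools outside the capability scope,
/// escalate destructive ones, allow the rest. Names are hypothetical.
fn evaluate(tool: &str, allowed: &[&str]) -> PolicyDecision {
    if !allowed.contains(&tool) {
        PolicyDecision::Deny
    } else if tool.starts_with("delete_") {
        PolicyDecision::Escalate
    } else {
        PolicyDecision::Allow
    }
}
```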
System Prompts
All adapters support system prompts with provider-native optimizations:
The adapter type names below are illustrative assumptions; `InferenceRequest` and `with_system_prompt` match the API shown elsewhere, and the comments describe each provider's native mechanism:

```rust
// Illustrative sketch — adapter names are assumptions.
use mxp_agents::adapters::{AnthropicAdapter, GeminiAdapter, OpenAiAdapter};
use mxp_agents::inference::InferenceRequest;

// OpenAI/Ollama: prepends the system prompt as the first message
let openai = OpenAiAdapter::new(openai_key)?;
// Anthropic: uses the dedicated 'system' parameter
let anthropic = AnthropicAdapter::new(anthropic_key)?;
// Gemini: uses the 'systemInstruction' field
let gemini = GeminiAdapter::new(gemini_key)?;

// The same API works across all providers
let request = InferenceRequest::new("Summarize the report")?
    .with_system_prompt("You are a concise analyst.");
```
Context Window Management (Optional)
For long conversations, enable automatic context management:
Sketch only — `ContextWindowConfig` and `with_context_config` are from the original example, but the module paths and adapter name are assumptions:

```rust
use mxp_agents::context::ContextWindowConfig; // path assumed
use mxp_agents::adapters::OpenAiAdapter;      // name assumed

let adapter = OpenAiAdapter::new(api_key)?
    .with_context_config(ContextWindowConfig::default());
// The SDK automatically manages conversation history within the token budget.
```
Documentation Map
- `docs/overview.md` — architectural overview and design principles
- `docs/architecture.md` — crate layout, component contracts, roadmap
- `docs/features.md` — complete feature set and facade feature flags
- `docs/usage.md` — end-to-end setup guide for building agents
- `docs/enterprise.md` — production hardening guide with resilience, observability, and security
- `docs/errors.md` — error surfaces and troubleshooting tips
Examples
- `examples/basic-agent` — simple agent with Ollama adapter and policy enforcement
- `examples/enterprise-agent` — production-grade agent demonstrating resilience, metrics, health checks, and graceful shutdown
Getting Started
- Development: Start with `examples/basic-agent` to understand core concepts
- Production: Review `docs/enterprise.md` and `examples/enterprise-agent` for hardening patterns
- Integration: Wire MXP endpoints for discovery and message handling
- Deployment: Use health checks and metrics for Kubernetes integration
Performance & Reliability
- Sub-microsecond message encoding/decoding via MXP protocol
- Lock-free data structures for high-concurrency scenarios
- Bounded memory usage with configurable limits
- Automatic recovery from transient failures
- Graceful degradation under load with rate limiting
- Comprehensive testing with property-based tests for correctness
Security
- Secrets management with redacted Debug output
- Rate limiting to prevent resource exhaustion
- Input validation with configurable constraints
- Audit events for compliance and governance
- Capability-based access control for tools
- Policy enforcement with allow/deny/escalate decisions
Observability
- Prometheus metrics for monitoring and alerting
- Health checks for Kubernetes integration
- Distributed tracing with OpenTelemetry support
- Structured logging with correlation IDs
- Circuit breaker state tracking for failure visibility
- Request latency histograms for performance analysis
Requirements
- Rust: 1.85 or later (MSRV)
- Tokio: Async runtime (included via dependencies)
- Optional: Ollama, OpenAI, Anthropic, or Gemini API keys for LLM adapters
Troubleshooting
Circuit Breaker Opens Frequently
If the circuit breaker is opening too often:
- Increase `failure_threshold` in `CircuitBreakerConfig`
- Check provider status and connectivity
- Review timeout settings
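To see why `failure_threshold` matters, here is the core of a circuit breaker state machine. This is a sketch, not the SDK's implementation; a real breaker also has a Half-Open state and a recovery timeout:

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum BreakerState {
    Closed,
    Open,
}

/// Counts consecutive failures and trips to Open at `failure_threshold`.
struct CircuitBreaker {
    failure_threshold: u32,
    consecutive_failures: u32,
    state: BreakerState,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32) -> Self {
        Self { failure_threshold, consecutive_failures: 0, state: BreakerState::Closed }
    }

    fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.state = BreakerState::Closed;
    }

    fn record_failure(&mut self) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.state = BreakerState::Open;
        }
    }
}
```

Raising the threshold lets the breaker tolerate longer bursts of transient provider errors before it opens and starts rejecting calls.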
High Memory Usage
If memory usage is growing:
- Enable metrics cardinality limiting
- Check configuration hot reload is working
- Review rate limiter cleanup
Slow Inference
If inference is slower than expected:
- Check the `request_latency_seconds` metrics
- Verify provider API status
- Review retry and timeout configuration
See docs/enterprise.md for comprehensive troubleshooting guide.
Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Development
```bash
# Build all crates
cargo build --workspace

# Run tests
cargo test --all-features

# Run linting
cargo clippy --all-targets --all-features

# Format code
cargo fmt
```
Community
- GitHub Issues: Report bugs or request features
- GitHub Discussions: Ask questions and discuss ideas
- Documentation: Full API docs
License
Licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT)
at your option.