## MXP Agents Runtime SDK
[crates.io](https://crates.io/crates/mxp-agents) · [docs.rs](https://docs.rs/mxp-agents) · [MIT license](https://github.com/yafatek/mxpnexus/blob/main/LICENSE-MIT) · [Rust](https://www.rust-lang.org)
**Production-grade Rust SDK for building autonomous AI agents that communicate over the [MXP protocol](https://github.com/yafatek/mxpnexus).**
Part of the MXP (Mesh eXchange Protocol) ecosystem, this SDK provides the runtime infrastructure for building, deploying, and operating AI agents that speak MXP natively. While the [`mxp`](https://crates.io/crates/mxp) crate handles wire protocol encoding/decoding and secure UDP transport, this SDK provides:
- **Agent lifecycle management** with deterministic state machines
- **MXP message handling** for Call, Response, Event, and Stream messages
- **Registry integration** for mesh discovery and heartbeats
- **LLM adapters** for OpenAI, Anthropic, Gemini, and Ollama
- **Enterprise features** including resilience, observability, and security
## Table of Contents
- [Quick Start](#quick-start)
- [Enterprise-Grade Capabilities](#enterprise-grade-capabilities)
- [Why MXP Agents Runtime](#why-it-exists)
- [Scope](#scope)
- [Production Readiness](#production-readiness)
- [Documentation](#documentation-map)
- [Examples](#examples)
- [Getting Started](#getting-started)
- [Requirements](#requirements)
- [Contributing](#contributing)
- [License](#license)
## Quick Start
Install via the bundled facade crate:
```sh
cargo add mxp-agents
```
**Basic LLM Usage**
```rust
use mxp_agents::adapters::ollama::{OllamaAdapter, OllamaConfig};
use mxp_agents::adapters::traits::{InferenceRequest, MessageRole, ModelAdapter, PromptMessage};
use futures::StreamExt;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create an adapter (works with OpenAI, Anthropic, Gemini, or Ollama)
// Use .with_stream(true) for incremental token streaming
let adapter = OllamaAdapter::new(
OllamaConfig::new("gemma2:2b")
.with_stream(true) // Enable streaming responses
)?;
// Build a request with system prompt
let request = InferenceRequest::new(vec![
PromptMessage::new(MessageRole::User, "What is MXP?"),
])?
.with_system_prompt("You are an expert on MXP protocol")
.with_temperature(0.7);
// Get streaming response
let mut stream = adapter.infer(request).await?;
// Process chunks as they arrive
while let Some(chunk) = stream.next().await {
let chunk = chunk?;
print!("{}", chunk.delta);
}
Ok(())
}
```
**MXP Agent Setup**
Agents communicate over the MXP protocol. Here's how to create an agent that handles MXP messages:
```rust
use mxp_agents::kernel::{
    AgentKernel, AgentMessageHandler, HandlerContext, HandlerResult,
    TaskScheduler, LifecycleEvent,
};
use mxp_agents::primitives::AgentId;
use async_trait::async_trait;
use std::sync::Arc;
// Define your agent's message handler
struct MyAgentHandler;
#[async_trait]
impl AgentMessageHandler for MyAgentHandler {
async fn handle_call(&self, ctx: HandlerContext) -> HandlerResult {
// Process incoming MXP Call messages
let message = ctx.message();
println!("Received MXP call with {} bytes", message.payload().len());
Ok(())
}
}
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let agent_id = AgentId::random();
let handler = Arc::new(MyAgentHandler);
let scheduler = TaskScheduler::default();
// Create the agent kernel
let mut kernel = AgentKernel::new(agent_id, handler, scheduler);
// Boot and activate the agent
kernel.transition(LifecycleEvent::Boot)?;
kernel.transition(LifecycleEvent::Activate)?;
println!("Agent {} is active and ready for MXP messages", agent_id);
Ok(())
}
```
**Production Setup with Resilience & Observability**
```rust
use mxp_agents::adapters::ollama::{OllamaAdapter, OllamaConfig};
use mxp_agents::adapters::resilience::{
CircuitBreakerConfig, RetryConfig, BackoffStrategy, ResilientAdapter,
};
use mxp_agents::telemetry::PrometheusExporter;
use std::time::Duration;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
// Create resilient adapter with circuit breaker and retry
let base_adapter = OllamaAdapter::new(OllamaConfig::new("gemma2:2b"))?;
let resilient = ResilientAdapter::builder(base_adapter)
.with_circuit_breaker(CircuitBreakerConfig {
failure_threshold: 5,
cooldown: Duration::from_secs(30),
success_threshold: 2,
})
.with_retry(RetryConfig {
max_attempts: 3,
backoff: BackoffStrategy::Exponential {
base: Duration::from_millis(100),
max: Duration::from_secs(10),
jitter: true,
},
..Default::default()
})
.with_timeout_duration(Duration::from_secs(30))
.build();
// Set up metrics collection
let exporter = PrometheusExporter::new();
let _ = exporter.register_runtime();
let _ = exporter.register_adapter("ollama");
// Export Prometheus metrics
println!("{}", exporter.export());
Ok(())
}
```
See [examples/](examples/) for more complete examples including policy enforcement, memory integration, and graceful shutdown.
### Enterprise-Grade Capabilities
The SDK is production-hardened with features required for mission-critical deployments:
**Resilience & Reliability**
- Circuit breaker pattern prevents cascading failures
- Exponential backoff retry policies with jitter
- Request timeout enforcement with per-request overrides
- Automatic recovery from transient failures
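The delay schedule produced by an exponential backoff policy like the `RetryConfig` shown earlier can be sketched in isolation. This is a standalone illustration of the backoff math, not the SDK's internal implementation; the `backoff_delay` function name is hypothetical, and the random jitter step is omitted so the output stays deterministic:

```rust
use std::time::Duration;

/// Delay before attempt `n` (0-based): `base * 2^n`, capped at `max`.
/// A real policy would add random jitter on top of this value.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    base.checked_mul(2u32.saturating_pow(attempt))
        .unwrap_or(max)
        .min(max)
}

fn main() {
    // base = 100ms, max = 10s, as in the RetryConfig example
    let base = Duration::from_millis(100);
    let max = Duration::from_secs(10);
    for attempt in 0..8 {
        println!("attempt {attempt}: wait {:?}", backoff_delay(base, max, attempt));
    }
}
```

With these parameters the delay doubles each attempt (100ms, 200ms, 400ms, ...) until it hits the 10-second cap, which bounds worst-case latency per retry.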
**Observability & Monitoring**
- Prometheus-compatible metrics (latency, throughput, error rates, queue depth)
- Health checks for Kubernetes readiness/liveness probes
- OpenTelemetry trace propagation for distributed tracing
- Structured logging with correlation IDs
**Security & Compliance**
- Secrets management with redacted Debug output
- Per-agent rate limiting with token bucket algorithm
- Input validation with configurable size limits
- Audit events for policy enforcement and compliance
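The per-agent rate limiter is based on the token bucket algorithm, which can be sketched in a few lines. This is a simplified, self-contained illustration of the algorithm itself, not the SDK's rate limiter type; the `TokenBucket` struct and its methods are illustrative names:

```rust
/// Minimal token bucket: up to `capacity` tokens, refilled at
/// `refill_per_sec`. Each request spends one token if available.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec }
    }

    /// Advance time by `elapsed_secs`, refilling up to capacity.
    fn refill(&mut self, elapsed_secs: f64) {
        self.tokens = (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
    }

    /// Returns true if a request may proceed now.
    fn try_acquire(&mut self) -> bool {
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(2.0, 1.0); // burst of 2, then 1 req/sec
    assert!(bucket.try_acquire());
    assert!(bucket.try_acquire());
    assert!(!bucket.try_acquire()); // burst exhausted, request rejected
    bucket.refill(1.0);             // one second passes
    assert!(bucket.try_acquire());  // steady-state rate resumes
}
```

The capacity controls burst tolerance while the refill rate controls sustained throughput, which is why the bucket shape prevents resource exhaustion without rejecting short bursts.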
**Operations & Configuration**
- Layered configuration from defaults, files, environment, and runtime
- Hot reload for non-disruptive configuration updates
- Configuration digest for drift detection
- Graceful shutdown with in-flight work draining
- State recovery and checkpoint persistence
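The layering idea behind the configuration system can be sketched as a fold over optional override layers, where later layers win. The field names below are illustrative, not the SDK's real configuration schema:

```rust
/// Illustrative effective configuration (not the SDK's real schema).
#[derive(Clone, Debug, PartialEq)]
struct AgentConfig {
    max_concurrency: u32,
    request_timeout_secs: u64,
}

/// One override layer (e.g. from a file, environment, or runtime call).
/// `None` means "inherit from the layer below".
#[derive(Default)]
struct ConfigLayer {
    max_concurrency: Option<u32>,
    request_timeout_secs: Option<u64>,
}

/// Apply a layer on top of a base config; set fields override.
fn apply(base: AgentConfig, layer: &ConfigLayer) -> AgentConfig {
    AgentConfig {
        max_concurrency: layer.max_concurrency.unwrap_or(base.max_concurrency),
        request_timeout_secs: layer.request_timeout_secs.unwrap_or(base.request_timeout_secs),
    }
}

fn main() {
    let defaults = AgentConfig { max_concurrency: 8, request_timeout_secs: 30 };
    let file = ConfigLayer { max_concurrency: Some(16), ..Default::default() };
    let env = ConfigLayer { request_timeout_secs: Some(60), ..Default::default() };
    // Precedence: defaults < file < environment (< runtime overrides)
    let effective = apply(apply(defaults, &file), &env);
    println!("{effective:?}");
}
```

Because each layer only overrides the fields it sets, a hot reload of one layer leaves the others' values intact, and hashing the effective struct gives a natural digest for drift detection.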
### Why it exists
- Provide a unified runtime that wraps LLMs, tools, memory, and governance without depending on QUIC or third-party transports.
- Ensure every agent built for MXP Nexus speaks MXP natively and adheres to platform security, observability, and performance rules.
- Offer a developer-friendly path to compose agents locally, then promote them into the MXP Nexus platform when ready.
- Enable production deployments with enterprise-grade resilience, observability, and security out of the box.
### Scope
**Core Runtime**
- Agent lifecycle management with deterministic state machine
- LLM connectors (OpenAI, Anthropic, Gemini, Ollama, MXP-hosted)
- Tool registration with capability-based access control
- Policy hooks for governance and compliance
- MXP message handling and protocol integration
- Memory integration (volatile cache, file journal, vector store interfaces)
**Enterprise Features**
- Resilience patterns (circuit breaker, retry, timeout)
- Observability (Prometheus metrics, health checks, distributed tracing)
- Security (secrets management, rate limiting, input validation)
- Configuration management (layered config, hot reload, validation)
- Graceful lifecycle (shutdown coordination, state recovery)
**Out of scope**: MXP Nexus deployment tooling, mesh scheduling, and research-oriented "deep agents" SDKs; these are handled by separate projects.
### Supported LLM stacks
- OpenAI, Anthropic, Gemini, Ollama, and future MXP-hosted models via a shared `ModelAdapter` trait.
### Production Readiness
The SDK is designed for production deployments with:
- **Zero-allocation hot paths** in call execution and scheduler loops
- **Comprehensive error handling** with exhaustive error types
- **Property-based testing** for correctness verification
- **Kubernetes integration** with health checks and graceful shutdown
- **Observability** with structured logging, metrics, and distributed tracing
- **Security** with secrets management, rate limiting, and input validation
All code passes `cargo fmt`, `cargo clippy --all-targets --all-features`, and `cargo test --all-features` gates.
### MXP integration
This SDK is part of the [MXP protocol](https://github.com/yafatek/mxpnexus) ecosystem. The `mxp` crate provides the transport primitives, while this SDK provides the agent runtime that speaks MXP natively.
**Protocol Relationship**
- `mxp` crate: Wire protocol, message encoding/decoding, UDP transport with ChaCha20-Poly1305 encryption
- `mxp-agents` crate: Agent runtime, lifecycle management, LLM adapters, tools, policy enforcement
**MXP Message Types**
Agents handle these MXP message types through the `AgentMessageHandler` trait:
- `AgentRegister` / `AgentHeartbeat` — Mesh registration and health
- `Call` / `Response` — Request-response communication
- `Event` — Fire-and-forget notifications
- `StreamOpen` / `StreamChunk` / `StreamClose` — Streaming data
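Conceptually, the runtime routes each incoming message to the handler method matching its type. The sketch below illustrates that dispatch with a hypothetical `MxpMessageKind` enum; the real wire types live in the `mxp` crate and the actual routing is internal to the kernel:

```rust
/// Hypothetical stand-in for the MXP message types listed above.
#[derive(Debug, Clone, Copy)]
enum MxpMessageKind {
    Call,
    Event,
    StreamChunk,
}

/// Sketch of how a runtime might route messages to handler behaviour.
fn route(kind: MxpMessageKind) -> &'static str {
    match kind {
        MxpMessageKind::Call => "request-response: reply with a Response",
        MxpMessageKind::Event => "fire-and-forget: no reply expected",
        MxpMessageKind::StreamChunk => "streaming: append chunk to the open stream",
    }
}

fn main() {
    for kind in [MxpMessageKind::Call, MxpMessageKind::Event, MxpMessageKind::StreamChunk] {
        println!("{kind:?} -> {}", route(kind));
    }
}
```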
**Registry Integration Example**
```rust
use mxp_agents::kernel::{
    AgentKernel, LifecycleEvent, MxpRegistryClient, RegistrationConfig, TaskScheduler,
};
use mxp_agents::primitives::{AgentId, AgentManifest, Capability, CapabilityId};
use std::net::SocketAddr;
use std::sync::Arc;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let agent_id = AgentId::random();
// Define agent capabilities
let capability = Capability::builder(CapabilityId::new("chat.respond")?)
.name("Chat Response")?
.version("1.0.0")?
.add_scope("chat:write")?
.build()?;
// Create agent manifest
let manifest = AgentManifest::builder(agent_id)
.name("my-chat-agent")?
.version("0.1.0")?
.capabilities(vec![capability])
.build()?;
// Connect to MXP registry for mesh discovery
let agent_endpoint: SocketAddr = "127.0.0.1:50052".parse()?;
let registry = Arc::new(MxpRegistryClient::connect(
"127.0.0.1:50051", // Registry endpoint
agent_endpoint,
None,
)?);
// Create kernel with registry integration
// `MyAgentHandler` is the AgentMessageHandler from the MXP Agent Setup example
let handler = Arc::new(MyAgentHandler);
let mut kernel = AgentKernel::new(agent_id, handler, TaskScheduler::default());
kernel.set_registry(registry, manifest, RegistrationConfig::default());
// Agent will auto-register and send heartbeats
kernel.transition(LifecycleEvent::Boot)?;
kernel.transition(LifecycleEvent::Activate)?;
Ok(())
}
```
### Key concepts
- Tools are pure Rust functions annotated with `#[tool]`; the SDK converts them into schemas consumable by LLMs and enforces capability scopes at runtime.
- Agents can share external state (memory bus, MXP Vector Store) or remain fully isolated.
- Governance and policy enforcement are first-class: hooks exist for allow/deny decisions and human-in-the-loop steps.
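To make the tool concept concrete, the sketch below shows a plain Rust function paired with a hand-written schema of the kind the `#[tool]` attribute conceptually generates. The function, schema shape, and scope string are all illustrative, not the macro's actual output:

```rust
/// A plain Rust function that could be exposed as a tool.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

/// Hand-written stand-in for the generated, LLM-consumable schema.
/// The real `#[tool]` macro derives something like this automatically
/// and enforces the capability scope at call time.
fn word_count_schema() -> &'static str {
    r#"{
  "name": "word_count",
  "description": "Count the words in a piece of text",
  "parameters": { "text": { "type": "string" } },
  "required_scope": "text:read"
}"#
}

fn main() {
    println!("{}", word_count_schema());
    println!("result: {}", word_count("MXP agents speak MXP natively"));
}
```

Keeping tools as pure functions means they can be unit-tested without any LLM in the loop, while the schema is what the model sees when deciding which tool to invoke.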
### System Prompts
All adapters support system prompts with provider-native optimizations:
```rust
use mxp_agents::adapters::openai::{OpenAiAdapter, OpenAiConfig};
use mxp_agents::adapters::anthropic::{AnthropicAdapter, AnthropicConfig};
use mxp_agents::adapters::gemini::{GeminiAdapter, GeminiConfig};
use mxp_agents::adapters::traits::{InferenceRequest, MessageRole, PromptMessage};
// OpenAI/Ollama: Prepends as first message
let openai = OpenAiAdapter::new(OpenAiConfig::from_env("gpt-4"))?;
// Anthropic: Uses dedicated 'system' parameter
let anthropic = AnthropicAdapter::new(AnthropicConfig::from_env("claude-3-5-sonnet-20241022"))?;
// Gemini: Uses 'systemInstruction' field
let gemini = GeminiAdapter::new(GeminiConfig::from_env("gemini-1.5-pro"))?;
// Same API works across all providers
let messages = vec![PromptMessage::new(MessageRole::User, "Hello!")];
let request = InferenceRequest::new(messages)?
    .with_system_prompt("You are a helpful assistant");
```
### Context Window Management (Optional)
For long conversations, enable automatic context management:
```rust
use mxp_agents::prompts::ContextWindowConfig;
use mxp_agents::adapters::ollama::{OllamaAdapter, OllamaConfig};
let adapter = OllamaAdapter::new(OllamaConfig::new("gemma2:2b"))?
.with_context_config(ContextWindowConfig {
max_tokens: 4096,
recent_window_size: 10,
..Default::default()
});
// SDK automatically manages conversation history within token budget
```
### Documentation Map
- `docs/overview.md` — architectural overview and design principles
- `docs/architecture.md` — crate layout, component contracts, roadmap
- `docs/features.md` — complete feature set and facade feature flags
- `docs/usage.md` — end-to-end setup guide for building agents
- `docs/enterprise.md` — production hardening guide with resilience, observability, and security
- `docs/errors.md` — error surfaces and troubleshooting tips
### Examples
- `examples/basic-agent` — simple agent with Ollama adapter and policy enforcement
- `examples/enterprise-agent` — production-grade agent demonstrating resilience, metrics, health checks, and graceful shutdown
### Getting Started
1. **Development**: Start with `examples/basic-agent` to understand core concepts
2. **Production**: Review `docs/enterprise.md` and `examples/enterprise-agent` for hardening patterns
3. **Integration**: Wire MXP endpoints for discovery and message handling
4. **Deployment**: Use health checks and metrics for Kubernetes integration
### Performance & Reliability
- **Sub-microsecond message encoding/decoding** via MXP protocol
- **Lock-free data structures** for high-concurrency scenarios
- **Bounded memory usage** with configurable limits
- **Automatic recovery** from transient failures
- **Graceful degradation** under load with rate limiting
- **Comprehensive testing** with property-based tests for correctness
### Security
- **Secrets management** with redacted Debug output
- **Rate limiting** to prevent resource exhaustion
- **Input validation** with configurable constraints
- **Audit events** for compliance and governance
- **Capability-based access control** for tools
- **Policy enforcement** with allow/deny/escalate decisions
### Observability
- **Prometheus metrics** for monitoring and alerting
- **Health checks** for Kubernetes integration
- **Distributed tracing** with OpenTelemetry support
- **Structured logging** with correlation IDs
- **Circuit breaker state tracking** for failure visibility
- **Request latency histograms** for performance analysis
## Requirements
- **Rust**: 1.85 or later (MSRV)
- **Tokio**: Async runtime (included via dependencies)
- **Optional**: Ollama, OpenAI, Anthropic, or Gemini API keys for LLM adapters
## Troubleshooting
### Circuit Breaker Opens Frequently
If the circuit breaker is opening too often:
- Increase `failure_threshold` in `CircuitBreakerConfig`
- Check provider status and connectivity
- Review timeout settings
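The effect of raising `failure_threshold` can be seen in a minimal circuit-breaker sketch. This is a standalone illustration of the threshold logic only; the SDK's `ResilientAdapter` additionally tracks a cooldown period and a success threshold for half-open probing, omitted here for brevity:

```rust
/// Minimal breaker: opens after `failure_threshold` consecutive failures.
struct Breaker {
    failure_threshold: u32,
    consecutive_failures: u32,
}

impl Breaker {
    fn new(failure_threshold: u32) -> Self {
        Self { failure_threshold, consecutive_failures: 0 }
    }

    /// Open breakers reject calls instead of forwarding them.
    fn is_open(&self) -> bool {
        self.consecutive_failures >= self.failure_threshold
    }

    /// Record a call outcome; a success resets the failure streak.
    fn record(&mut self, success: bool) {
        if success {
            self.consecutive_failures = 0;
        } else {
            self.consecutive_failures += 1;
        }
    }
}

fn main() {
    let mut breaker = Breaker::new(3);
    for _ in 0..3 {
        breaker.record(false);
    }
    assert!(breaker.is_open()); // three failures trip a threshold of 3
    breaker.record(true);       // a successful probe resets the streak
    assert!(!breaker.is_open());
}
```

A higher threshold tolerates longer failure streaks before tripping, so if transient provider hiccups routinely open the breaker, raising it (or fixing the underlying timeouts) stops the flapping.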
### High Memory Usage
If memory usage is growing:
- Enable metrics cardinality limiting
- Check configuration hot reload is working
- Review rate limiter cleanup
### Slow Inference
If inference is slower than expected:
- Check `request_latency_seconds` metrics
- Verify provider API status
- Review retry and timeout configuration
See [docs/enterprise.md](docs/enterprise.md) for a comprehensive troubleshooting guide.
## Contributing
We welcome contributions! Please see [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
### Development
```bash
# Build all crates
cargo build --all-features
# Run tests
cargo test --all-features
# Run linting
cargo clippy --all-targets --all-features -- -D warnings
# Format code
cargo fmt --check
```
## Community
- **GitHub Issues**: [Report bugs or request features](https://github.com/yafatek/mxpnexus/issues)
- **GitHub Discussions**: [Ask questions and discuss ideas](https://github.com/yafatek/mxpnexus/discussions)
- **Documentation**: [Full API docs](https://docs.rs/mxp-agents)
## License
Licensed under either of:
- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)
at your option.
## Acknowledgments
Built with [Rust](https://www.rust-lang.org), [Tokio](https://tokio.rs), and the MXP protocol specification.