mecha10-runtime 0.1.0

Runtime supervisor for Mecha10 nodes - launching, monitoring, and lifecycle management

Features

  • Node Supervision: Launch and supervise multiple nodes with automatic restart policies
  • Health Checking: Monitor node health with customizable check intervals
  • Graceful Shutdown: Handle shutdown signals (SIGTERM and SIGINT/Ctrl+C) with configurable timeouts
  • Restart Policies: Never, OnFailure (with exponential backoff), or Always (with constant backoff)
  • Dynamic Launcher: Optional service for dynamically launching and managing nodes
  • Flexible Logging: Pretty, JSON, or Compact output formats with configurable log levels

Installation

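[dependencies]
mecha10-runtime = "0.1.0"
async-trait = "0.1"
anyhow = "1.0"
# The examples below use #[tokio::main] and tokio::time::sleep
tokio = { version = "1", features = ["macros", "rt-multi-thread", "time"] }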

Quick Start

Basic Single Node

use mecha10_runtime::prelude::*;
use anyhow::Result;
use async_trait::async_trait;

struct CameraNode {
    frame_count: u64,
}

#[async_trait]
impl NodeRunner for CameraNode {
    fn name(&self) -> &str {
        "camera"
    }

    async fn run(&mut self) -> Result<()> {
        loop {
            // Simulate capturing one frame every 33 ms (about 30 fps).
            self.frame_count += 1;
            tokio::time::sleep(tokio::time::Duration::from_millis(33)).await;
        }
    }

    async fn health_check(&self) -> HealthStatus {
        if self.frame_count > 0 {
            HealthStatus::Healthy
        } else {
            HealthStatus::Unhealthy {
                reason: "No frames captured".to_string()
            }
        }
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    let mut runtime = Runtime::builder()
        .log_level("info")
        .build();

    runtime.run_node("camera", Box::new(CameraNode { frame_count: 0 })).await?;
    Ok(())
}

Multiple Nodes with Restart Policy

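use mecha10_runtime::prelude::*;
use anyhow::Result;
use std::time::Duration;

// CameraNode, MotorNode, and PlannerNode are assumed to implement NodeRunner
// (see CameraNode in the previous example).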

#[tokio::main]
async fn main() -> Result<()> {
    let mut runtime = Runtime::builder()
        .log_level("info")
        .restart_policy(RestartPolicy::OnFailure {
            max_retries: 3,
            backoff: Duration::from_secs(1),
        })
        .health_check_interval(Duration::from_secs(30))
        .shutdown_timeout(Duration::from_secs(10))
        .build();

    let nodes: Vec<Box<dyn NodeRunner>> = vec![
        Box::new(CameraNode { /* ... */ }),
        Box::new(MotorNode { /* ... */ }),
        Box::new(PlannerNode { /* ... */ }),
    ];

    runtime.run_nodes(nodes).await?;
    Ok(())
}

Dynamic Node Launcher

use mecha10_runtime::prelude::*;
use anyhow::Result;

#[tokio::main]
async fn main() -> Result<()> {
    let mut runtime = Runtime::builder()
        .with_launcher(true)
        .build();

    let launcher = runtime.launcher().expect("launcher is enabled by with_launcher(true)");

    // Register node types
    launcher.register("camera".to_string(), || {
        Box::new(CameraNode { frame_count: 0 })
    }).await;

    launcher.register("motor".to_string(), || {
        Box::new(MotorNode::new())
    }).await;

    // Launch nodes dynamically
    launcher.launch("camera").await?;
    launcher.launch("motor").await?;

    // Run with launcher service
    runtime.run_with_launcher().await?;
    Ok(())
}
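
The register calls above pair a type name with a closure that builds a fresh node on demand. As a rough illustration of that factory-registry idea, here is a synchronous, std-only sketch. `Node`, `Camera`, and `FactoryRegistry` are hypothetical stand-ins invented for this example; they are not the crate's types, and the real launcher is async and may be implemented differently.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a node; the real trait is `NodeRunner`.
trait Node {
    fn name(&self) -> &str;
}

/// Hypothetical node type used only for this sketch.
struct Camera;

impl Node for Camera {
    fn name(&self) -> &str {
        "camera"
    }
}

/// A factory builds a new boxed node each time it is called.
type Factory = Box<dyn Fn() -> Box<dyn Node> + Send + Sync>;

/// Hypothetical registry mapping a type name to its factory.
#[derive(Default)]
struct FactoryRegistry {
    factories: HashMap<String, Factory>,
}

impl FactoryRegistry {
    /// Store a factory under `node_type`, replacing any previous entry.
    fn register<F>(&mut self, node_type: &str, factory: F)
    where
        F: Fn() -> Box<dyn Node> + Send + Sync + 'static,
    {
        self.factories.insert(node_type.to_string(), Box::new(factory));
    }

    /// Build a new node of the given type, or report an unknown name.
    fn create(&self, node_type: &str) -> Result<Box<dyn Node>, String> {
        self.factories
            .get(node_type)
            .map(|factory| factory())
            .ok_or_else(|| format!("unknown node type: {}", node_type))
    }
}
```

Storing closures rather than node instances means every launch gets a fresh node, which is also what a restart after a failure would need.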

Runtime Configuration

Log Formats

// Pretty format (human-readable, colored)
Runtime::builder()
    .log_format(LogFormat::Pretty)
    .build();

// JSON format (structured logs for parsing)
Runtime::builder()
    .log_format(LogFormat::Json)
    .build();

// Compact format (minimal output)
Runtime::builder()
    .log_format(LogFormat::Compact)
    .build();

Restart Policies

// Never restart failed nodes
Runtime::builder()
    .restart_policy(RestartPolicy::Never)
    .build();

// Restart on failure with exponential backoff
Runtime::builder()
    .restart_policy(RestartPolicy::OnFailure {
        max_retries: 5,
        backoff: Duration::from_secs(1), // 1s, 2s, 4s, 8s, 16s
    })
    .build();

// Always restart with constant backoff
Runtime::builder()
    .restart_policy(RestartPolicy::Always {
        backoff: Duration::from_secs(5),
    })
    .build();
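
The `1s, 2s, 4s, 8s, 16s` comment above suggests that the OnFailure delay doubles on each retry, while Always waits a fixed interval. The sketch below works through that arithmetic with a local mirror of the three policies. The enum here is a stand-in (the field types are assumptions), and `next_restart_delay` is a hypothetical helper, not a function exported by mecha10-runtime; the crate may cap or otherwise adjust its delays.

```rust
use std::time::Duration;

/// Local mirror of the three policies shown above (field types assumed).
#[derive(Debug, Clone, PartialEq)]
enum RestartPolicy {
    Never,
    OnFailure { max_retries: u32, backoff: Duration },
    Always { backoff: Duration },
}

/// Hypothetical helper: the delay before restart number `attempt`
/// (0-based), or `None` if the node should stay stopped.
/// `failed` says whether the node exited with an error.
fn next_restart_delay(policy: &RestartPolicy, attempt: u32, failed: bool) -> Option<Duration> {
    match policy {
        RestartPolicy::Never => None,
        RestartPolicy::OnFailure { max_retries, backoff } => {
            if failed && attempt < *max_retries {
                // Double the delay each time: base, 2*base, 4*base, ...
                // Saturating arithmetic avoids overflow panics.
                Some(backoff.saturating_mul(2u32.saturating_pow(attempt)))
            } else {
                None
            }
        }
        // Constant delay, regardless of attempt count or exit status.
        RestartPolicy::Always { backoff } => Some(*backoff),
    }
}
```

With `max_retries: 5` and a 1 second base, attempts 0 through 4 yield 1, 2, 4, 8, and 16 seconds, and attempt 5 yields `None`.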

NodeRunner Trait

Implement the NodeRunner trait for each node type:

#[async_trait]
pub trait NodeRunner: Send + Sync {
    /// Get the node name
    fn name(&self) -> &str;

    /// Run the node (main logic)
    async fn run(&mut self) -> Result<()>;

    /// Perform a health check (default: Healthy)
    async fn health_check(&self) -> HealthStatus {
        HealthStatus::Healthy
    }

    /// Handle shutdown signal (default: no-op)
    async fn shutdown(&mut self) -> Result<()> {
        Ok(())
    }
}

Health Checking

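// Custom health check logic, written inside your `impl NodeRunner` block.
// `is_connected` and `reconnecting` stand for your node's own helper methods.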
async fn health_check(&self) -> HealthStatus {
    if self.is_connected() {
        HealthStatus::Healthy
    } else if self.reconnecting() {
        HealthStatus::Degraded {
            reason: "Reconnecting to sensor".to_string()
        }
    } else {
        HealthStatus::Unhealthy {
            reason: "Sensor disconnected".to_string()
        }
    }
}

// Access health checker from runtime
let health_checker = runtime.health_checker();
let all_statuses = health_checker.check_all().await;
for (node_name, status) in all_statuses {
    println!("{}: {:?}", node_name, status);
}
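
`check_all` above yields one status per node. When a single overall verdict is needed, one option is to report the most severe status seen. The sketch below assumes a local mirror of the `HealthStatus` variants used in this README and a list of name/status pairs; `severity` and `overall_status` are hypothetical helpers, not part of the crate.

```rust
/// Local mirror of the variants used in this README.
#[derive(Debug, Clone, PartialEq)]
enum HealthStatus {
    Healthy,
    Degraded { reason: String },
    Unhealthy { reason: String },
}

impl HealthStatus {
    /// Hypothetical ranking: higher means worse.
    fn severity(&self) -> u8 {
        match self {
            HealthStatus::Healthy => 0,
            HealthStatus::Degraded { .. } => 1,
            HealthStatus::Unhealthy { .. } => 2,
        }
    }
}

/// Return the worst status among all nodes.
/// An empty list is treated as healthy (an assumption of this sketch).
fn overall_status(statuses: &[(String, HealthStatus)]) -> HealthStatus {
    statuses
        .iter()
        .map(|(_, status)| status)
        .max_by_key(|status| status.severity())
        .cloned()
        .unwrap_or(HealthStatus::Healthy)
}
```

Keeping the `reason` string on the returned status preserves the explanation from whichever node was in the worst state.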

Graceful Shutdown

The runtime automatically handles shutdown signals:

// Runs until Ctrl+C or SIGTERM
runtime.run_nodes(nodes).await?;

// Custom shutdown timeout
Runtime::builder()
    .shutdown_timeout(Duration::from_secs(30))
    .build();

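// Custom shutdown logic, written inside your `impl NodeRunner` block.
// `save_state` and `close_connections` stand for your node's own helper methods.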
async fn shutdown(&mut self) -> Result<()> {
    self.save_state().await?;
    self.close_connections().await?;
    Ok(())
}
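
A shutdown timeout bounds how long the supervisor waits for a node's cleanup before giving up. The examples here run on tokio; the sketch below is a std-only, thread-based analogue of the same idea, not the crate's implementation. `shutdown_with_timeout` is a hypothetical name chosen for this example.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `cleanup` on a background thread and wait at most `timeout`.
/// Returns true if cleanup finished in time, false if it timed out
/// (or panicked before signalling completion).
fn shutdown_with_timeout<F>(cleanup: F, timeout: Duration) -> bool
where
    F: FnOnce() + Send + 'static,
{
    let (done_tx, done_rx) = mpsc::channel();
    thread::spawn(move || {
        cleanup();
        // The receiver may already be gone after a timeout; ignore that.
        let _ = done_tx.send(());
    });
    done_rx.recv_timeout(timeout).is_ok()
}
```

A caller would pass the node's cleanup work as the closure and decide what to do on `false`, for example logging the node name and continuing with process exit.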

Architecture

Runtime
├── Supervisor (node lifecycle management)
│   ├── Launch nodes
│   ├── Monitor health
│   ├── Handle restarts
│   └── Coordinate shutdown
├── HealthChecker (health monitoring)
│   ├── Register nodes
│   ├── Track status
│   └── Report health
├── Launcher (dynamic node management) [optional]
│   ├── Register node factories
│   ├── Launch on demand
│   └── Stop running nodes
└── ShutdownHandle (graceful shutdown)
    ├── Signal handling
    └── Broadcast to nodes
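
The ShutdownHandle entry above describes one signal being broadcast to every node. One simple way to picture that is a shared flag that each worker polls. The sketch below uses std threads and an atomic flag; `ShutdownFlag` and `run_until_shutdown` are hypothetical names, and the crate's actual mechanism may differ (for instance, an async broadcast channel).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical shared shutdown flag; clones observe the same state.
#[derive(Clone, Default)]
struct ShutdownFlag(Arc<AtomicBool>);

impl ShutdownFlag {
    /// Signal every holder of this flag to stop.
    fn trigger(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    /// Check whether shutdown has been requested.
    fn is_triggered(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Spawn `node_count` worker loops, broadcast shutdown, and return
/// how many workers stopped cleanly.
fn run_until_shutdown(node_count: usize) -> usize {
    let flag = ShutdownFlag::default();
    let handles: Vec<_> = (0..node_count)
        .map(|_| {
            let flag = flag.clone();
            thread::spawn(move || {
                // Stand-in for a node's main loop.
                while !flag.is_triggered() {
                    thread::sleep(Duration::from_millis(1));
                }
            })
        })
        .collect();

    // One call reaches every worker, because they share the flag.
    flag.trigger();

    handles
        .into_iter()
        .map(|handle| handle.join())
        .filter(|result| result.is_ok())
        .count()
}
```

Polling a flag is the simplest form of this pattern; an async runtime would more likely await a notification so that idle nodes wake immediately.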

Benefits

  • ~90% Code Reduction: Generated projects shrink from 180 lines of boilerplate to ~20 lines
  • Consistent Behavior: All projects use the same tested runtime logic
  • Flexible Configuration: Builder pattern for easy customization
  • Production Ready: Health checking, restart policies, graceful shutdown
  • Observable: Structured logging with multiple output formats

Examples

See the rover-robot example for a complete demonstration of using mecha10-runtime.

License

MIT