mecha10-nodes-diagnostics 0.1.25

Diagnostics node that publishes Docker and system resource metrics
# Diagnostics Node

A standalone node that publishes Docker container and system resource metrics to diagnostic topics.

## Purpose

The diagnostics node continuously monitors and publishes:
- **Docker container metrics**: CPU, memory, network, I/O per container
- **System resource metrics**: Host CPU, memory, disk, network

This complements the metrics published by `simulation-bridge` (streaming pipeline and Godot connection metrics).

## Architecture

### Metrics Published

**Every 5 seconds:**
- `/diagnostics/docker/containers` - One message per running container
- `/diagnostics/system/resources` - System-wide resource usage

### Collectors

- **DockerCollector**: Uses bollard (Docker API client) to fetch container stats
- **SystemCollector**: Uses sysinfo to fetch host system metrics
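
The collect-and-publish cycle described above can be sketched as follows. This is an illustration only: the `Collector` trait, the `FakeDocker` and `FakeSystem` types, and `run_cycle` are hypothetical names, and the real `DockerCollector` and `SystemCollector` APIs may look quite different. Payloads are plain strings here to keep the sketch dependency-free.

```rust
/// Hypothetical collector interface, for illustration only.
trait Collector {
    /// Topic that this collector's messages are published to.
    fn topic(&self) -> &'static str;
    /// Collect zero or more payloads (one per container, or one for the host).
    fn collect(&mut self) -> Vec<String>;
}

/// Stand-in for a Docker collector: pretends two containers are running.
struct FakeDocker;
impl Collector for FakeDocker {
    fn topic(&self) -> &'static str {
        "/diagnostics/docker/containers"
    }
    fn collect(&mut self) -> Vec<String> {
        vec!["web".to_string(), "redis".to_string()]
    }
}

/// Stand-in for a system collector: one host-wide payload per cycle.
struct FakeSystem;
impl Collector for FakeSystem {
    fn topic(&self) -> &'static str {
        "/diagnostics/system/resources"
    }
    fn collect(&mut self) -> Vec<String> {
        vec!["host".to_string()]
    }
}

/// Run one collection cycle, passing every payload to `publish`.
/// Returns the number of messages published.
fn run_cycle(
    collectors: &mut [Box<dyn Collector>],
    publish: &mut dyn FnMut(&str, &str),
) -> usize {
    let mut sent = 0;
    for collector in collectors.iter_mut() {
        let topic = collector.topic();
        for payload in collector.collect() {
            publish(topic, &payload);
            sent += 1;
        }
    }
    sent
}

fn main() {
    let mut collectors: Vec<Box<dyn Collector>> =
        vec![Box::new(FakeDocker), Box::new(FakeSystem)];
    let mut publish = |topic: &str, payload: &str| println!("{} <- {}", topic, payload);
    // The node repeats this every 5 seconds; one cycle is shown so the example terminates.
    let sent = run_cycle(&mut collectors, &mut publish);
    println!("published {} messages", sent);
}
```

Passing the publisher as a closure keeps the cycle logic testable without a running message broker.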

## Running

### Standalone

```bash
cargo run -p mecha10-nodes-diagnostics
```

### With Full System

```bash
# 1. Start control plane
docker compose up -d

# 2. Start diagnostics node
cargo run -p mecha10-nodes-diagnostics &

# 3. Start simulation (publishes streaming/Godot metrics)
cd my-robot
mecha10 dev
```

### Monitor Diagnostics

**CLI:**
```bash
# Watch all diagnostics
mecha10 diagnostics watch

# Docker only
mecha10 diagnostics watch --docker

# System only
mecha10 diagnostics watch --system
```

**Dashboard:**
```
Open browser: http://localhost:3000/dashboard/diagnostics
```

## Metrics Details

### Docker Container Metrics

Published to: `/diagnostics/docker/containers`

```rust
struct DockerContainerMetrics {
    container_id: String,        // Docker container ID
    container_name: String,      // Friendly name
    cpu_percent: f64,            // CPU usage percentage
    memory_usage_bytes: u64,     // Bytes used
    memory_limit_bytes: u64,     // Bytes limit
    network_rx_bytes: u64,       // Network received
    network_tx_bytes: u64,       // Network transmitted
    block_read_bytes: u64,       // Disk read
    block_write_bytes: u64,      // Disk write
}
```
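
A consumer can combine the usage and limit fields into a percentage. The following is a minimal sketch that reproduces only a subset of the fields listed above; `memory_percent` is a hypothetical helper, not part of the crate. It returns `None` for a zero limit purely to avoid dividing by zero; whether the node ever reports a zero limit is not stated here.

```rust
/// Subset of the fields listed above, redeclared locally for illustration.
struct DockerContainerMetrics {
    container_name: String,
    memory_usage_bytes: u64,
    memory_limit_bytes: u64,
}

/// Memory usage as a percentage of the limit.
/// Returns None when the limit is zero, so the caller never sees NaN or infinity.
fn memory_percent(m: &DockerContainerMetrics) -> Option<f64> {
    if m.memory_limit_bytes == 0 {
        return None;
    }
    Some(m.memory_usage_bytes as f64 / m.memory_limit_bytes as f64 * 100.0)
}

fn main() {
    let m = DockerContainerMetrics {
        container_name: "redis".to_string(),
        memory_usage_bytes: 256 * 1024 * 1024,
        memory_limit_bytes: 1024 * 1024 * 1024,
    };
    match memory_percent(&m) {
        Some(p) => println!("{}: {:.1}% of memory limit", m.container_name, p),
        None => println!("{}: no usable memory limit", m.container_name),
    }
}
```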

### System Resource Metrics

Published to: `/diagnostics/system/resources`

```rust
struct SystemResourceMetrics {
    cpu_percent: f64,            // Total CPU usage
    memory_used_bytes: u64,      // Memory used
    memory_total_bytes: u64,     // Total memory
    memory_percent: f64,         // Memory usage %
    disk_used_bytes: u64,        // Disk used
    disk_total_bytes: u64,       // Total disk
    disk_percent: f64,           // Disk usage %
    network_rx_bytes: u64,       // Network received
    network_tx_bytes: u64,       // Network transmitted
}
```
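
If `network_rx_bytes` and `network_tx_bytes` are cumulative counters (an assumption; this document does not say whether they are totals or per-interval deltas), a consumer can derive throughput from two consecutive messages. A minimal sketch, with `rate_bytes_per_sec` as a hypothetical helper:

```rust
use std::time::Duration;

/// Bytes per second between two counter readings taken `elapsed` apart.
/// Returns None if no time has elapsed or if the counter went backwards
/// (for example after a reset), instead of reporting a bogus rate.
fn rate_bytes_per_sec(prev: u64, curr: u64, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if secs <= 0.0 {
        return None;
    }
    let delta = curr.checked_sub(prev)?;
    Some(delta as f64 / secs)
}

fn main() {
    // Two readings, 5 seconds apart (the publishing interval of the node).
    let interval = Duration::from_secs(5);
    match rate_bytes_per_sec(1_000, 6_000, interval) {
        Some(rate) => println!("rx: {:.0} B/s", rate),
        None => println!("rx: rate unavailable"),
    }
}
```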

## Dependencies

- `mecha10-core` - Framework context and pub/sub
- `mecha10-diagnostics` - Diagnostic collectors
- `tokio` - Async runtime
- `tracing` - Logging

## Performance

- **CPU overhead**: < 0.5%
- **Memory footprint**: < 5MB
- **Publishing frequency**: Every 5 seconds
- **Docker API calls**: Async, non-blocking

## Integration

### With simulation-bridge

`simulation-bridge` publishes:
- `/diagnostics/streaming/pipeline`
- `/diagnostics/streaming/encoding`
- `/diagnostics/streaming/bandwidth`
- `/diagnostics/godot/connection`

`diagnostics-node` publishes:
- `/diagnostics/docker/containers`
- `/diagnostics/system/resources`

Together they cover the streaming pipeline, the Godot connection, Docker containers, and host resources.
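
A subscriber that receives all diagnostic topics can tell which node produced a message from the topic path. The sketch below hard-codes the topic layout listed above; the `Source` enum and `source_of` function are hypothetical and not part of either node.

```rust
/// Which node publishes a given diagnostic topic (illustrative only).
#[derive(Debug, PartialEq)]
enum Source {
    DiagnosticsNode,
    SimulationBridge,
}

/// Map a topic to its publishing node using the second path segment.
/// Returns None for topics outside the `/diagnostics/` namespace or
/// with a segment that is not listed in this document.
fn source_of(topic: &str) -> Option<Source> {
    let rest = topic.strip_prefix("/diagnostics/")?;
    match rest.split('/').next()? {
        "docker" | "system" => Some(Source::DiagnosticsNode),
        "streaming" | "godot" => Some(Source::SimulationBridge),
        _ => None,
    }
}

fn main() {
    for topic in [
        "/diagnostics/docker/containers",
        "/diagnostics/streaming/bandwidth",
        "/diagnostics/godot/connection",
    ] {
        println!("{} -> {:?}", topic, source_of(topic));
    }
}
```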

### With CLI

The `mecha10 diagnostics` CLI command subscribes to all diagnostic topics, so it automatically receives metrics from both nodes.

### With Dashboard

The `/dashboard/diagnostics` page subscribes to all diagnostic topics via WebSocket.

**Note:** This requires a WebSocket server that forwards Redis pub/sub messages to WebSocket clients.

## Troubleshooting

### No Docker metrics

**Problem**: Docker metrics not appearing

**Solutions**:
- Verify Docker is running: `docker ps`
- Check that the user running the node can access the Docker socket (typically `/var/run/docker.sock` on Linux)
- Ensure diagnostics-node is running: `ps aux | grep -i diagnostics`
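
The socket check can also be done programmatically. This is a Unix-only sketch using the standard library; `diagnose_socket` is a hypothetical helper, and `/var/run/docker.sock` is the usual default path on Linux rather than something this node is known to use.

```rust
use std::io::ErrorKind;
use std::os::unix::net::UnixStream;

/// Try to connect to a Unix socket and translate the result into a hint.
fn diagnose_socket(path: &str) -> &'static str {
    match UnixStream::connect(path) {
        Ok(_) => "socket reachable",
        Err(e) => match e.kind() {
            ErrorKind::NotFound => "socket not found: is Docker running?",
            ErrorKind::PermissionDenied => "permission denied: check socket ownership and group membership",
            ErrorKind::ConnectionRefused => "socket exists but nothing is listening",
            _ => "unexpected error",
        },
    }
}

fn main() {
    // Assumed default location of the Docker socket on Linux.
    let path = "/var/run/docker.sock";
    println!("{}: {}", path, diagnose_socket(path));
}
```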

### No system metrics

**Problem**: System metrics not appearing

**Solutions**:
- Check diagnostics-node logs
- Verify Redis is accessible: `redis-cli ping`
- Ensure diagnostics-node has system permissions

## Development

### Build

```bash
cargo build -p mecha10-nodes-diagnostics
```

### Test

```bash
# Run the node
cargo run -p mecha10-nodes-diagnostics

# In another terminal, subscribe to both topics with a single command
# (redis-cli blocks after the first SUBSCRIBE, so a second one cannot be entered)
redis-cli SUBSCRIBE /diagnostics/docker/containers /diagnostics/system/resources
```

### Logs

```bash
RUST_LOG=debug cargo run -p mecha10-nodes-diagnostics
```

## See Also

- [Diagnostics System Summary](../../../DIAGNOSTICS_SYSTEM_FINAL_SUMMARY.md)
- [CLI Diagnostics Command](../../../CLI_DIAGNOSTICS_COMPLETE.md)
- [Dashboard Diagnostics Page](../../../DASHBOARD_DIAGNOSTICS_COMPLETE.md)
- [Diagnostics Service Package](../../services/diagnostics/README.md)