aprender-compute 0.32.0

# TUI Compute Mode Flow: CPU, GPU, and Memory Monitoring

**Specification Version:** 1.1.0
**Status:** Review
**Date:** 2026-01-03
**Authors:** PAIML Engineering Team
**Validation:** 100% Probador Pixel-by-Pixel Tested

## Abstract

This specification defines a real-time Terminal User Interface (TUI) for monitoring compute flow, memory utilization, and data movement across heterogeneous hardware (CPU, NVIDIA GPU, AMD GPU, Apple Metal) in the Trueno compute ecosystem. It supports local and **distributed compute monitoring** (via **repartir**) and seamless **SSH-based remote monitoring** for heterogeneous clusters. The design follows Toyota Way principles (Iron Lotus Framework) and includes a comprehensive 100-point Popperian falsification test suite for QA validation.

---

## Table of Contents

1. [Architecture Overview](#1-architecture-overview)
2. [Hardware Abstraction Layer](#2-hardware-abstraction-layer)
3. [Memory Hierarchy Monitoring](#3-memory-hierarchy-monitoring)
4. [Compute Flow Visualization](#4-compute-flow-visualization)
5. [Data Flow Tracking](#5-data-flow-tracking)
6. [TUI Layout Specification](#6-tui-layout-specification)
7. [Stress Test Mode (--stress-test)](#7-stress-test-mode---stress-test)
8. [Probador Pixel Testing Integration](#8-probador-pixel-testing-integration)
9. [100-Point Popperian Falsification Suite](#9-100-point-popperian-falsification-suite)
10. [Peer-Reviewed Citations](#10-peer-reviewed-citations)
11. [Implementation Roadmap](#11-implementation-roadmap)

---

## 1. Architecture Overview

### 1.1 System Context

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                        TUI Compute Mode Flow Monitor                         │
├─────────────────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐      │
│  │   CPU       │   │ NVIDIA GPU  │   │  AMD GPU    │   │  Memory     │      │
│  │  Monitor    │   │   Monitor   │   │   Monitor   │   │  Monitor    │      │
│  └──────┬──────┘   └──────┬──────┘   └──────┬──────┘   └──────┬──────┘      │
│         │                 │                 │                 │              │
│         └────────────┬────┴────────────┬────┴─────────────────┘              │
│                      │                 │                                      │
│              ┌───────▼─────────────────▼───────┐                             │
│              │   Unified Telemetry Collector   │                             │
│              │     (trueno-gpu + sysinfo)      │                             │
│              └───────────────┬─────────────────┘                             │
│                              │                                                │
│              ┌───────────────▼─────────────────┐                             │
│              │    TUI Renderer (presentar)     │                             │
│              │   - Sparklines (60-point)       │                             │
│              │   - Gauges (memory bars)        │                             │
│              │   - Tables (process list)       │                             │
│              │   - Heatmaps (data flow)        │                             │
│              └─────────────────────────────────┘                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

### 1.2 Design Principles (Toyota Way)

| Principle | Application | Citation |
|-----------|-------------|----------|
| **Genchi Genbutsu** | Direct hardware sampling via trueno-gpu CUDA/ROCm bindings | [Liker2004] §10 |
| **Jidoka** | Automatic anomaly detection (Isolation Forest) with circuit breakers | [Liker2004] §11, [Liu2008] |
| **Heijunka** | Level-loaded polling (adaptive 10ms-1000ms intervals) | [Liker2004] §4 |
| **Muda** | Zero-copy telemetry with ring buffers | [Ohno1988] §3 |
| **Poka-Yoke** | Type-safe metric structs prevent unit confusion | [Shingo1986] §2 |
| **Mieruka** | Visual control (Sparklines, Heatmaps) for instant understanding | [Liker2004] §13, [Tufte2006] |

### 1.3 Integration Points

```toml
# Crate dependencies (Cargo.toml)
[dependencies]
trueno = { version = "0.10", features = ["gpu", "cuda-monitor"] }
trueno-gpu = { version = "0.4", features = ["cuda"] }
repartir = { version = "1.1", features = ["tui", "gpu", "remote-tls"] } # Validated v1.1
renacer = { version = "0.9", features = ["chaos-full", "otlp"] }
probar = { version = "0.4", features = ["tui", "gpu"] }
sysinfo = "0.32"
nvml-wrapper = "0.10"   # NVIDIA Management Library
rocm-smi-lib = "0.2"    # AMD ROCm System Management Interface
```

### 1.4 Hardware Verification Matrix

| Environment | Access Method | Primary Device | Backend | Status |
|-------------|---------------|----------------|---------|--------|
| **Linux Dev** | Local | RTX 4090 | CUDA/Vulkan | ✅ Verified |
| **Intel Mac** | SSH (`ssh mac`) | AMD Radeon Pro | Metal | ⚠️ **Required** |
| **Apple Silicon** | Local | M1/M2/M3 | Metal | ⏳ Pending |
| **AMD ROCm** | SSH | Instinct MI210 | ROCm | ⏳ Pending |

---

## 2. Hardware Abstraction Layer

### 2.1 Unified Device Trait

```rust
/// Unified compute device abstraction (TRUENO-SPEC-020)
pub trait ComputeDevice: Send + Sync {
    /// Device identification
    fn device_id(&self) -> DeviceId;
    fn device_name(&self) -> &str;
    fn device_type(&self) -> DeviceType;

    /// Compute metrics
    fn compute_utilization(&self) -> Result<f64>;      // 0.0-100.0%
    fn compute_clock_mhz(&self) -> Result<u32>;
    fn compute_temperature_c(&self) -> Result<f64>;
    fn compute_power_watts(&self) -> Result<f64>;
    fn compute_power_limit_watts(&self) -> Result<f64>;

    /// Memory metrics
    fn memory_used_bytes(&self) -> Result<u64>;
    fn memory_total_bytes(&self) -> Result<u64>;
    fn memory_bandwidth_gbps(&self) -> Result<f64>;

    /// Streaming multiprocessor / Compute Unit metrics
    fn sm_count(&self) -> u32;
    fn active_sm_count(&self) -> Result<u32>;

    /// PCIe / Interconnect metrics
    fn pcie_tx_bytes_per_sec(&self) -> Result<u64>;
    fn pcie_rx_bytes_per_sec(&self) -> Result<u64>;
    fn pcie_generation(&self) -> u8;
    fn pcie_width(&self) -> u8;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceType {
    Cpu,
    NvidiaGpu,
    AmdGpu,
    IntelGpu,
    AppleSilicon,
    Hpu, // Hardware Processing Unit (e.g., Gaudi, TPU)
}
```
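
To show how callers might consume the trait through dynamic dispatch, here is a small sketch. It assumes a reduced, hypothetical trait (`MemoryProbe`) with `String` errors instead of the specification's `Result` alias, and a `MockDevice` with fixed readings, so it runs without any hardware.

```rust
/// Reduced, hypothetical subset of `ComputeDevice`, with `String` errors for brevity.
pub trait MemoryProbe {
    fn device_name(&self) -> &str;
    fn memory_used_bytes(&self) -> Result<u64, String>;
    fn memory_total_bytes(&self) -> Result<u64, String>;
}

/// Stand-in device with fixed readings (no hardware access).
pub struct MockDevice {
    pub name: String,
    pub used: u64,
    pub total: u64,
}

impl MemoryProbe for MockDevice {
    fn device_name(&self) -> &str {
        &self.name
    }
    fn memory_used_bytes(&self) -> Result<u64, String> {
        Ok(self.used)
    }
    fn memory_total_bytes(&self) -> Result<u64, String> {
        Ok(self.total)
    }
}

/// Memory utilization in percent; a zero total is reported as an error.
pub fn memory_utilization_pct(dev: &dyn MemoryProbe) -> Result<f64, String> {
    let total = dev.memory_total_bytes()?;
    if total == 0 {
        return Err("total memory reported as 0".to_string());
    }
    Ok(dev.memory_used_bytes()? as f64 / total as f64 * 100.0)
}

/// One summary line per device; a failing device does not abort the others.
pub fn summarize(devices: &[Box<dyn MemoryProbe>]) -> Vec<String> {
    devices
        .iter()
        .map(|d| match memory_utilization_pct(d.as_ref()) {
            Ok(pct) => format!("{}: {:.1}%", d.device_name(), pct),
            Err(e) => format!("{}: n/a ({})", d.device_name(), e),
        })
        .collect()
}
```

Handling errors per device matters for a monitor, because one unreadable sensor should degrade a single row rather than the whole display.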

### 2.2 NVIDIA GPU Implementation (via NVML)

```rust
use std::sync::Arc;

use nvml_wrapper::{enum_wrappers::device::TemperatureSensor, Device, Nvml};

/// NVIDIA GPU monitor using NVML (cuda-monitor feature)
pub struct NvidiaDevice {
    nvml: Arc<Nvml>, // shared handle: `Nvml` does not implement `Clone`
    index: u32,
    name: String,    // cached at enumeration so `device_name` can return `&str`
}

impl NvidiaDevice {
    pub fn enumerate() -> Result<Vec<Self>> {
        let nvml = Arc::new(Nvml::init()?);
        let count = nvml.device_count()?;
        (0..count)
            .map(|i| {
                let name = nvml
                    .device_by_index(i)?
                    .name()
                    .unwrap_or_else(|_| "Unknown NVIDIA GPU".to_string());
                Ok(Self { nvml: Arc::clone(&nvml), index: i, name })
            })
            .collect()
    }

    /// `Device` borrows from `Nvml`, so it is looked up per call instead of stored.
    fn device(&self) -> Result<Device<'_>> {
        Ok(self.nvml.device_by_index(self.index)?)
    }
}

impl ComputeDevice for NvidiaDevice {
    fn device_name(&self) -> &str {
        &self.name
    }

    fn compute_utilization(&self) -> Result<f64> {
        let util = self.device()?.utilization_rates()?;
        Ok(util.gpu as f64)
    }

    fn memory_used_bytes(&self) -> Result<u64> {
        Ok(self.device()?.memory_info()?.used)
    }

    fn memory_total_bytes(&self) -> Result<u64> {
        Ok(self.device()?.memory_info()?.total)
    }

    fn compute_temperature_c(&self) -> Result<f64> {
        Ok(self.device()?.temperature(TemperatureSensor::Gpu)? as f64)
    }

    fn compute_power_watts(&self) -> Result<f64> {
        Ok(self.device()?.power_usage()? as f64 / 1000.0) // mW to W
    }

    // Remaining `ComputeDevice` methods are not shown here.
}
```

### 2.3 AMD GPU Implementation (via ROCm SMI)

```rust
/// AMD GPU monitor using ROCm SMI
pub struct AmdDevice {
    device_index: u32,
}

impl AmdDevice {
    pub fn enumerate() -> Result<Vec<Self>> {
        let count = rocm_smi::num_devices()?;
        (0..count).map(|i| Ok(Self { device_index: i })).collect()
    }
}

impl ComputeDevice for AmdDevice {
    fn device_name(&self) -> &str {
        rocm_smi::get_name(self.device_index)
            .unwrap_or("Unknown AMD GPU")
    }

    fn compute_utilization(&self) -> Result<f64> {
        Ok(rocm_smi::get_gpu_busy_percent(self.device_index)? as f64)
    }

    fn memory_used_bytes(&self) -> Result<u64> {
        rocm_smi::get_memory_usage(self.device_index)
    }

    fn memory_total_bytes(&self) -> Result<u64> {
        rocm_smi::get_memory_total(self.device_index)
    }

    fn compute_temperature_c(&self) -> Result<f64> {
        Ok(rocm_smi::get_temp_metric(
            self.device_index,
            RocmTemperatureType::Edge
        )? as f64 / 1000.0) // millidegrees to degrees
    }
}
```

### 2.4 CPU Implementation (via sysinfo)

```rust
/// CPU monitor using sysinfo crate
pub struct CpuDevice {
    system: System,
    core_count: usize,
}

impl CpuDevice {
    pub fn new() -> Self {
        let mut system = System::new_all();
        system.refresh_cpu_usage();
        Self {
            core_count: system.cpus().len(),
            system,
        }
    }

    pub fn refresh(&mut self) {
        self.system.refresh_cpu_usage();
        self.system.refresh_memory();
    }
}

impl ComputeDevice for CpuDevice {
    fn device_name(&self) -> &str {
        self.system.cpus().first()
            .map(|c| c.brand())
            .unwrap_or("Unknown CPU")
    }

    fn compute_utilization(&self) -> Result<f64> {
        let total: f32 = self.system.cpus().iter()
            .map(|c| c.cpu_usage())
            .sum();
        Ok((total / self.core_count as f32) as f64)
    }

    fn memory_used_bytes(&self) -> Result<u64> {
        Ok(self.system.used_memory())
    }

    fn memory_total_bytes(&self) -> Result<u64> {
        Ok(self.system.total_memory())
    }

    fn compute_temperature_c(&self) -> Result<f64> {
        // Platform-specific: Linux reads from /sys/class/thermal
        #[cfg(target_os = "linux")]
        {
            let temp = std::fs::read_to_string(
                "/sys/class/thermal/thermal_zone0/temp"
            )?;
            Ok(temp.trim().parse::<f64>()? / 1000.0)
        }
        #[cfg(not(target_os = "linux"))]
        Err(Error::NotSupported)
    }
}
```
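
The Linux branch above divides by 1000 because sysfs reports temperatures as integer millidegrees Celsius. The helper below isolates that conversion so it can be tested without touching the filesystem. `parse_millidegrees` is a hypothetical name, and which `thermal_zoneN` corresponds to the CPU package varies by platform, so treat the hard-coded zone in the specification code as an assumption.

```rust
/// Convert a raw sysfs thermal reading (integer millidegrees Celsius,
/// usually newline-terminated) into degrees Celsius.
pub fn parse_millidegrees(raw: &str) -> Result<f64, std::num::ParseIntError> {
    let millidegrees: i64 = raw.trim().parse()?;
    Ok(millidegrees as f64 / 1000.0)
}
```

Parsing as an integer first rejects malformed readings such as an empty file, instead of silently producing a misleading temperature.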

---

## 3. Memory Hierarchy Monitoring

### 3.1 Memory Levels

```
┌────────────────────────────────────────────────────────────────────────────┐
│                         Memory Hierarchy View                               │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  L1 Cache (per core)     L2 Cache (shared)      L3 Cache (LLC)             │
│  ┌────────────────┐      ┌────────────────┐     ┌────────────────┐         │
│  │ █████░░░ 62%   │      │ ████████░ 89%  │     │ ███████░░ 78%  │         │
│  │ 20 KB / 32 KB  │      │ 228K / 256K    │     │ 28MB / 36MB    │         │
│  └────────────────┘      └────────────────┘     └────────────────┘         │
│                                                                             │
│  System RAM                           SWAP                                  │
│  ┌──────────────────────────────┐    ┌──────────────────────────────┐      │
│  │ ████████████████░░░░ 72.4%   │    │ ██░░░░░░░░░░░░░░░░░░ 8.2%    │      │
│  │ 46.3 GB / 64.0 GB            │    │ 1.3 GB / 16.0 GB             │      │
│  │ ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂ │    │ ▁▁▁▁▁▂▂▂▃▃▃▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁ │      │
│  └──────────────────────────────┘    └──────────────────────────────┘      │
│                                                                             │
│  GPU VRAM (NVIDIA RTX 4090)           GPU VRAM (AMD Radeon Pro W5700X)     │
│  ┌──────────────────────────────┐    ┌──────────────────────────────┐      │
│  │ ██████████████░░░░░░ 58.3%   │    │ ████████░░░░░░░░░░░░ 34.7%   │      │
│  │ 14.0 GB / 24.0 GB            │    │ 5.6 GB / 16.0 GB             │      │
│  │ ▁▂▃▄▅▆▇█▇▆▅▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆ │    │ ▁▁▂▂▃▃▄▄▅▅▄▄▃▃▂▂▁▁▁▁▂▂▃▃▄▄▅ │      │
│  └──────────────────────────────┘    └──────────────────────────────┘      │
│                                                                             │
└────────────────────────────────────────────────────────────────────────────┘
```

### 3.2 Memory Metrics Structure

```rust
/// Comprehensive memory metrics (TRUENO-SPEC-021)
#[derive(Debug, Clone)]
pub struct MemoryMetrics {
    // System RAM
    pub ram_used_bytes: u64,
    pub ram_total_bytes: u64,
    pub ram_available_bytes: u64,
    pub ram_cached_bytes: u64,
    pub ram_buffers_bytes: u64,

    // Swap
    pub swap_used_bytes: u64,
    pub swap_total_bytes: u64,

    // Per-GPU VRAM
    pub gpu_vram: Vec<GpuVramMetrics>,

    // Memory pressure (LAMBDA-0002)
    pub pressure_level: PressureLevel,
    pub safe_parallel_jobs: u32,

    // Bandwidth (measured)
    pub ram_read_bandwidth_gbps: f64,
    pub ram_write_bandwidth_gbps: f64,

    // History (60-point sparkline)
    pub ram_history: VecDeque<f64>,
    pub swap_history: VecDeque<f64>,
}

#[derive(Debug, Clone)]
pub struct GpuVramMetrics {
    pub device_id: DeviceId,
    pub used_bytes: u64,
    pub total_bytes: u64,
    pub reserved_bytes: u64,  // Driver/system reserved
    pub bar1_used_bytes: u64, // PCIe BAR1 aperture
    pub history: VecDeque<f64>,
}

/// Memory pressure levels (from lambda-lab-rust-development)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PressureLevel {
    Ok,        // >= 50% available
    Elevated,  // 30-50% available
    Warning,   // 15-30% available
    Critical,  // < 15% available
}
```
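
The `*_history` fields are described as 60-point sparklines. One way to maintain and render such a history is sketched below. The function names, the percent scale of the samples, and the eight-level glyph mapping are assumptions for illustration, not the presentar renderer's actual behavior.

```rust
use std::collections::VecDeque;

/// Number of samples kept per sparkline (60 in the specification).
pub const SPARKLINE_POINTS: usize = 60;

/// Append a sample, dropping the oldest one once the window is full.
pub fn push_sample(history: &mut VecDeque<f64>, sample: f64) {
    if history.len() == SPARKLINE_POINTS {
        history.pop_front();
    }
    history.push_back(sample);
}

/// Render percent samples (0.0-100.0) as block glyphs, oldest first.
pub fn render_sparkline(history: &VecDeque<f64>) -> String {
    const BARS: [char; 8] = ['▁', '▂', '▃', '▄', '▅', '▆', '▇', '█'];
    history
        .iter()
        .map(|&pct| {
            let clamped = pct.clamp(0.0, 100.0);
            // Map 0..=100 onto glyph indices 0..=7.
            let idx = (clamped / 100.0 * 7.0).round() as usize;
            BARS[idx]
        })
        .collect()
}
```

A `VecDeque` gives constant-time pushes and pops at both ends, which suits a fixed window that is updated on every poll.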

### 3.3 Memory Pressure Calculation

Based on [LAMBDA-0002] specification:

```rust
/// Calculate memory pressure and safe job count
pub fn analyze_pressure(metrics: &MemoryMetrics) -> PressureAnalysis {
    let available_pct = (metrics.ram_available_bytes as f64
        / metrics.ram_total_bytes as f64) * 100.0;

    let level = match available_pct {
        x if x >= 50.0 => PressureLevel::Ok,
        x if x >= 30.0 => PressureLevel::Elevated,
        x if x >= 15.0 => PressureLevel::Warning,
        _ => PressureLevel::Critical,
    };

    // Safe jobs = min(available_gb / 3.0, cpu_cores)
    // Based on 3GB/job heuristic [Volkov2008]
    let available_gb = metrics.ram_available_bytes as f64 / (1024.0 * 1024.0 * 1024.0);
    let cpu_cores = num_cpus::get() as u32;
    let safe_jobs = ((available_gb / 3.0) as u32).min(cpu_cores).max(1);

    PressureAnalysis {
        level,
        available_percent: available_pct as u32,
        available_gb,
        safe_jobs,
        block_builds: level == PressureLevel::Critical,
        recommendation: pressure_recommendation(level),
    }
}
```
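
A worked example of the thresholds and the 3 GB-per-job heuristic may help. The sketch below re-implements only the two calculations as standalone functions (`classify` and `safe_jobs` are illustrative names) and takes the core count as a parameter instead of calling `num_cpus`. It also treats a zero total as `Critical`, which is an assumption the specification does not state.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PressureLevel {
    Ok,
    Elevated,
    Warning,
    Critical,
}

/// Classify pressure from the available/total ratio.
pub fn classify(available_bytes: u64, total_bytes: u64) -> PressureLevel {
    if total_bytes == 0 {
        return PressureLevel::Critical; // assumption: unknown total is unsafe
    }
    let pct = available_bytes as f64 / total_bytes as f64 * 100.0;
    if pct >= 50.0 {
        PressureLevel::Ok
    } else if pct >= 30.0 {
        PressureLevel::Elevated
    } else if pct >= 15.0 {
        PressureLevel::Warning
    } else {
        PressureLevel::Critical
    }
}

/// Safe parallel jobs = min(available_gb / 3, cpu_cores), never below 1.
pub fn safe_jobs(available_bytes: u64, cpu_cores: u32) -> u32 {
    const GIB: f64 = 1024.0 * 1024.0 * 1024.0;
    let available_gb = available_bytes as f64 / GIB;
    ((available_gb / 3.0) as u32).min(cpu_cores).max(1)
}
```

For a 64 GiB machine with 17 GiB available, the ratio is about 26.6%, which falls in the `Warning` band, and 17 / 3 truncates to 5 safe jobs on a 16-core host.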

---

## 4. Compute Flow Visualization

### 4.1 Compute Pipeline View

```
┌────────────────────────────────────────────────────────────────────────────┐
│                         Compute Pipeline Flow                               │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐  │
│  │   INPUT     │    │   COMPUTE   │    │   REDUCE    │    │   OUTPUT    │  │
│  │  (Host→Dev) │───▶│   (Kernel)  │───▶│   (Tile)    │───▶│  (Dev→Host) │  │
│  └─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘  │
│                                                                             │
│  Stage Latency:                                                             │
│  ├─ Input:   ████████████░░░░░░░░  2.34 ms (PCIe 4.0 x16)                  │
│  ├─ Compute: ██████████████████░░  8.92 ms (RTX 4090 @ 2520 MHz)           │
│  ├─ Reduce:  ████░░░░░░░░░░░░░░░░  0.87 ms (Tiled 16x16)                   │
│  └─ Output:  ██████░░░░░░░░░░░░░░  1.23 ms (DMA async)                     │
│                                                                             │
│  Total: 13.36 ms │ Throughput: 74.9 ops/s │ Efficiency: 89.2%              │
│                                                                             │
│  Active Kernels:                                                            │
│  ┌────────────────────────────────────────────────────────────────────┐    │
│  │ batched_gemm_tiled     [████████████░░░░░░░░] 58% │ 2048x2048x2048 │    │
│  │ fma_fusion_pass        [██████░░░░░░░░░░░░░░] 28% │ Optimizing...  │    │
│  │ tiled_reduction_sum    [████░░░░░░░░░░░░░░░░] 14% │ 16x16 tiles    │    │
│  └────────────────────────────────────────────────────────────────────┘    │
│                                                                             │
└────────────────────────────────────────────────────────────────────────────┘
```

### 4.2 Compute Metrics Structure

```rust
/// Compute pipeline metrics (TRUENO-SPEC-022)
#[derive(Debug, Clone)]
pub struct ComputeMetrics {
    // Per-device utilization
    pub devices: Vec<DeviceComputeMetrics>,

    // Active kernel tracking
    pub active_kernels: Vec<KernelExecution>,

    // Pipeline stage latencies
    pub input_latency_ms: f64,
    pub compute_latency_ms: f64,
    pub reduce_latency_ms: f64,
    pub output_latency_ms: f64,

    // Throughput
    pub operations_per_second: f64,
    pub flops_achieved: f64,
    pub flops_theoretical: f64,

    // Efficiency
    pub compute_efficiency_pct: f64,  // achieved/theoretical
    pub memory_efficiency_pct: f64,   // bandwidth utilization
}

#[derive(Debug, Clone)]
pub struct DeviceComputeMetrics {
    pub device_id: DeviceId,
    pub utilization_pct: f64,
    pub sm_active_pct: f64,
    pub warps_active: u32,
    pub warps_max: u32,
    pub clock_mhz: u32,
    pub clock_max_mhz: u32,
    pub power_watts: f64,
    pub power_limit_watts: f64,
    pub temperature_c: f64,
    pub throttle_reason: Option<ThrottleReason>,
    pub history: VecDeque<f64>, // 60-point sparkline
}

#[derive(Debug, Clone)]
pub struct KernelExecution {
    pub name: String,
    pub grid_dim: (u32, u32, u32),
    pub block_dim: (u32, u32, u32),
    pub shared_mem_bytes: usize,
    pub registers_per_thread: u32,
    pub occupancy_pct: f64,
    pub elapsed_ms: f64,
    pub status: KernelStatus,
}

#[derive(Debug, Clone, Copy)]
pub enum ThrottleReason {
    Power,
    Thermal,
    ApplicationClocks,
    SwPowerCap,
    HwSlowdown,
    SyncBoost,
    // "Not throttled" is expressed as `throttle_reason: None` on the Option field.
}
```
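
The derived figures in the pipeline mock-up (total latency, throughput, efficiency) follow from the stage latencies. The sketch below shows one plausible derivation. It assumes the four stages run back to back with no overlap, and `StageLatencies` and `efficiency_pct` are illustrative names rather than part of `ComputeMetrics`.

```rust
/// Per-stage latencies in milliseconds (illustrative subset of `ComputeMetrics`).
pub struct StageLatencies {
    pub input_ms: f64,
    pub compute_ms: f64,
    pub reduce_ms: f64,
    pub output_ms: f64,
}

impl StageLatencies {
    /// End-to-end latency, assuming the stages do not overlap.
    pub fn total_ms(&self) -> f64 {
        self.input_ms + self.compute_ms + self.reduce_ms + self.output_ms
    }

    /// Pipeline passes per second; 0.0 when no latency has been recorded.
    pub fn throughput_per_sec(&self) -> f64 {
        let total = self.total_ms();
        if total <= 0.0 {
            0.0
        } else {
            1000.0 / total
        }
    }
}

/// Achieved over theoretical, in percent; 0.0 when the peak is unknown.
pub fn efficiency_pct(achieved: f64, theoretical: f64) -> f64 {
    if theoretical <= 0.0 {
        0.0
    } else {
        achieved / theoretical * 100.0
    }
}
```

With the mock-up values (2.34, 8.92, 0.87, and 1.23 ms) the total is 13.36 ms, which gives roughly 74.9 passes per second, matching the figures shown in the pipeline view.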

---

## 5. Data Flow Tracking

### 5.1 Data Movement Visualization

```
┌────────────────────────────────────────────────────────────────────────────┐
│                          Data Flow Monitor                                  │
├────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Host ◀════════════════════════════════════════════════════════▶ Device    │
│       │                                                         │           │
│  RAM  │  ┌─────────────────────────────────────────────────┐   │  VRAM     │
│       │  │     PCIe 4.0 x16: 31.5 GB/s theoretical         │   │           │
│       │  │     ══════════════════════════════════════════  │   │           │
│       │  │     TX: ██████████████░░░░░░  12.4 GB/s (39%)   │   │           │
│       │  │     RX: ████████░░░░░░░░░░░░   6.8 GB/s (22%)   │   │           │
│       │  └─────────────────────────────────────────────────┘   │           │
│       │                                                         │           │
│       │  Active Transfers:                                      │           │
│       │  ┌─────────────────────────────────────────────────┐   │           │
│       │  │ H→D  tensor_a      [████████████░░░░] 78%  1.2GB│   │           │
│       │  │ H→D  tensor_b      [██████████████░░] 89%  1.2GB│   │           │
│       │  │ D→H  result        [████░░░░░░░░░░░░] 23%  256MB│   │           │
│       │  └─────────────────────────────────────────────────┘   │           │
│       │                                                         │           │
│  ┌────▼────┐    ┌──────────┐    ┌──────────┐    ┌────────▼───┐ │           │
│  │ Pinned  │───▶│ Staging  │───▶│ Compute  │───▶│  Result    │ │           │
│  │ Buffer  │    │  Buffer  │    │  Buffer  │    │  Buffer    │ │           │
│  │ 4.0 GB  │    │  2.0 GB  │    │ 14.0 GB  │    │  2.0 GB    │ │           │
│  └─────────┘    └──────────┘    └──────────┘    └────────────┘ │           │
│                                                                             │
│  Memory Bus Utilization:                                                    │
│  ┌─────────────────────────────────────────────────────────────────────┐   │
│  │ GPU Memory: ████████████████████░░░░░░░░░░  672 GB/s (67%)          │   │
│  │ ▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅ (60s)  │   │
│  └─────────────────────────────────────────────────────────────────────┘   │
│                                                                             │
└────────────────────────────────────────────────────────────────────────────┘
```

### 5.2 Data Transfer Metrics

```rust
/// Data transfer tracking (TRUENO-SPEC-023)
#[derive(Debug, Clone)]
pub struct DataFlowMetrics {
    // PCIe metrics
    pub pcie_generation: u8,
    pub pcie_width: u8,
    pub pcie_theoretical_gbps: f64,
    pub pcie_tx_gbps: f64,
    pub pcie_rx_gbps: f64,

    // Active transfers
    pub active_transfers: Vec<Transfer>,
    pub completed_transfers: VecDeque<Transfer>, // Last 100

    // Memory bus
    pub memory_bus_utilization_pct: f64,
    pub memory_read_gbps: f64,
    pub memory_write_gbps: f64,

    // Buffer pools
    pub pinned_memory_used_bytes: u64,
    pub pinned_memory_total_bytes: u64,
    pub staging_buffer_used_bytes: u64,

    // History
    pub pcie_tx_history: VecDeque<f64>,
    pub pcie_rx_history: VecDeque<f64>,
    pub memory_bus_history: VecDeque<f64>,
}

#[derive(Debug, Clone)]
pub struct Transfer {
    pub id: TransferId,
    pub direction: TransferDirection,
    pub source: MemoryLocation,
    pub destination: MemoryLocation,
    pub size_bytes: u64,
    pub transferred_bytes: u64,
    pub start_time: Instant,
    pub end_time: Option<Instant>,
    pub status: TransferStatus,
    pub label: String,
}

#[derive(Debug, Clone, Copy)]
pub enum TransferDirection {
    HostToDevice,
    DeviceToHost,
    DeviceToDevice,
    PeerToPeer,
}

#[derive(Debug, Clone, Copy)]
pub enum MemoryLocation {
    SystemRam,
    PinnedMemory,
    GpuVram(DeviceId),
    UnifiedMemory,
}
```
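
The `pcie_theoretical_gbps` value can be derived from `pcie_generation` and `pcie_width`. The sketch below uses the published per-lane transfer rates and line encodings for PCIe generations 1 through 5, and returns `None` for anything else. It returns gigabytes per second, matching the 31.5 GB/s shown in the mock-up, even though the field suffix reads `gbps`. The function name is an assumption.

```rust
/// Theoretical one-direction PCIe bandwidth in GB/s, before protocol overhead.
/// Returns `None` for generations this sketch does not cover.
pub fn pcie_theoretical_gb_per_s(generation: u8, width: u8) -> Option<f64> {
    // (giga-transfers per second per lane, encoding efficiency)
    let (gt_per_s, encoding) = match generation {
        1 => (2.5, 8.0 / 10.0),      // 8b/10b
        2 => (5.0, 8.0 / 10.0),      // 8b/10b
        3 => (8.0, 128.0 / 130.0),   // 128b/130b
        4 => (16.0, 128.0 / 130.0),  // 128b/130b
        5 => (32.0, 128.0 / 130.0),  // 128b/130b
        _ => return None,
    };
    // One transfer carries one bit per lane; divide by 8 for bytes.
    Some(gt_per_s * encoding * f64::from(width) / 8.0)
}

/// Link utilization in percent for a measured rate in GB/s.
pub fn link_utilization_pct(measured_gb_per_s: f64, theoretical_gb_per_s: f64) -> f64 {
    if theoretical_gb_per_s <= 0.0 {
        0.0
    } else {
        measured_gb_per_s / theoretical_gb_per_s * 100.0
    }
}
```

For a Gen 4 x16 link this yields about 31.5 GB/s, so the 12.4 GB/s TX and 6.8 GB/s RX rates in the mock-up correspond to roughly 39% and 22% utilization.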

---

## 6. TUI Layout Specification

### 6.1 Full Screen Layout (80x24 minimum, 160x48 recommended)

```
┌────────────────────────────────────────────────────────────────────────────────┐
│ TRUENO Compute Monitor v0.10.1 │ CPU: Intel Xeon │ GPU: RTX 4090 + W5700X │ F1│
├────────────────────────────────────────────────────────────────────────────────┤
│ [COMPUTE]────────────────────────────────────────────────────────────────────┐ │
│ │ CPU:  ████████████░░░░░░░░ 62.3% │ 3.8 GHz │ 45°C │ 125W / 250W           │ │
│ │ GPU0: ██████████████████░░ 89.1% │ 2.5 GHz │ 72°C │ 320W / 450W  [NVIDIA] │ │
│ │ GPU1: ████████░░░░░░░░░░░░ 34.7% │ 1.8 GHz │ 58°C │ 145W / 200W  [AMD]    │ │
│ │ ▁▂▃▄▅▆▇█▇▆▅▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆▇█▇▆▅▄▃▂▁▂▃▄▅▆ (60s history)     │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ [MEMORY]─────────────────────────────────────────────────────────────────────┐ │
│ │ RAM:   ████████████████████░░░░░░░░░░ 72.4% │ 46.3 / 64.0 GB │ OK         │ │
│ │ SWAP:  ██░░░░░░░░░░░░░░░░░░░░░░░░░░░░  8.2% │  1.3 / 16.0 GB │            │ │
│ │ VRAM0: ██████████████████░░░░░░░░░░░░ 58.3% │ 14.0 / 24.0 GB │ [RTX 4090] │ │
│ │ VRAM1: ████████████░░░░░░░░░░░░░░░░░░ 34.7% │  5.6 / 16.0 GB │ [W5700X]   │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ [DATA FLOW]──────────────────────────────────────────────────────────────────┐ │
│ │ PCIe TX: ██████████████░░░░░░ 12.4 GB/s │ RX: ████████░░░░░░░░░░ 6.8 GB/s │ │
│ │ MEM BW:  ████████████████████░░░░░░░░░░ 672 GB/s (67% of 1008 GB/s peak)  │ │
│ │ Transfers: H→D tensor_a [████████░░] 78% │ D→H result [██░░░░░░] 23%      │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
│ [KERNELS]────────────────────────────────────────────────────────────────────┐ │
│ │ batched_gemm_tiled     GPU0 [████████████░░] 58% │ 2048x2048 │ 8.92 ms    │ │
│ │ tiled_reduction_sum    GPU0 [████░░░░░░░░░░] 14% │ 16x16     │ 0.87 ms    │ │
│ └─────────────────────────────────────────────────────────────────────────────┘ │
├────────────────────────────────────────────────────────────────────────────────┤
│ q:Quit r:Refresh s:Stress Tab:Focus ↑↓:Navigate ?:Help │ Refresh: 100ms       │
└────────────────────────────────────────────────────────────────────────────────┘
```

### 6.2 TUI Widget Specifications

```rust
/// TUI layout configuration (TRUENO-SPEC-024)
pub struct TuiLayout {
    pub min_width: u16,   // 80
    pub min_height: u16,  // 24
    pub rec_width: u16,   // 160
    pub rec_height: u16,  // 48

    pub sections: Vec<Section>,
    pub refresh_rate_ms: u64,
    pub sparkline_points: usize,  // 60
}

pub struct Section {
    pub id: &'static str,
    pub title: String,
    pub height_pct: f32,  // fraction of screen height, 0.0-1.0 (not 0-100)
    pub widgets: Vec<Widget>,
}

pub enum Widget {
    Gauge {
        label: String,
        value_pct: f64,
        thresholds: (f64, f64, f64), // warning, critical, max
        suffix: String,
    },
    Sparkline {
        data: VecDeque<f64>,
        label: String,
        baseline: Option<f64>,
    },
    ProgressBar {
        label: String,
        progress: f64,
        total: String,
    },
    Table {
        headers: Vec<String>,
        rows: Vec<Vec<String>>,
        highlight_row: Option<usize>,
    },
    Text {
        content: String,
        style: TextStyle,
    },
}

/// Color scheme (colorblind-safe Viridis-based)
pub struct ColorScheme {
    pub ok: Color,        // #21918c (teal)
    pub warning: Color,   // #fde725 (yellow)
    pub critical: Color,  // #f03b20 (red-orange)
    pub neutral: Color,   // #3b528b (blue)
    pub background: Color,// #440154 (dark purple)
}
```
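
To make the `Gauge` variant concrete, the following sketch renders a bar like those in the layout mock-up and picks a severity from the `(warning, critical, max)` thresholds. The function names, the `Severity` enum, and the rounding rule are assumptions; the actual presentar widgets may behave differently.

```rust
/// Render a fixed-width bar of filled and empty cells for a percent value.
pub fn render_gauge(value_pct: f64, width: usize) -> String {
    let clamped = value_pct.clamp(0.0, 100.0);
    let filled = (clamped / 100.0 * width as f64).round() as usize;
    let mut bar = String::with_capacity(width * 3); // each glyph is 3 bytes in UTF-8
    bar.extend(std::iter::repeat('█').take(filled));
    bar.extend(std::iter::repeat('░').take(width - filled));
    bar
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity {
    Ok,
    Warning,
    Critical,
}

/// Map a value onto a severity using the gauge's (warning, critical, max) tuple.
pub fn severity(value_pct: f64, thresholds: (f64, f64, f64)) -> Severity {
    let (warning, critical, _max) = thresholds;
    if value_pct >= critical {
        Severity::Critical
    } else if value_pct >= warning {
        Severity::Warning
    } else {
        Severity::Ok
    }
}
```

A 62.3% reading on a 20-cell gauge fills 12 cells, which matches the CPU row in the layout above. The severity would then select one of the `ColorScheme` entries.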

### 6.3 Keyboard Controls

| Key | Action | Description |
|-----|--------|-------------|
| `q` | Quit | Exit the TUI |
| `r` | Refresh | Force immediate refresh |
| `s` | Stress Test | Toggle stress test mode |
| `Tab` | Focus | Cycle through sections |
| `↑`/`↓` | Navigate | Select rows in tables |
| `Enter` | Expand | Show detailed view |
| `?` | Help | Toggle help overlay |
| `a` | Alerts | Show alert panel |
| `e` | Export | Export metrics to JSON |
| `p` | Pause | Pause/resume monitoring |

---

## 7. Stress Test Mode (--stress-test)

### 7.1 Stress Test Objectives

The `--stress-test` mode saturates all compute and memory resources to validate system stability in four areas:

1. **Thermal throttling behavior** under sustained load
2. **Memory pressure handling** at 95%+ utilization
3. **PCIe bandwidth saturation** with concurrent transfers
4. **Error detection** under resource contention

**CRITICAL REQUIREMENT:** GPU stress tests MUST use native compute shaders (WGPU/CUDA/Metal/ROCm) to generate real thermal/power load. CPU-based "fake" GPU load loops are strictly prohibited and will fail QA.

### 7.2 CLI Interface

```bash
# Full stress test (all resources)
trueno-monitor --stress-test

# Targeted stress tests
trueno-monitor --stress-test --target cpu        # CPU only
trueno-monitor --stress-test --target gpu        # All GPUs (WGPU/CUDA/Metal)
trueno-monitor --stress-test --target gpu:0      # Specific GPU
trueno-monitor --stress-test --target memory     # RAM + VRAM
trueno-monitor --stress-test --target pcie       # PCIe bandwidth

# Duration and intensity
trueno-monitor --stress-test --duration 60s      # 60 second test
trueno-monitor --stress-test --intensity 0.8     # 80% of max load
trueno-monitor --stress-test --ramp-up 10s       # Gradual ramp-up

# Chaos integration (via renacer)
trueno-monitor --stress-test --chaos gentle      # With gentle chaos
trueno-monitor --stress-test --chaos aggressive  # With aggressive chaos
```
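
The `--target` and `--duration` values shown above have a simple grammar. A parser for them might look like the following sketch. The `Target` enum, the function names, and the error messages are assumptions, and only the seconds suffix from the examples is handled. The real tool's argument handling is not described in this specification.

```rust
use std::time::Duration;

/// Parsed form of a `--target` value (illustrative; mirrors the CLI examples).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Target {
    Cpu,
    Gpu(Option<u32>), // None = all GPUs
    Memory,
    Pcie,
}

/// Parse values such as `cpu`, `gpu`, `gpu:0`, `memory`, `pcie`.
pub fn parse_target(arg: &str) -> Result<Target, String> {
    match arg {
        "cpu" => Ok(Target::Cpu),
        "gpu" => Ok(Target::Gpu(None)),
        "memory" => Ok(Target::Memory),
        "pcie" => Ok(Target::Pcie),
        other => match other.strip_prefix("gpu:") {
            Some(index) => index
                .parse::<u32>()
                .map(|i| Target::Gpu(Some(i)))
                .map_err(|e| format!("invalid GPU index {index:?}: {e}")),
            None => Err(format!("unknown stress target {other:?}")),
        },
    }
}

/// Parse a whole-second duration such as `60s`.
pub fn parse_seconds(arg: &str) -> Result<Duration, String> {
    let digits = arg
        .strip_suffix('s')
        .ok_or_else(|| format!("expected a value such as 60s, got {arg:?}"))?;
    digits
        .parse::<u64>()
        .map(Duration::from_secs)
        .map_err(|e| format!("invalid duration {arg:?}: {e}"))
}
```

Rejecting unknown targets up front avoids starting a stress run against the wrong resource because of a typo.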



### 7.3 Stress Test Implementation

```rust
/// Stress test configuration (TRUENO-SPEC-025)
#[derive(Debug, Clone)]
pub struct StressTestConfig {
    pub target: StressTarget,
    pub duration: Duration,
    pub intensity: f64,        // 0.0-1.0
    pub ramp_up: Duration,
    pub chaos_preset: Option<ChaosPreset>,
    pub collect_metrics: bool,
    pub export_report: bool,
}

#[derive(Debug, Clone)]
pub enum StressTarget {
    All,
    Cpu,
    Gpu(Option<DeviceId>),
    Memory,
    Pcie,
    Custom(Vec<StressTarget>),
}

/// Stress test runner
pub struct StressTestRunner {
    config: StressTestConfig,
    metrics: Arc<Mutex<StressMetrics>>,
    workers: Vec<JoinHandle<()>>,
}

impl StressTestRunner {
    pub async fn run(&mut self) -> Result<StressTestReport> {
        // Phase 1: Ramp-up
        self.ramp_up().await?;

        // Phase 2: Sustained load
        self.sustain_load().await?;

        // Phase 3: Cool-down and report
        self.cool_down().await?;

        Ok(self.generate_report())
    }

    async fn stress_cpu(&self) {
        // SIMD-heavy matrix operations via trueno
        let size = 4096;
        let a = Matrix::<f32>::random(size, size);
        let b = Matrix::<f32>::random(size, size);

        loop {
            let _ = a.matmul_simd(&b);
            if self.should_stop() { break; }
        }
    }

    // Returns `Result` so that dispatch errors can be propagated with `?`.
    async fn stress_gpu(&self, device_id: DeviceId) -> Result<()> {
        // MUST use WGPU/CUDA compute shader - NO CPU LOOPS
        let kernel = if self.is_cuda() {
            BatchedGemmKernel::tiled(64, 4096, 4096, 4096, 16)
        } else {
            // WGPU/Metal/Vulkan fallback
            WgpuComputeShader::stress_kernel(4096)
        };

        loop {
            self.dispatch_kernel(&kernel).await?;
            if self.should_stop() { break; }
        }
        Ok(())
    }

    async fn stress_memory(&self) {
        // Allocate and access memory to prevent caching
        let size = self.config.intensity * available_memory() * 0.9;
        let mut buffers: Vec<Vec<u8>> = Vec::new();

        // Fill to target utilization
        while total_allocated(&buffers) < size {
            let mut buf = vec![0u8; 1024 * 1024 * 64]; // 64MB chunks
            // Touch all pages (prevent lazy allocation)
            for chunk in buf.chunks_mut(4096) {
                chunk[0] = rand::random();
            }
            buffers.push(buf);
        }

        // Random access pattern to stress memory controller
        loop {
            let idx = rand::random::<usize>() % buffers.len();
            let offset = rand::random::<usize>() % buffers[idx].len();
            // black_box keeps the read from being optimized away
            std::hint::black_box(buffers[idx][offset]);
            if self.should_stop() { break; }
        }
    }

    async fn stress_pcie(&self) {
        // Concurrent H2D and D2H transfers
        let buffer_size = 256 * 1024 * 1024; // 256MB
        let host_buffer = vec![0u8; buffer_size];

        loop {
            // Overlapped transfers for maximum bandwidth
            tokio::join!(
                self.transfer_h2d(&host_buffer),
                self.transfer_d2h(buffer_size),
            );
            if self.should_stop() { break; }
        }
    }
}
```
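
The runner's ramp-up phase is not detailed in this section. As one possible interpretation, the sketch below computes a linear ramp from zero to the configured `intensity` over the `ramp_up` window. Both the linear shape and the function name are assumptions made for illustration.

```rust
use std::time::Duration;

/// Load fraction to apply `elapsed` after the test starts.
/// Assumes a linear ramp from 0.0 to `target` over `ramp_up`.
pub fn ramp_intensity(elapsed: Duration, ramp_up: Duration, target: f64) -> f64 {
    let target = target.clamp(0.0, 1.0);
    if ramp_up.is_zero() || elapsed >= ramp_up {
        return target; // no ramp configured, or ramp already finished
    }
    target * elapsed.as_secs_f64() / ramp_up.as_secs_f64()
}
```

With `--intensity 0.8 --ramp-up 10s`, this schedule would apply 40% load at the five-second mark and hold 80% from ten seconds onward.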



### 7.4 Stress Test Metrics

```rust
/// Stress test metrics collection
#[derive(Debug, Clone)]
pub struct StressMetrics {
    // Peak values
    pub peak_cpu_utilization: f64,
    pub peak_gpu_utilization: f64,
    pub peak_memory_utilization: f64,
    pub peak_temperature_c: f64,
    pub peak_power_watts: f64,
    pub peak_pcie_bandwidth_gbps: f64,

    // Throttling events
    pub thermal_throttle_count: u32,
    pub power_throttle_count: u32,
    pub memory_pressure_events: u32,

    // Errors
    pub gpu_errors: Vec<GpuError>,
    pub memory_errors: Vec<MemoryError>,
    pub transfer_errors: Vec<TransferError>,

    // Performance regression detection
    pub baseline_flops: f64,
    pub achieved_flops: f64,
    pub performance_degradation_pct: f64,
}

#[derive(Debug, Clone)]
pub struct StressTestReport {
    pub config: StressTestConfig,
    pub metrics: StressMetrics,
    pub duration_actual: Duration,
    pub verdict: StressTestVerdict,
    pub recommendations: Vec<String>,
}

#[derive(Debug, Clone, Copy)]
pub enum StressTestVerdict {
    Pass,           // All metrics within acceptable range
    PassWithNotes,  // Minor throttling, acceptable
    Fail,           // Errors or severe throttling
}
```
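
How a verdict might be derived from these metrics is sketched below. This is not the specified algorithm: `MetricsSummary`, `Verdict`, `gemm_flops`, `degradation_pct`, and `verdict` are hypothetical names, the two thresholds are caller-supplied assumptions, and the GEMM operation count assumes the arguments of `BatchedGemmKernel::tiled(64, 4096, 4096, 4096, 16)` mean (batch, m, n, k, tile).

```rust
/// Hypothetical mirror of `StressTestVerdict`, kept local so the sketch compiles alone.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict { Pass, PassWithNotes, Fail }

/// Hypothetical subset of `StressMetrics` that the verdict depends on.
#[derive(Debug, Clone, Copy)]
pub struct MetricsSummary {
    pub throttle_events: u32,
    pub error_count: usize,
    pub baseline_flops: f64,
    pub achieved_flops: f64,
}

/// Floating-point operations in a batched GEMM: 2 * batch * m * n * k
/// (one multiply and one add per inner-product term).
pub fn gemm_flops(batch: u64, m: u64, n: u64, k: u64) -> f64 {
    2.0 * batch as f64 * m as f64 * n as f64 * k as f64
}

/// Percentage drop of achieved throughput relative to the baseline.
/// Returns 0.0 when there is no baseline or when throughput improved.
pub fn degradation_pct(baseline: f64, achieved: f64) -> f64 {
    if baseline <= 0.0 {
        return 0.0;
    }
    ((baseline - achieved) / baseline * 100.0).max(0.0)
}

/// Assumed rule: any error, severe throttling, or excessive degradation fails;
/// any remaining throttling passes with notes.
pub fn verdict(m: &MetricsSummary, severe_throttles: u32, max_degradation_pct: f64) -> Verdict {
    let degradation = degradation_pct(m.baseline_flops, m.achieved_flops);
    if m.error_count > 0
        || m.throttle_events >= severe_throttles
        || degradation > max_degradation_pct
    {
        Verdict::Fail
    } else if m.throttle_events > 0 {
        Verdict::PassWithNotes
    } else {
        Verdict::Pass
    }
}
```

Dividing `gemm_flops(...)` by the measured kernel time in seconds gives an `achieved_flops` value that can be compared against `baseline_flops`.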



---



## 8. Probador Pixel Testing Integration



### 8.1 Pixel Coverage Strategy



Every TUI element is validated pixel-by-pixel using Probar's statistical coverage framework:



```rust
use std::collections::HashMap;
use std::path::Path;

/// Pixel coverage test configuration (PIXEL-001 v2.1)
pub struct TuiPixelCoverage {
    pub tracker: PixelCoverageTracker,
    pub regions: HashMap<String, PixelRegion>,
    pub thresholds: CoverageThresholds,
}

impl TuiPixelCoverage {
    pub fn new(width: u32, height: u32) -> Self {
        let tracker = PixelCoverageTracker::new(width, height, 20, 15);

        // Define critical regions as (x, y, width, height) in terminal cells.
        // Keys are owned Strings to match the `regions` field type.
        let regions = HashMap::from([
            ("header".to_string(), PixelRegion::new(0, 0, width, 1)),
            ("compute_section".to_string(), PixelRegion::new(0, 1, width, 5)),
            ("memory_section".to_string(), PixelRegion::new(0, 6, width, 4)),
            ("dataflow_section".to_string(), PixelRegion::new(0, 10, width, 3)),
            ("kernels_section".to_string(), PixelRegion::new(0, 13, width, 3)),
            // saturating_sub avoids underflow for a zero-height terminal
            ("footer".to_string(), PixelRegion::new(0, height.saturating_sub(1), width, 1)),
        ]);

        Self {
            tracker,
            regions,
            thresholds: CoverageThresholds::default(),
        }
    }

    /// Validate 100% pixel coverage
    pub fn validate_full_coverage(&self) -> Result<()> {
        let coverage = self.tracker.coverage_percent();

        if coverage < 100.0 {
            let gaps = self.tracker.find_gaps();
            // `.into()` converts to whatever error type the `Result` alias uses
            return Err(PixelCoverageError::IncompleteCoverage {
                actual: coverage,
                expected: 100.0,
                gaps,
            }
            .into());
        }

        Ok(())
    }

    /// Export heatmap for QA review
    pub fn export_heatmap(&self, path: &Path) -> Result<()> {
        PngHeatmap::new(self.tracker.width(), self.tracker.height())
            .with_palette(ColorPalette::viridis())
            .with_gap_highlighting()
            .with_legend()
            .with_title("TUI Compute Monitor Pixel Coverage")
            .export_to_file(self.tracker.cells(), path)
    }
}
```
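
The coverage bookkeeping that `coverage_percent()` and `find_gaps()` rely on can be pictured with a minimal stand-in. `GridCoverage` below is a hypothetical type written for illustration only; it tracks one flag per cell and is not Probar's actual `PixelCoverageTracker` implementation.

```rust
/// Hypothetical stand-in for a coverage tracker: one "rendered" flag per cell.
pub struct GridCoverage {
    width: u32,
    height: u32,
    hit: Vec<bool>,
}

impl GridCoverage {
    pub fn new(width: u32, height: u32) -> Self {
        Self {
            width,
            height,
            hit: vec![false; width as usize * height as usize],
        }
    }

    /// Marks a cell as rendered; out-of-bounds points are ignored.
    pub fn record_point(&mut self, x: u32, y: u32) {
        if x < self.width && y < self.height {
            let idx = y as usize * self.width as usize + x as usize;
            self.hit[idx] = true;
        }
    }

    /// Share of cells recorded at least once, in percent (0.0 for an empty grid).
    pub fn coverage_percent(&self) -> f64 {
        if self.hit.is_empty() {
            return 0.0;
        }
        let covered = self.hit.iter().filter(|&&h| h).count();
        covered as f64 * 100.0 / self.hit.len() as f64
    }

    /// Coordinates (x, y) of every cell that was never recorded, in row-major order.
    pub fn find_gaps(&self) -> Vec<(u32, u32)> {
        self.hit
            .iter()
            .enumerate()
            .filter_map(|(i, &h)| {
                if h {
                    None
                } else {
                    Some((i as u32 % self.width, i as u32 / self.width))
                }
            })
            .collect()
    }
}
```

With this model, a 4x2 grid with a single unrecorded cell reports 87.5% coverage and one gap, which is the situation `validate_full_coverage` is meant to reject.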



### 8.2 Widget-Level Testing



```rust
/// Test each widget renders correctly
#[cfg(test)]
mod pixel_tests {
    use probar::prelude::*;
    use std::error::Error;
    use std::path::Path;
    use super::*;

    // Tests return a Result so that `?` can propagate setup and I/O errors.
    // The fully qualified path avoids clashing with any `Result` alias in `super`.
    type TestResult = std::result::Result<(), Box<dyn Error>>;

    #[test]
    fn test_gauge_widget_coverage() -> TestResult {
        let mut coverage = TuiPixelCoverage::new(80, 24);
        let backend = TestBackend::new(80, 24);
        let mut terminal = Terminal::new(backend)?;

        terminal.draw(|f| {
            let gauge = Gauge::default()
                .percent(75)
                .label("CPU: 75%");
            f.render_widget(gauge, f.size());
        })?;

        // Record all rendered cells
        for (x, y, cell) in terminal.backend().buffer().cells() {
            if cell.symbol() != " " {
                coverage.tracker.record_point(x as u32, y as u32);
            }
        }

        assert!(coverage.tracker.coverage_percent() > 95.0);
        Ok(())
    }

    #[test]
    fn test_sparkline_widget_coverage() -> TestResult {
        let data: Vec<u64> = (0..60).map(|i| (i * 100 / 60) as u64).collect();

        let backend = TestBackend::new(80, 24);
        let mut terminal = Terminal::new(backend)?;

        terminal.draw(|f| {
            let sparkline = Sparkline::default()
                .data(&data)
                .style(Style::default().fg(Color::Cyan));
            f.render_widget(sparkline, f.size());
        })?;

        // Validate sparkline renders all 60 data points
        let rendered_points = count_non_empty_cells(terminal.backend().buffer());
        assert!(rendered_points >= 60, "Sparkline should render all data points");
        Ok(())
    }

    #[test]
    fn test_full_tui_layout_coverage() -> TestResult {
        let mut coverage = TuiPixelCoverage::new(160, 48);
        let app = TuiApp::new_with_mock_data();

        let backend = TestBackend::new(160, 48);
        let mut terminal = Terminal::new(backend)?;

        // Render full UI
        terminal.draw(|f| app.render(f))?;

        // Record all cells
        for (x, y, _cell) in terminal.backend().buffer().cells() {
            coverage.tracker.record_point(x as u32, y as u32);
        }

        // Export heatmap for QA review
        coverage.export_heatmap(Path::new("coverage_heatmap.png"))?;

        // Validate 100% coverage
        coverage.validate_full_coverage()?;
        Ok(())
    }
}
```



### 8.3 Visual Regression Testing



```rust
/// Visual regression tests for TUI
#[cfg(test)]
mod visual_regression_tests {
    use probar::visual_regression::*;
    use std::error::Error;

    // Returns a Result so that `?` can propagate loading and rendering errors
    #[test]
    fn test_tui_visual_stability() -> std::result::Result<(), Box<dyn Error>> {
        let reference = load_reference_snapshot("tui_main_view.png")?;

        // Render current TUI
        let current = render_tui_to_image()?;

        // Compare with multiple metrics
        let ssim = compute_ssim(&reference, &current)?;
        let psnr = compute_psnr(&reference, &current)?;
        let delta_e = compute_ciede2000(&reference, &current)?;

        // Thresholds based on [Wang2004] SSIM research
        assert!(ssim > 0.95, "SSIM should be > 0.95 for visual similarity");
        assert!(psnr > 30.0, "PSNR should be > 30dB for good quality");
        assert!(delta_e < 2.0, "CIEDE2000 ΔE should be < 2.0 for imperceptible diff");
        Ok(())
    }
}
```
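
Of the three metrics, PSNR is the simplest to state: for 8-bit samples it is `10 * log10(255^2 / MSE)`, where MSE is the mean squared difference between the two images. The sketch below shows that formula on raw byte buffers; `psnr_db` is an illustrative helper and makes no claim about how Probar's `compute_psnr` is implemented.

```rust
/// PSNR in decibels between two 8-bit sample buffers of equal length.
/// Returns `None` for empty or mismatched inputs and `f64::INFINITY`
/// when the buffers are identical (MSE of zero).
pub fn psnr_db(reference: &[u8], current: &[u8]) -> Option<f64> {
    if reference.is_empty() || reference.len() != current.len() {
        return None;
    }
    // Sum of squared per-sample differences
    let sum_sq: f64 = reference
        .iter()
        .zip(current)
        .map(|(&a, &b)| {
            let d = a as f64 - b as f64;
            d * d
        })
        .sum();
    let mse = sum_sq / reference.len() as f64;
    if mse == 0.0 {
        return Some(f64::INFINITY);
    }
    Some(10.0 * (255.0_f64 * 255.0 / mse).log10())
}
```

An off-by-one error in every sample gives an MSE of 1 and a PSNR of about 48 dB, comfortably above the 30 dB threshold used in the test above.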



---



## 9. 100-Point Popperian Falsification Suite



### 9.1 Falsification Philosophy



Following Karl Popper's philosophy of science [Popper1959], each test is designed to **potentially disprove** a hypothesis rather than confirm it. Tests are structured as:



> **H[n]**: [Hypothesis that could be false]

> **Test**: [Action that would reveal falsity]

> **Pass Criterion**: [Observable outcome if hypothesis holds]

> **Falsification**: [Observable outcome if hypothesis fails]
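
As a rough illustration of how this template could be represented and scored, the sketch below models one entry as a Rust type. The types and the `score` function are hypothetical, and the aggregation rule (a points threshold plus no failed critical test) is an assumption based on the suite's `passing_threshold` and `critical_tests` summary fields.

```rust
/// Severity levels as used in the suite's `severity` field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Severity { Low, Medium, High, Critical }

/// One H[n] entry. Field names mirror the template; the type is hypothetical.
#[derive(Debug, Clone)]
pub struct FalsificationTest {
    pub id: &'static str,
    pub hypothesis: &'static str,
    pub severity: Severity,
    pub points: f64, // f64 because some tests carry fractional points
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct SuiteOutcome {
    pub earned: f64,
    pub critical_failures: usize,
    pub passed: bool,
}

/// Scores a run. Each result pairs a test with `true` if its hypothesis
/// survived (was not falsified). Assumed rule: the suite passes only when the
/// earned points reach `threshold` and no critical test was falsified.
pub fn score(results: &[(FalsificationTest, bool)], threshold: f64) -> SuiteOutcome {
    let mut earned = 0.0;
    let mut critical_failures = 0;
    for (test, survived) in results {
        if *survived {
            earned += test.points;
        } else if test.severity == Severity::Critical {
            critical_failures += 1;
        }
    }
    SuiteOutcome {
        earned,
        critical_failures,
        passed: earned >= threshold && critical_failures == 0,
    }
}
```

Treating a falsified critical hypothesis as an automatic failure keeps a high point total from masking a broken core assumption.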



### 9.2 Complete Falsification Test Suite



```yaml
# File: tests/falsification/tui-compute-monitor.yaml
# 100-Point Popperian Falsification Suite for TUI Compute Monitor
# Version: 1.0.0
# Standard: PMAT-TDD-2024 + Iron Lotus Framework

metadata:
  specification: TRUENO-SPEC-020
  coverage_target: 100%
  mutation_target: 80%
  reviewed_by: QA Team Lead
  last_updated: 2026-01-03

# =============================================================================
# SECTION 1: HARDWARE DETECTION (20 points)
# =============================================================================

hardware_detection:
  - id: H001
    hypothesis: "NVIDIA GPU detection returns accurate device name"
    test: "Compare nvml device name with nvidia-smi output"
    pass_criterion: "Names match exactly"
    falsification: "Name mismatch or 'Unknown' returned"
    severity: critical
    points: 2

  - id: H002
    hypothesis: "AMD GPU detection works via ROCm SMI"
    test: "Call rocm_smi::get_name() and verify against rocm-smi CLI"
    pass_criterion: "Names match, no library errors"
    falsification: "rocm-smi-lib returns error or wrong name"
    severity: critical
    points: 2

  - id: H002b
    hypothesis: "Metal backend detects Apple GPUs on macOS"
    test: "Verify wgpu backend is Metal on macOS"
    pass_criterion: "Backend::Metal reported"
    falsification: "Backend::Vulkan/Gl or error"
    severity: critical
    points: 2

  - id: H003
    hypothesis: "CPU core count matches physical reality"
    test: "Compare num_cpus::get_physical() with the physical core count in /proc/cpuinfo"
    pass_criterion: "Physical core count matches"
    falsification: "Count mismatch (e.g., logical threads counted as cores due to hyperthreading)"
    severity: high
    points: 2

  - id: H004
    hypothesis: "Multi-GPU systems enumerate all devices"
    test: "System with 2+ GPUs, verify all detected"
    pass_criterion: "All GPUs in device list"
    falsification: "Missing GPU or duplicate entries"
    severity: critical
    points: 2

  - id: H005
    hypothesis: "Device enumeration is idempotent"
    test: "Call enumerate() 100 times in sequence"
    pass_criterion: "Same results every time"
    falsification: "Device count or order changes"
    severity: high
    points: 2

  - id: H006
    hypothesis: "Hot-plug GPU detection works"
    test: "Add/remove GPU and re-enumerate"
    pass_criterion: "Device list updates correctly"
    falsification: "Stale device list or crash"
    severity: medium
    points: 2

  - id: H007
    hypothesis: "Device IDs are stable across restarts"
    test: "Record device IDs, restart, compare"
    pass_criterion: "Same device gets same ID"
    falsification: "ID changes without hardware change"
    severity: medium
    points: 2

  - id: H008
    hypothesis: "PCIe topology is correctly identified"
    test: "Verify PCIe generation and width"
    pass_criterion: "Matches lspci output"
    falsification: "Wrong gen/width reported"
    severity: low
    points: 2

  - id: H009
    hypothesis: "Unified device trait works for all backends"
    test: "Call all ComputeDevice methods on CPU/NVIDIA/AMD"
    pass_criterion: "No panics, correct types returned"
    falsification: "Method panics or returns wrong type"
    severity: critical
    points: 2

  - id: H010
    hypothesis: "Device capability detection is accurate"
    test: "Query SM/CU count and verify against spec"
    pass_criterion: "Matches known hardware specs"
    falsification: "Wrong compute unit count"
    severity: medium
    points: 2

# =============================================================================
# SECTION 2: MEMORY METRICS (20 points)
# =============================================================================

memory_metrics:
  - id: H011
    hypothesis: "RAM usage matches /proc/meminfo"
    test: "Compare memory_used_bytes with MemTotal - MemAvailable"
    pass_criterion: "Within 1% of /proc/meminfo"
    falsification: "Deviation > 1%"
    severity: critical
    points: 2

  - id: H012
    hypothesis: "VRAM usage matches nvidia-smi"
    test: "Compare gpu_vram.used_bytes with nvidia-smi query"
    pass_criterion: "Within 1MB"
    falsification: "Deviation > 1MB"
    severity: critical
    points: 2

  - id: H013
    hypothesis: "Swap usage is correctly reported"
    test: "Compare with /proc/swaps and free -m"
    pass_criterion: "Values match"
    falsification: "Swap usage incorrect"
    severity: high
    points: 2

  - id: H014
    hypothesis: "Memory pressure levels trigger correctly"
    test: "Allocate memory until Critical level reached"
    pass_criterion: "Level transitions at correct thresholds"
    falsification: "Wrong level at known utilization"
    severity: critical
    points: 2

  - id: H015
    hypothesis: "Safe job calculation is conservative"
    test: "Run calculated safe_jobs in parallel"
    pass_criterion: "No OOM kill occurs"
    falsification: "OOM killer invoked"
    severity: critical
    points: 2

  - id: H016
    hypothesis: "Memory history sparkline has 60 points"
    test: "Run for 60 seconds at 1Hz, check history length"
    pass_criterion: "Exactly 60 points in VecDeque"
    falsification: "Wrong count or data corruption"
    severity: medium
    points: 2

  - id: H017
    hypothesis: "Pinned memory tracking is accurate"
    test: "Allocate 1GB pinned, verify reported"
    pass_criterion: "pinned_memory_used_bytes increases by ~1GB"
    falsification: "No change or wrong amount"
    severity: high
    points: 2

  - id: H018
    hypothesis: "Memory bandwidth measurement is reasonable"
    test: "Compare with STREAM benchmark results"
    pass_criterion: "Within 20% of STREAM"
    falsification: "Deviation > 20%"
    severity: medium
    points: 2

  - id: H019
    hypothesis: "Per-GPU VRAM is correctly attributed"
    test: "Allocate on GPU0 only, verify GPU1 unchanged"
    pass_criterion: "Only GPU0 VRAM increases"
    falsification: "Wrong GPU shows increase"
    severity: critical
    points: 2

  - id: H020
    hypothesis: "Memory metrics update within 100ms"
    test: "Allocate 100MB, measure time to reflect in UI"
    pass_criterion: "Update visible < 100ms"
    falsification: "Stale data shown > 100ms"
    severity: medium
    points: 2

# =============================================================================
# SECTION 3: COMPUTE METRICS (20 points)
# =============================================================================

compute_metrics:
  - id: H021
    hypothesis: "GPU utilization matches nvidia-smi"
    test: "Run stress kernel, compare utilization"
    pass_criterion: "Within 5% of nvidia-smi"
    falsification: "Deviation > 5%"
    severity: critical
    points: 2

  - id: H022
    hypothesis: "CPU utilization matches top/htop"
    test: "Run CPU stress, compare with top"
    pass_criterion: "Within 3% of top"
    falsification: "Deviation > 3%"
    severity: high
    points: 2

  - id: H023
    hypothesis: "Temperature readings are in Celsius"
    test: "Verify GPU temp is 30-90°C range under load"
    pass_criterion: "Realistic temperature values"
    falsification: "Impossible values (e.g., 300°C)"
    severity: critical
    points: 2

  - id: H024
    hypothesis: "Power readings are in Watts"
    test: "Verify power is 10-500W range for GPU"
    pass_criterion: "Reasonable power values"
    falsification: "Impossible values (e.g., 10000W)"
    severity: high
    points: 2

  - id: H025
    hypothesis: "Throttling detection works"
    test: "Force thermal throttle, verify detected"
    pass_criterion: "ThrottleReason::Thermal reported"
    falsification: "No throttle detected"
    severity: high
    points: 2

  - id: H026
    hypothesis: "Clock speed is correctly reported"
    test: "Compare with nvidia-smi -q"
    pass_criterion: "Within 50 MHz"
    falsification: "Deviation > 50 MHz"
    severity: medium
    points: 2

  - id: H027
    hypothesis: "SM/CU active count is dynamic"
    test: "Run partial workload, verify < 100% active"
    pass_criterion: "Active SM < total SM"
    falsification: "Always shows 100% or 0%"
    severity: medium
    points: 2

  - id: H028
    hypothesis: "FLOPS calculation is accurate"
    test: "Run known GEMM, compute achieved FLOPS"
    pass_criterion: "Within 10% of manual calculation"
    falsification: "Deviation > 10%"
    severity: high
    points: 2

  - id: H029
    hypothesis: "Compute efficiency percentage is valid"
    test: "Verify 0% <= efficiency <= 100%"
    pass_criterion: "Value in valid range"
    falsification: "Value < 0% or > 100%"
    severity: critical
    points: 2

  - id: H030
    hypothesis: "Kernel execution tracking is accurate"
    test: "Run 10 kernels, verify all tracked"
    pass_criterion: "10 entries in active_kernels"
    falsification: "Missing or extra kernel entries"
    severity: high
    points: 2

# =============================================================================
# SECTION 4: DATA FLOW TRACKING (15 points)
# =============================================================================

data_flow:
  - id: H031
    hypothesis: "PCIe bandwidth measurement is accurate"
    test: "Transfer 1GB, measure time, calculate bandwidth"
    pass_criterion: "Within 10% of theoretical"
    falsification: "Deviation > 10%"
    severity: high
    points: 2

  - id: H032
    hypothesis: "Transfer direction is correctly identified"
    test: "Do H2D transfer, verify direction = HostToDevice"
    pass_criterion: "Correct direction enum"
    falsification: "Wrong direction"
    severity: critical
    points: 2

  - id: H033
    hypothesis: "Transfer progress percentage is accurate"
    test: "Mid-transfer, verify progress"
    pass_criterion: "Progress = transferred / total"
    falsification: "Wrong percentage"
    severity: high
    points: 2

  - id: H034
    hypothesis: "Concurrent transfers are tracked"
    test: "Start 3 overlapped transfers"
    pass_criterion: "All 3 in active_transfers"
    falsification: "Missing transfers"
    severity: high
    points: 1

  - id: H035
    hypothesis: "Completed transfers move to history"
    test: "Complete transfer, check completed_transfers"
    pass_criterion: "Transfer in completed queue"
    falsification: "Transfer lost"
    severity: medium
    points: 1

  - id: H036
    hypothesis: "Memory bus utilization is bounded 0-100%"
    test: "Check utilization under various loads"
    pass_criterion: "0% <= util <= 100%"
    falsification: "Out of bounds value"
    severity: critical
    points: 2

  - id: H037
    hypothesis: "PCIe generation/width is correctly detected"
    test: "Compare with lspci -vv"
    pass_criterion: "Matches lspci output"
    falsification: "Wrong PCIe config"
    severity: medium
    points: 1

  - id: H038
    hypothesis: "Transfer labeling works"
    test: "Create transfer with label, verify retrieved"
    pass_criterion: "Label preserved"
    falsification: "Label lost or corrupted"
    severity: low
    points: 1

  - id: H039
    hypothesis: "Transfer timing is microsecond-accurate"
    test: "Measure known transfer, verify duration"
    pass_criterion: "Within 10% of wall-clock"
    falsification: "Timing significantly off"
    severity: medium
    points: 1

  - id: H040
    hypothesis: "Peer-to-peer transfers are detected"
    test: "D2D transfer between GPUs"
    pass_criterion: "Direction = DeviceToDevice"
    falsification: "Wrong direction"
    severity: high
    points: 2

  - id: H041
    hypothesis: "History queues maintain 60-point limit"
    test: "Run for 120 seconds, check queue length"
    pass_criterion: "Queue length = 60 (FIFO)"
    falsification: "Queue grows unbounded"
    severity: medium
    points: 2

# =============================================================================
# SECTION 5: TUI RENDERING (15 points)
# =============================================================================

tui_rendering:
  - id: H042
    hypothesis: "TUI renders at minimum 80x24"
    test: "Render on 80x24 terminal"
    pass_criterion: "No truncation or overflow"
    falsification: "Content cut off or panics"
    severity: critical
    points: 2

  - id: H043
    hypothesis: "TUI scales to 160x48 with more detail"
    test: "Render on 160x48 terminal"
    pass_criterion: "Additional detail visible"
    falsification: "Same as 80x24 or broken"
    severity: medium
    points: 1

  - id: H044
    hypothesis: "Gauge widgets show correct percentage"
    test: "Set CPU to 75%, verify gauge"
    pass_criterion: "Gauge shows 75%"
    falsification: "Wrong percentage displayed"
    severity: high
    points: 2

  - id: H045
    hypothesis: "Sparklines render all 60 data points"
    test: "Provide 60-point dataset"
    pass_criterion: "60 bars visible in sparkline"
    falsification: "Missing data points"
    severity: high
    points: 2

  - id: H046
    hypothesis: "Color scheme is colorblind-safe"
    test: "Simulate deuteranopia on screenshot"
    pass_criterion: "All elements distinguishable"
    falsification: "Critical info lost in simulation"
    severity: medium
    points: 1

  - id: H047
    hypothesis: "Keyboard navigation works"
    test: "Press Tab 10 times"
    pass_criterion: "Focus cycles through sections"
    falsification: "Focus stuck or skips section"
    severity: high
    points: 2

  - id: H048
    hypothesis: "Help overlay toggles with ?"
    test: "Press ? twice"
    pass_criterion: "Overlay appears then disappears"
    falsification: "Overlay stuck or missing"
    severity: medium
    points: 1

  - id: H049
    hypothesis: "Refresh rate is configurable"
    test: "Set --refresh-rate 50ms"
    pass_criterion: "Updates occur every ~50ms"
    falsification: "Update rate unchanged"
    severity: low
    points: 1

  - id: H050
    hypothesis: "Unicode characters render correctly"
    test: "Verify box-drawing and block chars"
    pass_criterion: "All chars display properly"
    falsification: "Garbled or missing chars"
    severity: high
    points: 1

  - id: H051
    hypothesis: "TUI handles terminal resize"
    test: "Resize terminal during operation"
    pass_criterion: "Layout adapts without crash"
    falsification: "Crash or frozen display"
    severity: high
    points: 2

# =============================================================================
# SECTION 6: STRESS TEST MODE (10 points)
# =============================================================================

stress_test:
  - id: H052
    hypothesis: "Stress test saturates CPU to >95%"
    test: "Run --stress-test --target cpu"
    pass_criterion: "CPU utilization > 95%"
    falsification: "CPU stays at or below 95%"
    severity: high
    points: 2

  - id: H053
    hypothesis: "Stress test saturates GPU to >90%"
    test: "Run --stress-test --target gpu"
    pass_criterion: "GPU utilization > 90%"
    falsification: "GPU stays at or below 90%"
    severity: high
    points: 2

  - id: H054
    hypothesis: "Memory stress reaches target utilization"
    test: "Run --stress-test --target memory --intensity 0.9"
    pass_criterion: "Memory at 90% +/- 5%"
    falsification: "Memory utilization off target"
    severity: high
    points: 2

  - id: H055
    hypothesis: "Stress test respects duration limit"
    test: "Run --stress-test --duration 10s"
    pass_criterion: "Test completes at ~10s"
    falsification: "Runs longer than 12s or stops early"
    severity: medium
    points: 1

  - id: H056
    hypothesis: "Ramp-up phase is gradual"
    test: "Run --stress-test --ramp-up 5s"
    pass_criterion: "Load increases linearly over 5s"
    falsification: "Instant full load or no ramp"
    severity: low
    points: 1

  - id: H057
    hypothesis: "Chaos integration works"
    test: "Run --stress-test --chaos gentle"
    pass_criterion: "Memory limit applied during stress"
    falsification: "No chaos effects visible"
    severity: medium
    points: 1

  - id: H058
    hypothesis: "Stress report is generated"
    test: "Run stress test to completion"
    pass_criterion: "JSON report written with metrics"
    falsification: "No report or empty report"
    severity: high
    points: 1

# =============================================================================
# SECTION 7: ERROR HANDLING (10 points)
# =============================================================================

error_handling:
  - id: H059
    hypothesis: "Missing NVIDIA driver handled gracefully"
    test: "Run on system without nvidia-smi"
    pass_criterion: "Falls back to CPU-only mode"
    falsification: "Crash or panic"
    severity: critical
    points: 2

  - id: H060
    hypothesis: "Missing ROCm handled gracefully"
    test: "Run on system without rocm-smi"
    pass_criterion: "Falls back to NVIDIA/CPU"
    falsification: "Crash or panic"
    severity: critical
    points: 2

  - id: H061
    hypothesis: "GPU driver crash recovery works"
    test: "Simulate driver timeout"
    pass_criterion: "Re-enumeration after recovery"
    falsification: "Stuck with stale state"
    severity: high
    points: 1

  - id: H062
    hypothesis: "Memory allocation failure handled"
    test: "Exhaust memory, attempt allocation"
    pass_criterion: "Error returned, no panic"
    falsification: "Panic or undefined behavior"
    severity: critical
    points: 2

  - id: H063
    hypothesis: "Invalid CLI arguments rejected"
    test: "Pass --invalid-flag"
    pass_criterion: "Usage error with help"
    falsification: "Crash or silent ignore"
    severity: medium
    points: 1

  - id: H064
    hypothesis: "Division by zero protected"
    test: "Device with 0 total memory"
    pass_criterion: "Percentage shows 0% or N/A"
    falsification: "Crash or NaN displayed"
    severity: critical
    points: 2

# =============================================================================
# SECTION 8: PIXEL COVERAGE (10 points) - PROBADOR INTEGRATION
# =============================================================================

pixel_coverage:
  - id: H065
    hypothesis: "Header region is 100% covered"
    test: "Render header, check pixel coverage"
    pass_criterion: "100% of header pixels touched"
    falsification: "Gap in header region"
    severity: high
    points: 1

  - id: H066
    hypothesis: "Compute section is 100% covered"
    test: "Render compute section with all gauges"
    pass_criterion: "100% pixel coverage"
    falsification: "Uncovered pixels"
    severity: high
    points: 1

  - id: H067
    hypothesis: "Memory section is 100% covered"
    test: "Render memory section with all bars"
    pass_criterion: "100% pixel coverage"
    falsification: "Uncovered pixels"
    severity: high
    points: 1

  - id: H068
    hypothesis: "Data flow section is 100% covered"
    test: "Render data flow with active transfers"
    pass_criterion: "100% pixel coverage"
    falsification: "Uncovered pixels"
    severity: high
    points: 1

  - id: H069
    hypothesis: "Kernels section is 100% covered"
    test: "Render with 5 active kernels"
    pass_criterion: "100% pixel coverage"
    falsification: "Uncovered pixels"
    severity: high
    points: 1

  - id: H070
    hypothesis: "Footer/help region is 100% covered"
    test: "Render footer with all key hints"
    pass_criterion: "100% pixel coverage"
    falsification: "Uncovered pixels"
    severity: medium
    points: 1

  - id: H071
    hypothesis: "SSIM > 0.95 vs reference image"
    test: "Compare render with golden master"
    pass_criterion: "SSIM score > 0.95"
    falsification: "SSIM <= 0.95"
    severity: high
    points: 1

  - id: H072
    hypothesis: "PSNR > 30dB vs reference image"
    test: "Compare render with golden master"
    pass_criterion: "PSNR > 30dB"
    falsification: "PSNR <= 30dB"
    severity: medium
    points: 1

  - id: H073
    hypothesis: "CIEDE2000 ΔE < 2.0"
    test: "Compare colors with reference"
    pass_criterion: "ΔE < 2.0 (imperceptible)"
    falsification: "ΔE >= 2.0 (visible diff)"
    severity: medium
    points: 1

  - id: H074
    hypothesis: "Heatmap export produces valid PNG"
    test: "Export coverage heatmap"
    pass_criterion: "Valid PNG file, correct dimensions"
    falsification: "Corrupt or wrong-sized PNG"
    severity: low
    points: 1

# =============================================================================
# SECTION 9: PERFORMANCE (10 points)
# =============================================================================

performance:
  - id: H075
    hypothesis: "Metric collection < 10ms overhead"
    test: "Measure time for full metric collection"
    pass_criterion: "Collection completes < 10ms"
    falsification: "Collection takes >= 10ms"
    severity: high
    points: 2

  - id: H076
    hypothesis: "TUI rendering < 16ms (60 FPS capable)"
    test: "Measure render time"
    pass_criterion: "Render completes < 16ms"
    falsification: "Render takes >= 16ms"
    severity: medium
    points: 2

  - id: H077
    hypothesis: "Memory usage < 50MB steady-state"
    test: "Run for 1 hour, measure RSS"
    pass_criterion: "RSS stays < 50MB"
    falsification: "RSS reaches or exceeds 50MB"
    severity: high
    points: 2

  - id: H078
    hypothesis: "No memory leaks over 24h run"
    test: "Run for 24 hours, compare start/end RSS"
    pass_criterion: "RSS difference < 10MB"
    falsification: "RSS grew >= 10MB"
    severity: critical
    points: 2

  - id: H079
    hypothesis: "CPU overhead < 2% when idle"
    test: "Run monitor on idle system"
    pass_criterion: "Monitor uses < 2% CPU"
    falsification: "Monitor uses >= 2% CPU"
    severity: medium
    points: 2

# =============================================================================
# SECTION 10: INTEGRATION (10 points)
# =============================================================================

integration:
  - id: H080
    hypothesis: "Repartir TUI model compatible"
    test: "Use repartir NodeStatus with this monitor"
    pass_criterion: "Data structures interoperable"
    falsification: "Type mismatches or crashes"
    severity: high
    points: 1

  - id: H081
    hypothesis: "Renacer chaos presets work"
    test: "Apply ChaosConfig::aggressive()"
    pass_criterion: "Memory limits applied"
    falsification: "No effect from chaos config"
    severity: medium
    points: 1

  - id: H082
    hypothesis: "Probar pixel tracker integration"
    test: "Use PixelCoverageTracker with TUI"
    pass_criterion: "Coverage data collected"
    falsification: "Tracker errors or incompatible"
    severity: high
    points: 1

  - id: H083
    hypothesis: "trueno-gpu metrics work"
    test: "Use trueno_gpu::device_info()"
    pass_criterion: "Returns valid DeviceInfo"
    falsification: "Error or wrong info"
    severity: critical
    points: 1

  - id: H084
    hypothesis: "OTLP export works (via renacer)"
    test: "Export metrics to Jaeger"
    pass_criterion: "Spans visible in Jaeger UI"
    falsification: "No spans or export errors"
    severity: medium
    points: 1

  - id: H085
    hypothesis: "JSON export produces valid JSON"
    test: "Export metrics, parse with serde_json"
    pass_criterion: "Valid JSON, all fields present"
    falsification: "Parse error or missing fields"
    severity: high
    points: 1

  - id: H086
    hypothesis: "Lambda Labs memory pressure compatible"
    test: "Use LAMBDA-0002 PressureLevel enum"
    pass_criterion: "Enum values match spec"
    falsification: "Incompatible enum values"
    severity: medium
    points: 1

  - id: H087
    hypothesis: "sysinfo crate compatibility"
    test: "Use sysinfo::System for CPU metrics"
    pass_criterion: "Metrics match sysinfo output"
    falsification: "Discrepancy with sysinfo"
    severity: high
    points: 1

  - id: H088
    hypothesis: "nvml-wrapper compatibility"
    test: "Use nvml-wrapper for NVIDIA metrics"
    pass_criterion: "All NVML calls succeed"
    falsification: "NVML errors"
    severity: critical
    points: 1

  - id: H089
    hypothesis: "rocm-smi-lib compatibility"
    test: "Use rocm-smi-lib for AMD metrics"
    pass_criterion: "All ROCm calls succeed"
    falsification: "ROCm errors"
    severity: critical
    points: 1

# =============================================================================
# SECTION 11: EDGE CASES (10 points)
# =============================================================================

edge_cases:
  - id: H090
    hypothesis: "Zero GPU systems work"
    test: "Run on system with no GPU"
    pass_criterion: "Shows CPU-only metrics"
    falsification: "Crash or missing CPU data"
    severity: critical
    points: 2

  - id: H091
    hypothesis: "100+ core systems work"
    test: "Run on 128-core server"
    pass_criterion: "All cores shown, no overflow"
    falsification: "Core count wrong or UI broken"
    severity: high
    points: 1

  - id: H092
    hypothesis: "1TB+ RAM systems work"
    test: "Run on system with 1.5TB RAM"
    pass_criterion: "Memory shown correctly"
    falsification: "Overflow or wrong values"
    severity: high
    points: 1

  - id: H093
    hypothesis: "4+ GPU systems work"
    test: "Run on 8-GPU server"
    pass_criterion: "All GPUs enumerated and shown"
    falsification: "Missing GPUs or UI overflow"
    severity: high
    points: 1

  - id: H094
    hypothesis: "Mixed NVIDIA+AMD works"
    test: "System with both vendors"
    pass_criterion: "Both GPU types detected"
    falsification: "One vendor missing"
    severity: high
    points: 1

  - id: H095
    hypothesis: "Docker container works"
    test: "Run in Docker with GPU passthrough"
    pass_criterion: "GPU visible inside container"
    falsification: "GPU not detected"
    severity: medium
    points: 1

  - id: H096
    hypothesis: "WSL2 (Windows Subsystem for Linux 2) works"
    test: "Run in WSL2 with GPU support"
    pass_criterion: "GPU visible via WSL"
    falsification: "GPU not detected"
    severity: medium
    points: 1

  - id: H097
    hypothesis: "Minimal terminal (80x24) works"
    test: "Run on exactly 80x24 terminal"
    pass_criterion: "All critical info visible"
    falsification: "Content cut off"
    severity: high
    points: 1

  - id: H098
    hypothesis: "Huge terminal (400x100) works"
    test: "Run on 400x100 terminal"
    pass_criterion: "Layout expands gracefully"
    falsification: "Excessive whitespace or crash"
    severity: low
    points: 0.5

  - id: H099
    hypothesis: "Non-ASCII locale works"
    test: "Run with LANG=ja_JP.UTF-8"
    pass_criterion: "No rendering issues"
    falsification: "Garbled text or crash"
    severity: medium
    points: 0.5

  - id: H100
    hypothesis: "Rapid metric changes handled"
    test: "Oscillate CPU 0-100% at 100Hz"
    pass_criterion: "Display updates without flicker or corrupted values"
    falsification: "Flickering or data corruption"
    severity: medium
    points: 1

# =============================================================================
# SUMMARY
# =============================================================================

summary:
  total_tests: 100
  total_points: 100
  passing_threshold: 85  # Must pass 85+ points
  critical_tests: 22     # Tests that MUST pass (severity: critical)

  section_weights:
    hardware_detection: 20
    memory_metrics: 20
    compute_metrics: 20
    data_flow: 15
    tui_rendering: 15
    stress_test: 10
    error_handling: 10
    pixel_coverage: 10
    performance: 10
    integration: 10
    edge_cases: 10
```
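
The summary rules above (an 85-point threshold plus critical tests that must pass) can be sketched as a small scoring function. This is a minimal sketch in Rust, not the suite's actual runner: the names `TestResult`, `Verdict`, and `evaluate` are hypothetical, and treating any failed critical test as an overall failure regardless of score is an assumption drawn from the `critical_tests` comment.

```rust
/// Severity levels mirroring the suite's `severity` field.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Severity {
    Critical,
    High,
    Medium,
    Low,
}

/// Outcome of one hypothesis (hypothetical struct, not defined by the spec).
struct TestResult {
    id: &'static str,
    severity: Severity,
    points: f64,
    passed: bool,
}

/// Overall verdict for a suite run.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// All critical tests passed and the score reached the threshold.
    Pass { score: f64 },
    /// All critical tests passed, but the score fell short.
    BelowThreshold { score: f64 },
    /// At least one critical test failed (assumed to fail the whole run).
    CriticalFailure { failed: Vec<&'static str> },
}

/// Sums the points of passing tests and applies both summary rules.
/// `threshold` corresponds to `passing_threshold` (85 in the summary).
fn evaluate(results: &[TestResult], threshold: f64) -> Verdict {
    // Rule 1: every critical test must pass.
    let failed: Vec<&'static str> = results
        .iter()
        .filter(|r| r.severity == Severity::Critical && !r.passed)
        .map(|r| r.id)
        .collect();
    if !failed.is_empty() {
        return Verdict::CriticalFailure { failed };
    }

    // Rule 2: points earned by passing tests must reach the threshold.
    let score: f64 = results
        .iter()
        .filter(|r| r.passed)
        .map(|r| r.points)
        .sum();
    if score >= threshold {
        Verdict::Pass { score }
    } else {
        Verdict::BelowThreshold { score }
    }
}
```

With `threshold` set to `85.0`, a run that passes every critical test and earns 85 or more points yields `Verdict::Pass`; a single failed critical test yields `Verdict::CriticalFailure` listing the offending IDs.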

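Edge case H092 (1 TB+ RAM) is falsified by overflow or wrong values, which can happen, for example, when byte counts are held in 32-bit integers. The sketch below keeps byte counts in `u64` and converts to floating point only when formatting. The function name `format_bytes` and the choice of binary units (KiB, MiB, and so on) are assumptions for illustration; the monitor's actual formatting code may differ.

```rust
/// Formats a byte count with binary units, two decimals above 1 KiB.
/// Hypothetical helper: e.g. `format_bytes(1_649_267_441_664)` (1.5 TiB,
/// a value that does not fit in `u32`) returns "1.50 TiB".
fn format_bytes(bytes: u64) -> String {
    const UNITS: [&str; 6] = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"];

    // Pick the largest unit that keeps the integer part below 1024.
    // The divisor stays in u64; its maximum here is 1024^5 = 2^50.
    let mut unit = 0;
    let mut divisor: u64 = 1;
    while unit + 1 < UNITS.len() && bytes / divisor >= 1024 {
        divisor *= 1024;
        unit += 1;
    }

    if unit == 0 {
        // Plain byte counts are printed exactly, without decimals.
        format!("{} B", bytes)
    } else {
        // Convert to f64 only at the formatting step.
        format!("{:.2} {}", bytes as f64 / divisor as f64, UNITS[unit])
    }
}
```
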
---

## 10. Peer-Reviewed Citations

### 10.1 Toyota Production System & Quality

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Liker2004] | Liker, J.K. (2004). *The Toyota Way: 14 Management Principles from the World's Greatest Manufacturer*. McGraw-Hill. ISBN 0-07-139231-9 | Iron Lotus Framework principles |
| [Ohno1988] | Ohno, T. (1988). *Toyota Production System: Beyond Large-Scale Production*. Productivity Press. ISBN 0-915299-14-3 | Muda (waste) elimination in telemetry |
| [Shingo1986] | Shingo, S. (1986). *Zero Quality Control: Source Inspection and the Poka-Yoke System*. Productivity Press. ISBN 0-915299-07-0 | Type-safe metrics prevent unit errors |
| [Womack1990] | Womack, J.P., Jones, D.T., Roos, D. (1990). *The Machine That Changed the World*. Free Press. ISBN 0-7432-9979-4 | Lean principles in software |

### 10.2 GPU Computing & Performance

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Volkov2008] | Volkov, V., Demmel, J.W. (2008). "Benchmarking GPUs to Tune Dense Linear Algebra". *SC '08: Proceedings of the 2008 ACM/IEEE Conference on Supercomputing*. DOI: 10.1109/SC.2008.5214359 | Tile size optimization (16x16), memory bandwidth modeling |
| [Harris2007] | Harris, M. (2007). "Optimizing Parallel Reduction in CUDA". *NVIDIA Developer Technology*. | Tiled reduction algorithm |
| [Nickolls2008] | Nickolls, J., Buck, I., Garland, M., Skadron, K. (2008). "Scalable Parallel Programming with CUDA". *ACM Queue*, 6(2), 40-53. DOI: 10.1145/1365490.1365500 | CUDA programming model |
| [Jia2018] | Jia, Z., Maggioni, M., Staiger, B., Scarpazza, D.P. (2018). "Dissecting the NVIDIA Volta GPU Architecture via Microbenchmarking". *arXiv:1804.06826* | GPU microarchitecture analysis |

### 10.3 Memory Systems & Pressure

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Hennessy2017] | Hennessy, J.L., Patterson, D.A. (2017). *Computer Architecture: A Quantitative Approach* (6th ed.). Morgan Kaufmann. ISBN 978-0128119051 | Memory hierarchy model |
| [McCalpin1995] | McCalpin, J.D. (1995). "STREAM: Sustainable Memory Bandwidth in High Performance Computers". *Technical Report*, University of Virginia. | Memory bandwidth benchmarking |
| [Drepper2007] | Drepper, U. (2007). "What Every Programmer Should Know About Memory". *Red Hat, Inc.* | Memory access patterns |

### 10.4 Testing & Falsification

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Popper1959] | Popper, K. (1959). *The Logic of Scientific Discovery*. Hutchinson. ISBN 0-415-27844-9 | Falsification test methodology |
| [Claessen2000] | Claessen, K., Hughes, J. (2000). "QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs". *ACM SIGPLAN Notices*, 35(9), 268-279. DOI: 10.1145/351240.351266 | Property-based testing foundation |
| [Regehr2012] | Regehr, J., Chen, Y., Cuoq, P., Eide, E., Ellison, C., Yang, X. (2012). "Test-Case Reduction for C Compiler Bugs". *PLDI '12*. DOI: 10.1145/2254064.2254104 | Test minimization |

### 10.5 Visual Quality Metrics

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Wang2004] | Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P. (2004). "Image Quality Assessment: From Error Visibility to Structural Similarity". *IEEE Transactions on Image Processing*, 13(4), 600-612. DOI: 10.1109/TIP.2003.819861 | SSIM metric for visual regression |
| [Sharma2005] | Sharma, G., Wu, W., Dalal, E.N. (2005). "The CIEDE2000 Color-Difference Formula: Implementation Notes, Supplementary Test Data, and Mathematical Observations". *Color Research & Application*, 30(1), 21-30. DOI: 10.1002/col.20070 | CIEDE2000 color difference |

### 10.6 Distributed Systems & Monitoring

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Blumofe1999] | Blumofe, R.D., Leiserson, C.E. (1999). "Scheduling Multithreaded Computations by Work Stealing". *Journal of the ACM*, 46(5), 720-748. DOI: 10.1145/324133.324234 | Work-stealing scheduler in repartir |
| [Burns2016] | Burns, B. (2018). *Designing Distributed Systems*. O'Reilly Media. ISBN 978-1491983645 | Distributed monitoring patterns |
| [Beyer2016] | Beyer, B., Jones, C., Petoff, J., Murphy, N.R. (2016). *Site Reliability Engineering*. O'Reilly Media. ISBN 978-1491929124 | SRE monitoring principles |

### 10.7 Anomaly Detection & Statistics

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Liu2008] | Liu, F.T., Ting, K.M., Zhou, Z.H. (2008). "Isolation Forest". *2008 Eighth IEEE International Conference on Data Mining*. DOI: 10.1109/ICDM.2008.17 | Jidoka anomaly detection algorithm |
| [Anscombe1973] | Anscombe, F.J. (1973). "Graphs in Statistical Analysis". *The American Statistician*, 27(1), 17-21. | Importance of visualization (Anscombe's quartet) |

### 10.8 Data Visualization

| Citation | Reference | Application |
|----------|-----------|-------------|
| [Tufte2006] | Tufte, E.R. (2006). *Beautiful Evidence*. Graphics Press. ISBN 0-9613921-7-7 | Sparklines theory and design |
| [Gregg2013] | Gregg, B. (2013). *Systems Performance: Enterprise and the Cloud*. Prentice Hall. ISBN 978-0133390094 | USE Method (Utilization, Saturation, Errors) |

---

## 11. Implementation Roadmap

### Phase 1: Core Infrastructure (2 weeks)

- [ ] Implement `ComputeDevice` trait
- [ ] Add NVIDIA GPU backend (nvml-wrapper)
- [ ] Add AMD GPU backend (rocm-smi-lib)
- [ ] Add CPU backend (sysinfo)
- [ ] Implement unified telemetry collector
- [ ] Write 40 falsification tests (H001-H040)

### Phase 2: TUI Implementation (2 weeks)

- [ ] Create TUI layout with presentar
- [ ] Implement gauge, sparkline, progress widgets
- [ ] Add keyboard navigation
- [ ] Implement help overlay
- [ ] Write 15 falsification tests (H041-H055)

### Phase 3: Stress Test Mode (1 week)

- [ ] Implement CPU stress worker
- [ ] Implement GPU stress worker (trueno BatchedGemmKernel)
- [ ] Implement memory stress worker
- [ ] Implement PCIe stress worker
- [ ] Add chaos integration (renacer)
- [ ] Write 9 falsification tests (H056-H064)

### Phase 4: Pixel Testing Integration (1 week)

- [ ] Integrate probar PixelCoverageTracker
- [ ] Add visual regression tests
- [ ] Create golden master images
- [ ] Implement heatmap export
- [ ] Write 15 falsification tests (H065-H079)

### Phase 5: Integration & Polish (1 week)

- [ ] Integrate with repartir TUI model
- [ ] Add OTLP export (via renacer)
- [ ] Add JSON export
- [ ] Performance optimization
- [ ] Write 21 falsification tests (H080-H100)

### Phase 6: QA Validation (1 week)

- [ ] Run full 100-point falsification suite
- [ ] Fix any failing tests
- [ ] Generate coverage reports
- [ ] Perform mutation testing
- [ ] Documentation review

---

## Appendix A: Glossary

| Term | Definition |
|------|------------|
| **CU** | Compute Unit (AMD counterpart to NVIDIA's SM) |
| **FLOPS** | Floating-Point Operations Per Second |
| **H2D** | Host-to-Device (CPU→GPU transfer) |
| **D2H** | Device-to-Host (GPU→CPU transfer) |
| **NVML** | NVIDIA Management Library |
| **PCIe** | Peripheral Component Interconnect Express |
| **ROCm** | Radeon Open Compute (AMD GPU platform) |
| **SM** | Streaming Multiprocessor (NVIDIA terminology) |
| **SSIM** | Structural Similarity Index Measure |
| **VRAM** | Video Random Access Memory (GPU memory) |

---

## Appendix B: File Structure

```
trueno/
├── src/
│   └── bin/
│       └── trueno-monitor.rs      # TUI binary
├── trueno-gpu/
│   └── src/
│       ├── monitor/
│       │   ├── mod.rs             # Monitor module
│       │   ├── nvidia.rs          # NVIDIA backend
│       │   ├── amd.rs             # AMD backend
│       │   ├── cpu.rs             # CPU backend
│       │   └── unified.rs         # Unified collector
│       ├── tui/
│       │   ├── mod.rs             # TUI module
│       │   ├── layout.rs          # Layout spec
│       │   ├── widgets.rs         # Custom widgets
│       │   └── render.rs          # Render logic
│       └── stress/
│           ├── mod.rs             # Stress test module
│           ├── cpu.rs             # CPU stress
│           ├── gpu.rs             # GPU stress
│           ├── memory.rs          # Memory stress
│           └── pcie.rs            # PCIe stress
├── docs/
│   └── specifications/
│       └── tui-compute-mode-flow-cpu-memory.md  # This document
└── tests/
    └── falsification/
        ├── tui-compute-monitor.yaml  # 100-point test suite
        └── pixel_coverage/
            ├── golden_master.png     # Reference image
            └── coverage_tests.rs     # Pixel tests
```

---

**Document History:**

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0.0 | 2026-01-03 | PAIML Engineering | Initial specification |

**Approval:**

- [ ] Engineering Lead
- [ ] QA Team Lead
- [ ] Product Owner

---

*This specification is validated by 100-point Popperian falsification testing and integrated with Probador pixel-by-pixel coverage analysis.*