pub fn gemm_flops(m: u64, n: u64, k: u64) -> f64
Calculates the theoretical floating-point operation count (FLOPs) for a GEMM operation.
FLOPs = 2 * M * N * K (each multiply-add, e.g. one FMA, counts as two operations)
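
The body of `gemm_flops` is not shown here, so the following is only a minimal sketch of how the formula could be implemented, assuming the usual GEMM shapes (A is M x K, B is K x N, C is M x N). The helper `gemm_gflops_per_sec` and its `seconds` parameter are hypothetical and are included only to illustrate how the count is typically turned into a throughput figure.

```rust
/// Sketch: theoretical floating-point operation count for C = A * B,
/// where A is M x K and B is K x N. Each of the M * N * K multiply-adds
/// is counted as two operations.
pub fn gemm_flops(m: u64, n: u64, k: u64) -> f64 {
    // Convert to f64 before multiplying so that large dimensions
    // cannot overflow u64 arithmetic.
    2.0 * (m as f64) * (n as f64) * (k as f64)
}

/// Hypothetical helper (not part of the documented API): achieved
/// GFLOP/s given a measured wall-clock time in seconds.
pub fn gemm_gflops_per_sec(m: u64, n: u64, k: u64, seconds: f64) -> f64 {
    gemm_flops(m, n, k) / seconds / 1e9
}
```

For example, `gemm_flops(1024, 1024, 1024)` evaluates to 2 * 1024^3 = 2147483648 operations. Returning `f64` keeps the result directly usable in rate calculations, at the cost of exactness once the count exceeds 2^53.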