Module showcase

GPU Inference Showcase with PMAT verification (PAR-040)

Benchmark harness for the Qwen2.5-Coder showcase, targeting >2x the performance of llama.cpp and Ollama via:

trueno GPU PTX generation (persistent kernels, megakernels)
trueno SIMD (AVX2/AVX-512/NEON)
trueno-zram KV cache compression
renacer GPU kernel profiling

§Performance Targets (Point 41)

Engine	Target	Mechanism
APR GGUF	>2x llama.cpp	Phase 2 GPU optimizations
APR .apr	>2x Ollama	Native format + ZRAM
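
As a rough illustration of how a ">2x" target in the table above could be checked, the sketch below computes a speedup ratio from two throughput figures and compares it with a target multiple. The function names `speedup` and `meets_target` are hypothetical and not part of this module, and tokens per second is an assumed metric, since this page does not say which measurement the targets refer to.

```rust
/// Ratio of candidate throughput to baseline throughput.
/// Assumption: both values are tokens per second (not stated in the docs).
/// Returns `None` when the inputs cannot yield a meaningful ratio.
fn speedup(candidate_tps: f64, baseline_tps: f64) -> Option<f64> {
    if candidate_tps.is_finite() && baseline_tps.is_finite() && baseline_tps > 0.0 {
        Some(candidate_tps / baseline_tps)
    } else {
        None
    }
}

/// True only when the speedup strictly exceeds `target`
/// (for example 2.0 for a ">2x" row in the table).
fn meets_target(candidate_tps: f64, baseline_tps: f64, target: f64) -> bool {
    speedup(candidate_tps, baseline_tps).map_or(false, |s| s > target)
}
```

The comparison is strict, so a run that reaches exactly 2.0x does not count as meeting a ">2x" target.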

§Usage

# Run full showcase benchmark
cargo run --example showcase_benchmark --features cuda

# Run with profiling
renacer trace -- cargo run --example showcase_benchmark --features cuda

Modules§

profiler: Stub module when renacer is not available
zram: Stub module when trueno-zram-core is not available
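
The stub modules listed above suggest a common Rust pattern: a Cargo feature selects between a real implementation and a no-op stand-in with the same interface. The following is a minimal sketch of that pattern, not this crate's actual code. The feature name `gpu-profiling`, the module name `profiler_sketch`, and both functions are hypothetical.

```rust
// Hypothetical feature name; the crate's real feature flags are not shown on this page.
#[cfg(feature = "gpu-profiling")]
pub mod profiler_sketch {
    /// Identifies which variant was compiled in.
    pub fn backend() -> &'static str {
        "real"
    }

    /// A real build would forward to the external profiler here.
    pub fn record_kernel(_name: &str, _micros: u64) -> bool {
        true
    }
}

// Compiled when the feature is off: same signatures, no external dependency.
#[cfg(not(feature = "gpu-profiling"))]
pub mod profiler_sketch {
    /// Identifies which variant was compiled in.
    pub fn backend() -> &'static str {
        "stub"
    }

    /// No-op: reports that nothing was recorded.
    pub fn record_kernel(_name: &str, _micros: u64) -> bool {
        false
    }
}
```

Because both variants expose identical signatures, calling code compiles unchanged whether or not the optional dependency is present.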

Structs§

BenchmarkResult: Single benchmark run result
BenchmarkStats: Aggregated benchmark statistics
ComponentTiming: Component timing for profiling
PmatVerification: PMAT verification result
ProfilingCollector: Profiling collector for GPU kernel analysis
ShowcaseConfig: Showcase benchmark configuration
ShowcaseRunner: Main showcase benchmark runner
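
To show how single-run results might roll up into aggregated statistics, here is a self-contained sketch. `RunSample`, `RunSummary`, and `summarize` are hypothetical stand-ins. They do not reproduce the fields or methods of `BenchmarkResult` or `BenchmarkStats`, which this page does not list.

```rust
use std::time::Duration;

/// Hypothetical single-run record: tokens generated and wall-clock time.
#[derive(Debug, Clone, Copy)]
struct RunSample {
    tokens: usize,
    elapsed: Duration,
}

impl RunSample {
    /// Throughput in tokens per second (assumes a non-zero elapsed time).
    fn tokens_per_sec(&self) -> f64 {
        self.tokens as f64 / self.elapsed.as_secs_f64()
    }
}

/// Hypothetical aggregate over several runs.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RunSummary {
    runs: usize,
    mean_tps: f64,
    min_tps: f64,
    max_tps: f64,
    stddev_tps: f64,
}

/// Aggregates per-run throughput; returns `None` for an empty slice.
fn summarize(samples: &[RunSample]) -> Option<RunSummary> {
    if samples.is_empty() {
        return None;
    }
    let tps: Vec<f64> = samples.iter().map(RunSample::tokens_per_sec).collect();
    let n = tps.len() as f64;
    let mean = tps.iter().sum::<f64>() / n;
    // Population variance: divide by n, not n - 1.
    let variance = tps.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    Some(RunSummary {
        runs: tps.len(),
        mean_tps: mean,
        min_tps: tps.iter().copied().fold(f64::INFINITY, f64::min),
        max_tps: tps.iter().copied().fold(f64::NEG_INFINITY, f64::max),
        stddev_tps: variance.sqrt(),
    })
}
```

Reporting the spread alongside the mean makes it easier to judge whether a measured speedup is stable across runs or driven by a single outlier.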
