Phi-3 / Phi-4 runner.
The GGUF converters for Phi-3 and Phi-4 both write
general.architecture = phi3: Phi-4 reuses the Phi-3 arch tag
upstream, and llama.cpp has no separate phi4 enum. This crate is a
thin wrapper over [rlx_llama32::Llama32Runner] that adds arch validation.
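
The arch check described above can be sketched roughly as follows. This is a minimal illustration, not this crate's actual code: `ArchError`, `EXPECTED_ARCH`, and `validate_arch` are hypothetical names, and the GGUF metadata is assumed to have already been read into a plain string map.

```rust
use std::collections::HashMap;

/// Hypothetical error type for this sketch; not the crate's real error.
#[derive(Debug, PartialEq)]
pub enum ArchError {
    /// The `general.architecture` key is absent from the metadata.
    Missing,
    /// The key is present but names a different architecture.
    Unexpected(String),
}

/// Arch tag written by the Phi-3 and Phi-4 GGUF converters.
pub const EXPECTED_ARCH: &str = "phi3";

/// Accepts the metadata only if `general.architecture` is `phi3`.
/// Assumes the metadata was already parsed into string key/value pairs.
pub fn validate_arch(metadata: &HashMap<String, String>) -> Result<(), ArchError> {
    match metadata.get("general.architecture").map(String::as_str) {
        Some(EXPECTED_ARCH) => Ok(()),
        Some(other) => Err(ArchError::Unexpected(other.to_string())),
        None => Err(ArchError::Missing),
    }
}
```

Because Phi-4 files carry the same tag, one check like this would accept both model families, while a file tagged with any other architecture would be rejected before the wrapped runner is used.
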
Caveat: rlx-llama32 does not yet implement Phi-3's per-layer
LayerNorm placement or partial-RoPE split, so runs produce tokens
but the output won't match the upstream reference until those
land. Tracked as a PLAN.md M4 follow-up.
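
For readers unfamiliar with the term, a partial-RoPE split generally means rotating only the first `rotary_dim` elements of each attention head and passing the rest through unchanged. The sketch below illustrates that idea only. The function name `apply_partial_rope`, the rotate-half pairing (element `i` with element `i + rotary_dim / 2`), and the frequency base are assumptions; none of them is confirmed to match Phi-3's reference implementation or the eventual rlx-llama32 code.

```rust
/// Hypothetical helper: applies rotary position embedding to the first
/// `rotary_dim` elements of one attention head and leaves the tail as-is.
/// Pairing convention (assumed): element `i` rotates with `i + rotary_dim / 2`.
pub fn apply_partial_rope(head: &mut [f32], rotary_dim: usize, pos: usize, theta_base: f32) {
    assert!(rotary_dim % 2 == 0, "rotary_dim must be even");
    assert!(rotary_dim <= head.len(), "rotary_dim exceeds head size");

    let half = rotary_dim / 2;
    for i in 0..half {
        // Per-pair frequency: theta_base^(-2i / rotary_dim).
        let freq = theta_base.powf(-2.0 * i as f32 / rotary_dim as f32);
        let (sin, cos) = (pos as f32 * freq).sin_cos();

        // 2-D rotation of the pair (head[i], head[i + half]).
        let (a, b) = (head[i], head[i + half]);
        head[i] = a * cos - b * sin;
        head[i + half] = a * sin + b * cos;
    }
    // Elements at index >= rotary_dim are intentionally not touched.
}
```

A runner that rotates the full head where the model expects only a partial rotation would encode positions differently from the reference, which is one way output can diverge from upstream.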