Shared base types for vision-language and omni runners (PLAN.md M7).
rlx-qwen3-vl, rlx-lfm-vl, and rlx-nemotron-omni all need the
same shape of plumbing: a per-image preprocessor (resize +
patchify), a vision-tower trait, an MLP projector trait, and a
multimodal turn interleaver that merges image, text, and
(optionally) audio segments into a single LM token stream. This
crate hosts those traits so the family crates stay thin.
Status: TYPE SKELETON. The traits and supporting structs are in place; implementations land alongside the per-family crates as M7 progresses.
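
As a rough illustration of the four pieces listed above, here is a minimal sketch of what such traits could look like. Every name in it (`Patches`, `ImagePreprocessor`, `VisionTower`, `Projector`, `Segment`, `SquarePatchify`, `interleave`) is hypothetical and chosen for this example, not taken from the crate. The toy preprocessor skips resizing and assumes the image already tiles evenly, and the interleaver assumes each image or audio segment is represented by a run of placeholder token ids whose embeddings are swapped in later.

```rust
/// Row-major patch matrix: `num_patches` rows of `patch_dim` values each.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, PartialEq)]
pub struct Patches {
    pub num_patches: usize,
    pub patch_dim: usize,
    pub data: Vec<f32>,
}

/// Per-image preprocessor (resize + patchify).
pub trait ImagePreprocessor {
    /// `pixels` is HWC, row-major, of length `height * width * channels`.
    fn preprocess(&self, pixels: &[f32], height: usize, width: usize, channels: usize) -> Patches;
}

/// Vision tower: patches in, one feature vector per patch out (flattened).
pub trait VisionTower {
    fn encode(&self, patches: &Patches) -> Vec<f32>;
}

/// MLP projector: vision features in, LM-width embeddings out (flattened).
pub trait Projector {
    fn project(&self, features: &[f32]) -> Vec<f32>;
}

/// One piece of a multimodal turn.
pub enum Segment {
    Text(Vec<u32>),
    Image { num_tokens: usize },
    Audio { num_tokens: usize },
}

/// Toy preprocessor: no resize, just non-overlapping square patches.
pub struct SquarePatchify {
    pub patch: usize,
}

impl ImagePreprocessor for SquarePatchify {
    fn preprocess(&self, pixels: &[f32], height: usize, width: usize, channels: usize) -> Patches {
        let p = self.patch;
        assert!(p > 0 && height % p == 0 && width % p == 0, "image must tile evenly");
        assert_eq!(pixels.len(), height * width * channels, "pixel buffer size mismatch");
        let (grid_h, grid_w) = (height / p, width / p);
        let mut data = Vec::with_capacity(pixels.len());
        for gy in 0..grid_h {
            for gx in 0..grid_w {
                for py in 0..p {
                    // Each patch row is one contiguous run of `p * channels` values.
                    let start = ((gy * p + py) * width + gx * p) * channels;
                    data.extend_from_slice(&pixels[start..start + p * channels]);
                }
            }
        }
        Patches { num_patches: grid_h * grid_w, patch_dim: p * p * channels, data }
    }
}

/// Flattens a turn into one token stream, emitting `num_tokens` placeholder
/// ids for every image or audio segment.
pub fn interleave(segments: &[Segment], image_pad: u32, audio_pad: u32) -> Vec<u32> {
    let mut out = Vec::new();
    for seg in segments {
        match seg {
            Segment::Text(ids) => out.extend_from_slice(ids),
            Segment::Image { num_tokens } => {
                out.extend(std::iter::repeat(image_pad).take(*num_tokens))
            }
            Segment::Audio { num_tokens } => {
                out.extend(std::iter::repeat(audio_pad).take(*num_tokens))
            }
        }
    }
    out
}
```

Keeping the preprocessor, tower, and projector as separate traits would let each family crate swap in its own implementation of one stage without touching the others; whether the actual crate splits responsibilities this way is not stated in the description above.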