
Function model_name_suggests_vision 

pub fn model_name_suggests_vision(name: &str) -> bool

Heuristic: does this model name look like a vision-capable model?

Used by the TUI’s Ctrl+V image-paste handler to refuse attaching an image when the active model almost certainly can’t accept it (e.g. glm-5.1, deepseek-v4-flash, qwen3-coder). Without this gate, the user wastes a turn on a 400 from the upstream provider. The failure pattern that surfaced in production was: ModelArts.81001 message[3].content[0] has invalid field(s): text, type.

Also used by vision_preprocessor::maybe_preprocess to decide whether the active main provider needs preprocessing (a vision-capable model skips it), and by coding_plan::setup to auto-pick a VL preprocessor from the AtomGit model list.

“OCR” is included because OCR-on-VLM endpoints (PaddleOCR-VL, GOT-OCR, MonkeyOCR, etc.) accept image input via the same OpenAI-compatible image_url schema and are first-class candidates for the vision-preprocessor role.

Conservative: only well-known vision/OCR patterns match. False negatives are safe. When a new vision/OCR model ships, extend this list rather than threading a per-provider config knob (no user-discoverable opt-in exists). False positives waste a turn on a 400, so when in doubt this returns false.
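
The conservative matching described above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: the function name looks_like_vision_model and the marker list ("vl", "vision", "ocr") are assumptions chosen for the example, and the real pattern list may differ. The sketch splits the name into alphanumeric tokens so that short markers such as "vl" only match as whole tokens.

```rust
/// Hypothetical stand-in for the heuristic. The markers checked here are
/// assumptions for illustration only; they are not the crate's actual list.
pub fn looks_like_vision_model(name: &str) -> bool {
    let lower = name.to_ascii_lowercase();
    // Split on anything that is not an ASCII letter or digit, so that
    // "PaddleOCR-VL" becomes the tokens ["paddleocr", "vl"].
    lower
        .split(|c: char| !c.is_ascii_alphanumeric())
        .filter(|tok| !tok.is_empty())
        // "vl" and "vision" must be whole tokens; "ocr" may appear inside
        // a token, which covers names like "MonkeyOCR".
        .any(|tok| tok == "vl" || tok == "vision" || tok.contains("ocr"))
}
```

Under these assumed markers, the text-only names mentioned above (glm-5.1, deepseek-v4-flash, qwen3-coder) are rejected, while PaddleOCR-VL, GOT-OCR, and MonkeyOCR are accepted. Any name that carries no recognized marker falls through to false, which matches the "when in doubt this returns false" policy.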