pub fn parse_boxes(text: &str) -> Vec<BoundingBox>
Parse Qwen2.5-VL grounding spans out of a model text response.
Recognizes the upstream delimiter pairs
(<|object_ref_start|>...<|object_ref_end|> followed by
<|box_start|>(x1,y1),(x2,y2)<|box_end|>) and returns one
BoundingBox per <|box_*|> span. Returns an empty vec when
the output contains no boxes — typical for “just describe”
prompts.
Robust to the common variant where the model emits a bare
<|box_*|> span without a preceding label (the box gets an empty
label), and to extra whitespace inside the coordinate tuple.
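
The behaviour described above can be illustrated with a minimal, standard-library-only sketch. It is not this crate's implementation: `SketchBox`, `parse_boxes_sketch`, `parse_coords`, and `label_before` are hypothetical names, and the integer pixel coordinates and the choice to skip spans with malformed coordinates are assumptions rather than documented behaviour.

```rust
/// Hypothetical stand-in for the crate's `BoundingBox`; the real fields may differ.
#[derive(Debug, PartialEq)]
pub struct SketchBox {
    pub label: String,
    pub x1: i32,
    pub y1: i32,
    pub x2: i32,
    pub y2: i32,
}

const REF_START: &str = "<|object_ref_start|>";
const REF_END: &str = "<|object_ref_end|>";
const BOX_START: &str = "<|box_start|>";
const BOX_END: &str = "<|box_end|>";

/// Parse "(x1,y1),(x2,y2)", tolerating whitespace anywhere inside the tuple.
/// Assumption: coordinates are integers. This sketch is lenient about the
/// parentheses themselves and only requires exactly four numbers.
fn parse_coords(s: &str) -> Option<[i32; 4]> {
    let mut out = [0i32; 4];
    let mut n = 0;
    // Treat parentheses like commas, then keep only the non-empty pieces.
    for part in s.split(|c: char| c == '(' || c == ')' || c == ',') {
        let part = part.trim();
        if part.is_empty() {
            continue;
        }
        if n == 4 {
            return None; // more than four numbers
        }
        out[n] = part.parse().ok()?;
        n += 1;
    }
    if n == 4 {
        Some(out)
    } else {
        None
    }
}

/// Return the label of an object_ref span that ends directly before the box
/// (only whitespace in between), or an empty string for a bare box.
fn label_before(before: &str) -> String {
    let inner = match before.trim_end().strip_suffix(REF_END) {
        Some(inner) => inner,
        None => return String::new(),
    };
    match inner.rfind(REF_START) {
        Some(i) => inner[i + REF_START.len()..].trim().to_string(),
        None => String::new(),
    }
}

pub fn parse_boxes_sketch(text: &str) -> Vec<SketchBox> {
    let mut boxes = Vec::new();
    let mut rest = text;
    while let Some(start) = rest.find(BOX_START) {
        let before = &rest[..start];
        let after = &rest[start + BOX_START.len()..];
        let end = match after.find(BOX_END) {
            Some(end) => end,
            None => break, // unterminated span: stop scanning
        };
        // Assumption: spans whose coordinates do not parse are skipped.
        if let Some([x1, y1, x2, y2]) = parse_coords(&after[..end]) {
            let label = label_before(before);
            boxes.push(SketchBox { label, x1, y1, x2, y2 });
        }
        // Continue after this span, so a label is never reused by a later box.
        rest = &after[end + BOX_END.len()..];
    }
    boxes
}
```

In this sketch a labelled span yields a box carrying that label, a bare `<|box_start|>...<|box_end|>` span yields a box with an empty label, and text without any box span yields an empty `Vec`.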