pub fn decode_byte_level_token(raw_token: &str) -> Vec<u8>
Decodes a byte-level BPE token (e.g. "Ġhello", which stands for the
bytes of " hello") to its raw bytes by reversing the GPT-2 byte→unicode
table. Characters outside the table fall back to their UTF-8 bytes.
This fallback is defensive and should not be reached for valid vocab
entries.
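
The reversal described above can be sketched as follows. This is an illustrative reconstruction, not this crate's actual implementation: the names `unicode_to_byte_table` and `decode_byte_level_token_sketch` are hypothetical, and the table layout assumes the standard GPT-2 scheme, in which the printable Latin-1 bytes map to themselves and the remaining 68 bytes are assigned code points from U+0100 upward in byte order.

```rust
use std::collections::HashMap;

/// Hypothetical helper: builds the reverse (char -> byte) of the GPT-2
/// byte→unicode table. Assumes the standard GPT-2 layout.
fn unicode_to_byte_table() -> HashMap<char, u8> {
    let mut table = HashMap::with_capacity(256);
    // Counts how many non-printable bytes have been remapped so far.
    let mut next_offset: u32 = 0;
    for byte in 0u8..=255 {
        // Bytes in these ranges are represented by the char with the same code point.
        let maps_to_itself = matches!(byte, 0x21..=0x7E | 0xA1..=0xAC | 0xAE..=0xFF);
        let code_point = if maps_to_itself {
            u32::from(byte)
        } else {
            // Every other byte gets the next free code point starting at U+0100.
            let cp = 256 + next_offset;
            next_offset += 1;
            cp
        };
        let ch = char::from_u32(code_point).expect("code point is a valid scalar value");
        table.insert(ch, byte);
    }
    table
}

/// Hypothetical stand-in for `decode_byte_level_token`; rebuilds the table on
/// every call for simplicity, which a real implementation would likely cache.
fn decode_byte_level_token_sketch(raw_token: &str) -> Vec<u8> {
    let table = unicode_to_byte_table();
    let mut bytes = Vec::with_capacity(raw_token.len());
    for ch in raw_token.chars() {
        match table.get(&ch) {
            // Character is in the table: emit the single byte it stands for.
            Some(&b) => bytes.push(b),
            // Defensive fallback: emit the character's own UTF-8 encoding.
            None => {
                let mut buf = [0u8; 4];
                bytes.extend_from_slice(ch.encode_utf8(&mut buf).as_bytes());
            }
        }
    }
    bytes
}

fn main() {
    let bytes = decode_byte_level_token_sketch("Ġhello");
    println!("{:?}", String::from_utf8_lossy(&bytes));
}
```

Under these assumptions, "Ġ" (U+0120) reverses to byte 0x20, so "Ġhello" yields the bytes of " hello". The result is a `Vec<u8>` rather than a `String` because a single token may hold only part of a multi-byte UTF-8 sequence.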