Module ctype

Lua ctype — character-classification table and predicates.

Ported from reference/lua-5.4.7/src/lctype.c and lctype.h.

Lua ships its own ctype replacements, optimised for its specific needs. They do not match the standard C <ctype.h> semantics exactly. In particular, lislalpha and lislalnum treat '_' as alphabetic, and the table is seeded for ASCII byte ranges only: the high bytes (0x80-0xFF) stay at 0x00 unless LUA_UCID is enabled (see PORT NOTE below).

On ASCII targets (LUA_USE_CTYPE=0, the default) the implementation is a 257-entry byte lookup table. Each entry is a bitfield:

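| bit | name      | meaning                                          |
|-----|-----------|--------------------------------------------------|
| 0   | ALPHABIT  | Lua-alphabetic: ASCII letters plus _             |
| 1   | DIGITBIT  | decimal digit 0-9                                |
| 2   | PRINTBIT  | printable (graph + space)                        |
| 3   | SPACEBIT  | whitespace (ASCII space, TAB, LF, VT, FF, CR)    |
| 4   | XDIGITBIT | hex digit 0-9, A-F, a-f                          |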
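As a rough illustration of the bit layout above, the flag byte for a single ASCII byte could be computed as follows. This is a sketch only: `mask` and `classify` are hypothetical helpers introduced here, and the actual port may simply hard-code the table as lctype.c does.

```rust
// Bit positions, following the table above.
const ALPHABIT: u32 = 0;
const DIGITBIT: u32 = 1;
const PRINTBIT: u32 = 2;
const SPACEBIT: u32 = 3;
const XDIGITBIT: u32 = 4;

/// Turn a bit position into a single-bit mask (hypothetical helper).
const fn mask(bit: u32) -> u8 {
    1u8 << bit
}

/// Compute the flag byte for one byte value (hypothetical helper).
/// Non-ASCII bytes get 0x00, matching the default NONA = 0x00 path.
const fn classify(b: u8) -> u8 {
    let mut flags = 0u8;
    // Lua-alphabetic: ASCII letters plus underscore.
    if b.is_ascii_alphabetic() || b == b'_' {
        flags |= mask(ALPHABIT);
    }
    if b.is_ascii_digit() {
        flags |= mask(DIGITBIT);
    }
    // Printable: graphic characters plus the space character.
    if b.is_ascii_graphic() || b == b' ' {
        flags |= mask(PRINTBIT);
    }
    // Whitespace: space, TAB, LF, VT, FF, CR.
    if matches!(b, b' ' | b'\t' | b'\n' | 0x0B | 0x0C | b'\r') {
        flags |= mask(SPACEBIT);
    }
    if b.is_ascii_hexdigit() {
        flags |= mask(XDIGITBIT);
    }
    flags
}
```

Under this sketch a letter that is also a hex digit, such as `a`, carries three bits at once (ALPHABIT, PRINTBIT, XDIGITBIT), while `_` carries only ALPHABIT and PRINTBIT.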

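test_prop(c, mask) indexes the table as CTYPE_TABLE[(c + 1) as usize]. The +1 offset maps c = -1 (the EOZ end-of-stream sentinel) to index 0 and the byte values 0-255 to indices 1-256, so EOZ can be looked up without underflow; this is also why the table has 257 entries rather than 256.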
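The offset lookup can be sketched as below. The parameter types (`i32` for `c`, `u8` for the mask), the `mask` and `build_table` helpers, and the reduced table that seeds only ALPHABIT and DIGITBIT are assumptions made to keep the example short; they are not taken from the ported source.

```rust
/// End-of-stream sentinel, as in Lua's lzio.h.
const EOZ: i32 = -1;

const ALPHABIT: u32 = 0;
const DIGITBIT: u32 = 1;

/// Turn a bit position into a single-bit mask (hypothetical helper).
const fn mask(bit: u32) -> u8 {
    1u8 << bit
}

/// Build a reduced 257-entry table (hypothetical helper).
/// Slot 0 belongs to EOZ; slot b + 1 belongs to byte value b.
const fn build_table() -> [u8; 257] {
    let mut table = [0u8; 257];
    let mut b = 0usize;
    while b < 128 {
        let ch = b as u8;
        let mut flags = 0u8;
        if ch.is_ascii_alphabetic() || ch == b'_' {
            flags |= mask(ALPHABIT);
        }
        if ch.is_ascii_digit() {
            flags |= mask(DIGITBIT);
        }
        table[b + 1] = flags;
        b += 1;
    }
    table
}

static CTYPE_TABLE: [u8; 257] = build_table();

/// Look up `c` (a byte value 0..=255, or EOZ) and test it against `m`.
fn test_prop(c: i32, m: u8) -> bool {
    (CTYPE_TABLE[(c + 1) as usize] & m) != 0
}

fn lislalpha(c: i32) -> bool {
    test_prop(c, mask(ALPHABIT))
}

fn lislalnum(c: i32) -> bool {
    test_prop(c, mask(ALPHABIT) | mask(DIGITBIT))
}
```

With this layout, `lislalpha(EOZ)` reads slot 0 and returns false without any special-casing, and a value below -1 would panic on the bounds check rather than read out of range.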

PORT NOTE: The C code supports a compile-time LUA_UCID flag that sets the non-ASCII bytes (0x80-0xFF, minus the byte values that can never appear in valid UTF-8) to ALPHABIT so that Unicode identifiers are recognised. That path (NONA = 0x01) is not translated here; only the default NONA = 0x00 path is ported. It can be enabled in Phase B by introducing a Cargo feature flag.
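One possible shape for that Phase B switch is sketched below. The feature name `ucid` and the helpers `never_in_utf8` and `non_ascii_flags` are hypothetical; the feature would also need a matching entry under `[features]` in Cargo.toml. The set of excluded byte values (0xC0, 0xC1, 0xF5-0xFF) is an assumption about what the C table leaves at 0x00 and should be checked against lctype.c before use.

```rust
/// Flag given to non-ASCII bytes: ALPHABIT (0x01) when the hypothetical
/// `ucid` feature is enabled, nothing (0x00) otherwise.
const NONA: u8 = if cfg!(feature = "ucid") { 0x01 } else { 0x00 };

/// Byte values that never occur in well-formed UTF-8 (assumed exclusion set).
const fn never_in_utf8(b: u8) -> bool {
    matches!(b, 0xC0 | 0xC1 | 0xF5..=0xFF)
}

/// Table entry for a non-ASCII byte (hypothetical helper).
/// Callers are expected to pass only values in 0x80..=0xFF.
fn non_ascii_flags(b: u8) -> u8 {
    debug_assert!(b >= 0x80, "only meaningful for non-ASCII bytes");
    if never_in_utf8(b) {
        0x00
    } else {
        NONA
    }
}
```

Keeping NONA as a single constant mirrors the C macro, so the default build stays byte-for-byte identical to the NONA = 0x00 path and the feature only changes the value stored in the high half of the table.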