Lexical analyzer — port of llex.c + llex.h.
Provides the Lua 5.4 lexer: character-by-character scanning of a ZIO
input stream into Token values, with one-token lookahead. The
llex.h header is merged here per PORTING.md §1.
§C source files
- reference/lua-5.4.7/src/llex.c (581 lines, 24 functions)
- reference/lua-5.4.7/src/llex.h (91 lines; merged here)
§Design notes
- LexState.L (back-pointer to lua_State) is removed. All functions that need LuaState receive it as state: &mut LuaState.
- Token.token is i32 in Phase A (matching the C int token field). Single-byte tokens are their ASCII values; reserved-word tokens start at FIRST_RESERVED (257). A proper TokenKind enum is deferred to Phase B.
- save / save_and_next are now fallible (Result<(), LuaError>); the ? operator replaces the C noreturn lexerror call on buffer overflow.
- The goto read_save / only_save / no_save pattern in read_string is translated via the local EscapeResult enum.
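The last two notes can be illustrated with a minimal sketch. It is not the port's code: MiniLexer, LexError, the max_len limit and the handful of escapes handled are all hypothetical stand-ins, and only the shape (a fallible save propagated with ?, and one enum variant per C label) is taken from the notes above.

```rust
/// Hypothetical stand-in for the port's LuaError.
#[derive(Debug, PartialEq)]
pub struct LexError(pub &'static str);

/// One variant per C label in read_string's escape handling.
enum EscapeResult {
    ReadSave(u8), // read_save: consume the escape character, then save
    OnlySave(u8), // only_save: input already consumed, just save
    NoSave,       // no_save: nothing to store
}

/// Hypothetical byte-slice lexer; the real LexState reads from a ZIO.
pub struct MiniLexer<'a> {
    input: &'a [u8],
    pos: usize,
    buffer: Vec<u8>,
    max_len: usize, // illustrative cap, not the port's real limit
}

impl<'a> MiniLexer<'a> {
    pub fn new(input: &'a [u8], max_len: usize) -> Self {
        MiniLexer { input, pos: 0, buffer: Vec::new(), max_len }
    }

    pub fn buffer(&self) -> &[u8] {
        &self.buffer
    }

    fn current(&self) -> Option<u8> {
        self.input.get(self.pos).copied()
    }

    /// Fallible save: returns Err where the C code would call lexerror.
    fn save(&mut self, c: u8) -> Result<(), LexError> {
        if self.buffer.len() >= self.max_len {
            return Err(LexError("lexical element too long"));
        }
        self.buffer.push(c);
        Ok(())
    }

    /// Decode one escape; the leading backslash is assumed already consumed.
    pub fn read_escape(&mut self) -> Result<(), LexError> {
        let outcome = match self.current() {
            Some(b'n') => EscapeResult::ReadSave(b'\n'),
            Some(b't') => EscapeResult::ReadSave(b'\t'),
            Some(b'\\') => EscapeResult::ReadSave(b'\\'),
            Some(b'\n') => {
                self.pos += 1; // line break consumed here
                EscapeResult::OnlySave(b'\n')
            }
            Some(b'z') => {
                self.pos += 1;
                // Simplified whitespace skip for the sketch.
                while self.current().map_or(false, |c| c.is_ascii_whitespace()) {
                    self.pos += 1;
                }
                EscapeResult::NoSave
            }
            _ => return Err(LexError("invalid escape sequence")),
        };
        // The labels fall through in C; here each variant does its own work.
        match outcome {
            EscapeResult::ReadSave(c) => {
                self.pos += 1;
                self.save(c)?;
            }
            EscapeResult::OnlySave(c) => self.save(c)?,
            EscapeResult::NoSave => {}
        }
        Ok(())
    }
}
```

Because save returns a Result, an overflow surfaces as an ordinary Err at the call site instead of a non-local jump, and the enum makes the three exits of the escape switch explicit.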
Structs§
- LexBuffer
- Placeholder for LexBuffer from lua_vm::zio. TODO(port): replace with use lua_vm::zio::LexBuffer in Phase B. types.tsv: Mbuffer → LexBuffer
- LexState
- Per-chunk lexer (and shared parser) state.
- LuaState
- Per-thread Lua execution state.
- LuaString
- LuaTable
- A Lua table: hybrid array + hash map.
- Token
- A single lexed token with its semantic payload.
- ZIO
- Placeholder for ZIO from lua_vm::zio. TODO(port): replace with use lua_vm::zio::ZIO in Phase B. types.tsv: Zio → ZIO
Enums§
- LuaError
- The Lua error type. Carries a LuaValue payload because Lua errors can be any value (typically a string).
- TokenValue
- Semantic payload carried by a token.
Constants§
- EOZ
- End-of-stream sentinel returned by ZIO::getc.
- FIRST_RESERVED
- First token kind value that is not a single-byte character. Single-byte tokens are represented by their ASCII value (0-255).
- LUA_ENV
- Name of the global environment upvalue.
- NUM_RESERVED
- Number of reserved words (keywords).
- TK_AND
- and
- TK_BREAK
- break
- TK_CONCAT
- .. (concatenation)
- TK_DBCOLON
- ::
- TK_DO
- do
- TK_DOTS
- ... (vararg)
- TK_ELSE
- else
- TK_ELSEIF
- elseif
- TK_END
- end
- TK_EOS
- <eof>
- TK_EQ
- ==
- TK_FALSE
- false
- TK_FLT
- <number> (float literal)
- TK_FOR
- for
- TK_FUNCTION
- function
- TK_GE
- >=
- TK_GOTO
- goto
- TK_IDIV
- // (floor division)
- TK_IF
- if
- TK_IN
- in
- TK_INT
- <integer> (integer literal)
- TK_LE
- <=
- TK_LOCAL
- local
- TK_NAME
- <name> (identifier)
- TK_NE
- ~=
- TK_NIL
- nil
- TK_NOT
- not
- TK_OR
- or
- TK_REPEAT
- repeat
- TK_RETURN
- return
- TK_SHL
- <<
- TK_SHR
- >>
- TK_STRING
- <string> (string literal)
- TK_THEN
- then
- TK_TRUE
- true
- TK_UNTIL
- until
- TK_WHILE
- while (last keyword; NUM_RESERVED = TK_WHILE - FIRST_RESERVED + 1 = 22)
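The numbering scheme behind these constants can be sketched as follows. FIRST_RESERVED = 257 and the NUM_RESERVED formula come from this page; placing TK_AND at FIRST_RESERVED and TK_WHILE 21 slots later is an assumption based on the keyword count, and TokenClass and classify are hypothetical helpers that do not exist in the module.

```rust
// Value taken from the design notes on this page.
pub const FIRST_RESERVED: i32 = 257;
// Assumption: keywords occupy the first 22 codes from FIRST_RESERVED.
pub const TK_AND: i32 = FIRST_RESERVED;
pub const TK_WHILE: i32 = FIRST_RESERVED + 21;
// Formula quoted in the TK_WHILE description.
pub const NUM_RESERVED: i32 = TK_WHILE - FIRST_RESERVED + 1;

/// Hypothetical classification of a raw i32 token code.
#[derive(Debug, PartialEq)]
pub enum TokenClass {
    /// A single-byte token, carried as its byte value.
    SingleByte(u8),
    /// A reserved word, carried as its index among the keywords.
    Keyword(usize),
    /// Operators, literals, names, end of stream, or an invalid code.
    Other,
}

/// Sort a token code into one of the three ranges described above.
pub fn classify(token: i32) -> TokenClass {
    if (0..=255).contains(&token) {
        TokenClass::SingleByte(token as u8)
    } else if (TK_AND..=TK_WHILE).contains(&token) {
        TokenClass::Keyword((token - FIRST_RESERVED) as usize)
    } else {
        TokenClass::Other
    }
}
```

A TokenKind enum, as planned for Phase B, would make this range arithmetic unnecessary; the sketch only shows what the Phase A i32 encoding implies.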
Statics§
- LUAX_TOKENS
- Display strings for tokens, indexed by token - FIRST_RESERVED.
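As a rough illustration of that indexing, the sketch below builds a table in the order used by the C source (keywords first, then operators, then the bracketed literal names). The order and the table length of 37 are assumptions about the port, and token_display is a hypothetical helper, not part of this module.

```rust
pub const FIRST_RESERVED: i32 = 257;

// Assumed layout: same order as luaX_tokens in llex.c.
pub static LUAX_TOKENS: [&str; 37] = [
    "and", "break", "do", "else", "elseif", "end", "false", "for",
    "function", "goto", "if", "in", "local", "nil", "not", "or",
    "repeat", "return", "then", "true", "until", "while",
    "//", "..", "...", "==", ">=", "<=", "~=", "<<", ">>", "::",
    "<eof>", "<number>", "<integer>", "<name>", "<string>",
];

/// Hypothetical lookup: display string for a multi-byte token code.
/// Returns None for single-byte tokens and for codes past the table.
pub fn token_display(token: i32) -> Option<&'static str> {
    if token < FIRST_RESERVED {
        return None; // single-byte tokens are not in the table
    }
    LUAX_TOKENS.get((token - FIRST_RESERVED) as usize).copied()
}
```

Using get instead of direct indexing keeps an out-of-range code from panicking, which matters while Token.token is still a plain i32.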
Functions§
- init
- Initialise the lexer subsystem: intern all reserved words and fix them in the GC so they are never collected.
- lex_error
- Build a syntax error, optionally annotated with the offending token text.
- lookahead
- Peek at the next token without consuming the current one.
- next
- Consume the current token; load the next one from the stream.
- set_input
- Initialise ls for lexing a new chunk from stream z.
- syntax_error
- Report a syntax error at the current token.
- token2str
- Produce a human-readable token description (for error messages and the parser).
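The interplay of next and lookahead can be sketched with a toy lexer over an iterator of token codes. Everything here is hypothetical: MiniLex, the advance method (which plays the role of next), the use of Option for the pending token, and the TK_EOS value are illustrative choices, not the module's actual signatures.

```rust
/// Stand-in end-of-stream code; the value is illustrative only.
pub const TK_EOS: i32 = -1;

/// Toy lexer: `source` stands in for the real scanner.
pub struct MiniLex<I: Iterator<Item = i32>> {
    source: I,
    /// The current token, as seen by the parser.
    pub current: i32,
    /// At most one token scanned ahead of `current`.
    ahead: Option<i32>,
}

impl<I: Iterator<Item = i32>> MiniLex<I> {
    pub fn new(source: I) -> Self {
        MiniLex { source, current: TK_EOS, ahead: None }
    }

    /// Scan one token, yielding TK_EOS once the source is exhausted.
    fn scan(&mut self) -> i32 {
        self.source.next().unwrap_or(TK_EOS)
    }

    /// Consume the current token: reuse the pending lookahead if there
    /// is one, otherwise scan a fresh token.
    pub fn advance(&mut self) {
        self.current = match self.ahead.take() {
            Some(tok) => tok,
            None => self.scan(),
        };
    }

    /// Peek at the following token without touching `current`.
    pub fn lookahead(&mut self) -> i32 {
        debug_assert!(self.ahead.is_none(), "only one token of lookahead");
        let tok = self.scan();
        self.ahead = Some(tok);
        tok
    }
}
```

After a lookahead, the following advance hands over the peeked token rather than scanning again, so no token is read twice or skipped.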