LLM worker thread implementation
Handles LLM inference by proxying requests to the local llama-server process. This is the 1-hop architecture: the worker reads request state from shared memory and makes a single HTTP call to llama-server on localhost — one network hop per request.
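A minimal sketch of this worker loop, under stated assumptions: `queue.Queue` hand-off stands in for the shared-memory state, and requests are proxied to llama-server's `/completion` endpoint (the llama.cpp HTTP server's native completion route). The `llm_worker` function, the request/result dict shapes, and the stub server used for the demo are all illustrative inventions, not the module's actual API; the demo stub only imitates llama-server so the sketch runs without the real binary.

```python
import json
import queue
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def llm_worker(requests_q, results_q, url):
    """Drain requests from shared state and proxy each to llama-server (1 hop)."""
    while True:
        req = requests_q.get()
        if req is None:          # sentinel: shut the worker down
            break
        payload = json.dumps({"prompt": req["prompt"],
                              "n_predict": req.get("n_predict", 128)}).encode()
        http_req = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"})
        try:
            with urllib.request.urlopen(http_req, timeout=60) as resp:
                body = json.loads(resp.read())
            results_q.put({"id": req["id"], "text": body.get("content", "")})
        except OSError as exc:   # llama-server unreachable or timed out
            results_q.put({"id": req["id"], "error": str(exc)})

# --- demo: a local stub standing in for llama-server (not the real binary) ---
class _Stub(BaseHTTPRequestHandler):
    def do_POST(self):
        n = int(self.headers["Content-Length"])
        prompt = json.loads(self.rfile.read(n))["prompt"]
        out = json.dumps({"content": f"echo: {prompt}"}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(out)))
        self.end_headers()
        self.wfile.write(out)
    def log_message(self, *args):  # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), _Stub)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

req_q, res_q = queue.Queue(), queue.Queue()
worker = threading.Thread(
    target=llm_worker,
    args=(req_q, res_q, f"http://127.0.0.1:{server.server_port}/completion"),
    daemon=True)
worker.start()

req_q.put({"id": 1, "prompt": "hello"})
result = res_q.get(timeout=10)
req_q.put(None)                  # sentinel: stop the worker
worker.join()
server.shutdown()
print(result)
```

In the real module the worker would instead poll whatever shared-memory structure the process uses and talk to an already-running llama-server (default port 8080); the queue-plus-stub arrangement here exists only to keep the sketch self-contained.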