llm-samplers
Token samplers for large language models, written in Rust!
Status
Extremely early in development and poorly tested. See src/tests.rs for some usage examples.
There is also a fairly simple example of using Mirostat in my RWKV project: https://github.com/KerfuffleV2/smolrsrwkv/blob/60b8e8bfe64f157f1800445128af3b4adbbc64c1/smolrwkv-cli/src/main.rs#L139-L164
Samplers
The term "sampler" is used loosely here, and perhaps it should be renamed in the future. Right now, a "sampler" might manipulate the list of logits (for example, a top-k sampler might prune the list to the top K entries), it might actually pick a token, or it might do both!
- Flat bias - biases tokens by the specified amount
- Frequency / presence - applies frequency and presence penalties
- Greedy - picks the token ID with the highest probability
- Locally typical
- Mirostat V1
- Mirostat V2
- Random distribution - picks a token ID based on weighted probabilities
- Repetition - applies a repetition penalty
- Tail free
- Temperature
- Top-K
- Top-P
Real descriptions may (or may not) happen eventually. For now, you can check out the llama.cpp main example README for a brief overview of some of the types of sampler: https://github.com/ggerganov/llama.cpp/blob/master/examples/main/README.md#generation-flags
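To make the two roles of a "sampler" a bit more concrete, here is a minimal sketch in plain Rust. It does not use this crate's API: the `Logit` type alias and the `top_k`, `temperature`, and `greedy` functions are hypothetical names made up for illustration. The first two only manipulate the list of logits, while the last one actually picks a token ID.

```rust
/// A (token ID, logit) pair. This layout is an assumption made for the sketch,
/// not the type this crate uses.
type Logit = (u32, f32);

/// Keep only the `k` highest-scoring entries (a logit-manipulating "sampler").
fn top_k(mut logits: Vec<Logit>, k: usize) -> Vec<Logit> {
    // Sort descending by logit value, then drop everything past the first k.
    logits.sort_by(|a, b| b.1.total_cmp(&a.1));
    logits.truncate(k);
    logits
}

/// Divide every logit by `temp` (also only manipulates the list).
fn temperature(mut logits: Vec<Logit>, temp: f32) -> Vec<Logit> {
    for entry in logits.iter_mut() {
        entry.1 /= temp;
    }
    logits
}

/// Pick the token ID with the highest logit (a token-picking "sampler").
fn greedy(logits: &[Logit]) -> Option<u32> {
    logits
        .iter()
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|entry| entry.0)
}

fn main() {
    let logits = vec![(0, 0.1), (1, 0.4), (2, 0.2), (3, 0.3)];
    // Prune to the two best candidates, rescale, then pick.
    let pruned = top_k(logits, 2);
    let token = greedy(&temperature(pruned, 0.8));
    println!("{token:?}"); // prints "Some(1)"
}
```

The real samplers in this crate carry state and configuration, so treat this only as a picture of the idea.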
Example
You probably won't usually want to use individual Samplers. The most typical
use case is going to be chaining a number of samplers together.
A simple example of constructing a [SamplerChain]:
use Result;
use *;
The previous example is simple but not very realistic: the greedy sampler doesn't even care about temperature. Now let's look at something a bit more complicated:
use Result;
use ;
use *;
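For readers who want a feel for what chaining means, here is a rough, self-contained sketch. It is not this crate's API: the `Step` trait, the `Chain` struct, and the `TopK` and `Temperature` types below are hypothetical stand-ins invented for the example. It also illustrates why a greedy pick doesn't care about temperature: dividing every logit by the same positive number never changes which one is largest.

```rust
/// (token ID, logit) pairs; an illustrative layout, not the crate's own type.
type Logits = Vec<(u32, f32)>;

/// Hypothetical trait: one step in a chain transforms the logits in place.
trait Step {
    fn apply(&self, logits: &mut Logits);
}

/// Divides every logit by the given temperature.
struct Temperature(f32);
impl Step for Temperature {
    fn apply(&self, logits: &mut Logits) {
        for entry in logits.iter_mut() {
            entry.1 /= self.0;
        }
    }
}

/// Keeps only the K highest-scoring entries.
struct TopK(usize);
impl Step for TopK {
    fn apply(&self, logits: &mut Logits) {
        logits.sort_by(|a, b| b.1.total_cmp(&a.1));
        logits.truncate(self.0);
    }
}

/// Hypothetical chain: runs every step in order, then picks greedily.
struct Chain {
    steps: Vec<Box<dyn Step>>,
}

impl Chain {
    fn sample_greedy(&self, mut logits: Logits) -> Option<u32> {
        for step in &self.steps {
            step.apply(&mut logits);
        }
        logits
            .iter()
            .max_by(|a, b| a.1.total_cmp(&b.1))
            .map(|entry| entry.0)
    }
}

fn main() {
    let logits: Logits = vec![(0, 0.1), (1, 0.2), (2, 0.9), (3, 0.4)];

    let hot_steps: Vec<Box<dyn Step>> = vec![Box::new(TopK(2)), Box::new(Temperature(2.0))];
    let cold_steps: Vec<Box<dyn Step>> = vec![Box::new(TopK(2)), Box::new(Temperature(0.5))];
    let hot = Chain { steps: hot_steps };
    let cold = Chain { steps: cold_steps };

    // Scaling by a positive temperature preserves ordering, so both chains agree.
    assert_eq!(hot.sample_greedy(logits.clone()), cold.sample_greedy(logits));
}
```

Temperature only starts to matter once the final step picks randomly from the weighted distribution instead of always taking the maximum.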
Links
Note: The crate/docs version likely won't match this repo.
Credits
Initial version closely referenced from the samplers in the llama.cpp project (although not a line-by-line port). Thanks!