Crate kalosm_llama
§RLlama
RLlama is a Rust implementation of the quantized Llama 7B language model.
Llama 7B is a small but performant language model that can easily be run on your local machine.
This library uses the Candle machine-learning framework to run Llama.
§Usage
use kalosm_llama::prelude::*;

#[tokio::main]
async fn main() {
    // Load the default quantized Llama model.
    let mut model = Llama::default();
    let prompt = "The capital of France is ";
    // Start streaming generated tokens for the prompt.
    // `main` does not return a Result, so errors are unwrapped here.
    let mut result = model.stream_text(prompt).await.unwrap();

    print!("{prompt}");
    while let Some(token) = result.next().await {
        print!("{token}");
    }
}
Modules§
- prelude: A prelude of commonly used items in kalosm-llama.
Structs§
- Llama: A quantized Llama language model with support for streaming generation.
- LlamaBuilder: A builder with configuration for a Llama model (a usage sketch follows this list).
- LlamaCache: A cache for Llama inference. This cache speeds up generation of sequential text significantly.
- LlamaModel: The inner, synchronous Llama model.
- LlamaSession: A Llama session.
- LlamaSource: A source for the Llama model.
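As a rough sketch of how LlamaBuilder and LlamaSource fit together (the with_source and build method names and the LlamaSource::llama_7b preset are assumptions for illustration; check the item docs for the exact API):

use kalosm_llama::prelude::*;

#[tokio::main]
async fn main() {
    // Configure a model through the builder instead of Llama::default().
    // `with_source`, `build`, and the `llama_7b` preset are assumed names;
    // consult the LlamaBuilder and LlamaSource docs for the real signatures.
    let mut model = Llama::builder()
        .with_source(LlamaSource::llama_7b())
        .build()
        .await
        .unwrap();

    // Generation then works the same as with a default model.
    let mut stream = model.stream_text("Rust is").await.unwrap();
    while let Some(token) = stream.next().await {
        print!("{token}");
    }
}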
Enums§
- FileSource: A source for a file, either from Hugging Face or a local path (a construction sketch follows).
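A minimal sketch of constructing each FileSource variant (the variant shapes shown here, a HuggingFace variant with model_id/revision/file fields and a Local variant wrapping a PathBuf, are assumptions for illustration; see the enum docs for the actual definition):

use std::path::PathBuf;
use kalosm_llama::FileSource;

fn main() {
    // A weight file fetched from a Hugging Face repository. The field names
    // below are assumed; check the FileSource docs for the real layout.
    let remote = FileSource::HuggingFace {
        model_id: "TheBloke/Llama-2-7B-GGUF".to_string(),
        revision: "main".to_string(),
        file: "llama-2-7b.Q4_K_M.gguf".to_string(),
    };

    // The same file already downloaded to a local path.
    let local = FileSource::Local(PathBuf::from("./llama-2-7b.Q4_K_M.gguf"));
    let _ = (remote, local);
}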