Crate llm

This crate provides a unified interface for loading and using Large Language Models (LLMs).

At present, the only supported backend is GGML, but this is expected to change in the future.

Example

use std::io::Write;
use llm::Model;

// load a GGML model from disk
let llama = llm::load::<llm::models::Llama>(
    // path to GGML file
    std::path::Path::new("/path/to/model"),
    // llm::ModelParameters
    Default::default(),
    // load progress callback
    llm::load_progress_callback_stdout
)
.unwrap_or_else(|err| panic!("Failed to load model: {err}"));

// use the model to generate text from a prompt
let mut session = llama.start_session(Default::default());
let res = session.infer::<std::convert::Infallible>(
    // model to use for text generation
    &llama,
    // randomness provider
    &mut rand::thread_rng(),
    // the prompt to use for text generation, as well as other
    // inference parameters
    &llm::InferenceRequest {
        prompt: "Rust is a cool programming language because",
        ..Default::default()
    },
    // llm::OutputRequest
    &mut Default::default(),
    // output callback
    |t| {
        print!("{t}");
        std::io::stdout().flush().unwrap();

        Ok(())
    }
);

match res {
    Ok(result) => println!("\n\nInference stats:\n{result}"),
    Err(err) => println!("\n{err}"),
}
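The output callback's `Result` return type is what lets a caller stop generation early: returning `Err` from the callback halts inference, while parameterizing on `std::convert::Infallible` (as above) declares that the callback can never fail. The driver function below is a hypothetical stand-in for `session.infer`, not part of the crate; it is a minimal, self-contained sketch of that early-stop pattern:

```rust
// Hypothetical stand-in for `session.infer`: feeds tokens to a callback
// and stops as soon as the callback returns an error.
fn generate_tokens<E>(
    tokens: &[&str],
    mut callback: impl FnMut(&str) -> Result<(), E>,
) -> Result<usize, E> {
    let mut emitted = 0;
    for t in tokens {
        callback(t)?; // propagate the callback's error to stop early
        emitted += 1;
    }
    Ok(emitted)
}

fn main() {
    let tokens = ["Rust", " is", " a", " cool", " language"];

    // Stop after three tokens by returning an error from the callback.
    let mut count = 0;
    let res = generate_tokens(&tokens, |t| {
        print!("{t}");
        count += 1;
        if count == 3 { Err("stop requested") } else { Ok(()) }
    });
    println!();
    assert!(res.is_err());
    assert_eq!(count, 3);
}
```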


Traits

  • KnownModel — Interfaces for creating and interacting with a large language model with a known type of hyperparameters.
  • Model — A type-erased model to allow for interacting with a model without knowing its hyperparameters.
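The split between the two traits mirrors a common Rust pattern: a statically typed trait for code that knows the concrete model, and an object-safe, type-erased trait for code that selects the architecture at runtime. The traits below are simplified stand-ins illustrating that pattern, not the crate's actual definitions:

```rust
// Hypothetical stand-ins illustrating the known-type / type-erased split.
trait KnownModel {
    type Hyperparameters;
    fn hyperparameters(&self) -> &Self::Hyperparameters;
    fn architecture(&self) -> &'static str;
}

// Object-safe trait: no associated types, so it can live behind `dyn`.
trait Model {
    fn architecture(&self) -> &'static str;
}

// Every known model is automatically usable as a type-erased model.
impl<M: KnownModel> Model for M {
    fn architecture(&self) -> &'static str {
        KnownModel::architecture(self)
    }
}

struct LlamaHyperparameters { n_layer: usize }
struct Llama { hp: LlamaHyperparameters }

impl KnownModel for Llama {
    type Hyperparameters = LlamaHyperparameters;
    fn hyperparameters(&self) -> &LlamaHyperparameters { &self.hp }
    fn architecture(&self) -> &'static str { "llama" }
}

fn main() {
    let llama = Llama { hp: LlamaHyperparameters { n_layer: 32 } };
    // Statically typed access to the concrete hyperparameters...
    assert_eq!(llama.hyperparameters().n_layer, 32);
    // ...and type-erased use when the architecture is chosen at runtime.
    let erased: Box<dyn Model> = Box::new(llama);
    assert_eq!(erased.architecture(), "llama");
}
```

Associated types keep `KnownModel` out of trait-object territory, which is why a separate object-safe trait is needed for runtime dispatch.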

Functions

  • load — Loads a GGML model from the path and configures it per the params. The status of the loading process is reported through load_progress_callback.
  • load_dynamic — A helper function that loads the specified model from disk using an architecture specified at runtime.
  • load_progress_callback_stdout — An implementation of load_progress_callback that outputs to stdout.
  • quantize — Quantizes a model.
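Instead of load_progress_callback_stdout, a caller can supply its own progress callback. The enum below is a hypothetical stand-in for the crate's load-progress type (the real crate reports richer events, e.g. hyperparameter and per-tensor loading); it sketches what a custom callback might look like:

```rust
// Hypothetical stand-in for the crate's load-progress event type.
enum LoadProgress {
    TensorLoaded { current: usize, total: usize },
    Loaded,
}

// Render a progress event as a percentage instead of a raw event dump.
fn format_progress(progress: &LoadProgress) -> String {
    match progress {
        LoadProgress::TensorLoaded { current, total } => {
            format!("loading: {:.0}%", 100.0 * *current as f64 / *total as f64)
        }
        LoadProgress::Loaded => "model loaded".to_string(),
    }
}

fn main() {
    // Simulate the loader driving a custom callback.
    let total = 4;
    for current in 1..=total {
        println!("{}", format_progress(&LoadProgress::TensorLoaded { current, total }));
    }
    println!("{}", format_progress(&LoadProgress::Loaded));
}
```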

Type Definitions

  • TokenId — The identifier of a token in a vocabulary.
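Conceptually, a token ID is an index into the model's vocabulary. The alias and lookup below are illustrative only, not the crate's definitions:

```rust
// Illustrative alias; the crate defines its own TokenId type.
type TokenId = u32;

struct Vocabulary {
    tokens: Vec<String>,
}

impl Vocabulary {
    // Map a token ID back to its text, if the ID is in range.
    fn token(&self, id: TokenId) -> Option<&str> {
        self.tokens.get(id as usize).map(String::as_str)
    }
}

fn main() {
    let vocab = Vocabulary {
        tokens: vec!["<unk>".into(), "Rust".into(), " rocks".into()],
    };
    assert_eq!(vocab.token(1), Some("Rust"));
    assert_eq!(vocab.token(99), None); // out-of-range IDs have no token
}
```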