
Crate llama_cpp_bindings


Bindings to the llama.cpp library.

As llama.cpp is a very fast-moving target, this crate does not attempt to create a stable, fully idiomatic Rust API. Instead it provides safe wrappers around nearly direct bindings to llama.cpp. This makes it easier to keep up with changes in llama.cpp, but it does mean the API is not as ergonomic as it could be.
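As a hedged sketch of what these "nearly direct" wrappers look like in practice — the exact item paths and method names below are assumptions inferred from the module list on this page, not verified signatures; consult the `llama_backend`, `model`, and `context` module docs for the current API:

```rust
// Hypothetical usage sketch. Item names (LlamaBackend::init,
// LlamaModel::load_from_file, LlamaModelParams) are assumptions;
// the real API tracks llama.cpp closely and may differ between versions.
use llama_cpp_bindings::llama_backend::LlamaBackend;
use llama_cpp_bindings::model::{params::LlamaModelParams, LlamaModel};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The backend must be initialized once before any model is loaded,
    // mirroring llama.cpp's llama_backend_init.
    let backend = LlamaBackend::init()?;

    // Model parameters mirror llama.cpp's llama_model_params struct.
    let params = LlamaModelParams::default();
    let model = LlamaModel::load_from_file(&backend, "model.gguf", &params)?;

    println!("loaded model: {:?}", model);
    Ok(())
}
```

Because the wrappers shadow the C API rather than abstracting it, code like this tends to need small adjustments when the pinned llama.cpp revision changes.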

§Feature Flags

  • `cuda` enables CUDA GPU support.
  • `sampler` adds the [context::sample::sampler] struct for a more idiomatic Rust way of sampling.
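Assuming the crate is pulled in as a normal Cargo dependency (the version below is a placeholder), these flags would be enabled in `Cargo.toml` like so:

```toml
[dependencies]
# Version is illustrative — pin to the release you actually use.
llama_cpp_bindings = { version = "*", features = ["cuda", "sampler"] }
```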

Re-exports§

pub use error::ApplyChatTemplateError;
pub use error::ChatParseError;
pub use error::ChatTemplateError;
pub use error::DecodeError;
pub use error::EmbeddingsError;
pub use error::EncodeError;
pub use error::GrammarError;
pub use error::LlamaContextLoadError;
pub use error::LlamaCppError;
pub use error::LlamaLoraAdapterInitError;
pub use error::LlamaLoraAdapterRemoveError;
pub use error::LlamaLoraAdapterSetError;
pub use error::LlamaModelLoadError;
pub use error::MetaValError;
pub use error::ModelParamsError;
pub use error::NewLlamaChatMessageError;
pub use error::Result;
pub use error::SamplerAcceptError;
pub use error::SamplingError;
pub use error::StringToTokenError;
pub use error::TokenSamplingError;
pub use error::TokenToStringError;
pub use llama_backend_device::LlamaBackendDevice;
pub use llama_backend_device::LlamaBackendDeviceType;
pub use llama_backend_device::list_llama_ggml_backend_devices;
pub use llama_utility_ggml_time_us::ggml_time_us;
pub use llama_utility_json_schema_to_grammar::json_schema_to_grammar;
pub use llama_utility_llama_time_us::llama_time_us;
pub use llama_utility_max_devices::max_devices;
pub use llama_utility_mlock_supported::mlock_supported;
pub use llama_utility_mmap_supported::mmap_supported;
pub use llama_utility_status_is_ok::status_is_ok;
pub use llama_utility_status_to_i32::status_to_i32;
pub use log::send_logs_to_tracing;
pub use log_options::LogOptions;

Modules§

context
Safe wrapper around llama_context.
error
llama_backend
Representation of an initialized llama backend.
llama_backend_device
llama_backend_numa_strategy
llama_batch
Safe wrapper around llama_batch.
llama_utility_ggml_time_us
llama_utility_json_schema_to_grammar
llama_utility_llama_time_us
llama_utility_max_devices
llama_utility_mlock_supported
llama_utility_mmap_supported
llama_utility_status_is_ok
llama_utility_status_to_i32
log
log_options
model
A safe wrapper around llama_model.
openai
OpenAI Specific Utility methods.
sampling
Safe wrapper around llama_sampler.
timing
Safe wrapper around llama_timings.
token
Safe wrappers around llama_token_data and llama_token_data_array.
token_type
Utilities for working with llama_token_type values.