Bindings to the llama.cpp library.
As llama.cpp is a very fast-moving target, this crate does not attempt to create a stable API with all the usual Rust idioms. Instead it provides safe wrappers around nearly direct bindings to llama.cpp. This makes it easier to keep up with changes in llama.cpp, but does mean that the API is not as ergonomic as it could be.
§Feature Flags
- `cuda` — enables CUDA GPU support.
- `sampler` — adds the [context::sample::sampler] struct for a more Rust-idiomatic way of sampling.
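To make the "safe wrappers around nearly direct bindings" design concrete, here is a minimal load-and-tokenize sketch. This is an illustrative outline only: the exact type and method names (`LlamaBackend::init`, `LlamaModel::load_from_file`, `str_to_token`, `AddBos`) are assumptions based on one version of this crate, and the model path is a placeholder; since the crate tracks upstream llama.cpp closely, names may differ between releases.

```rust
use llama_cpp_2::llama_backend::LlamaBackend;
use llama_cpp_2::model::params::LlamaModelParams;
use llama_cpp_2::model::{AddBos, LlamaModel};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Initialize the llama.cpp backend once per process
    // (wraps llama_backend_init).
    let backend = LlamaBackend::init()?;

    // Load a GGUF model from disk; "model.gguf" is a placeholder path.
    let params = LlamaModelParams::default();
    let model = LlamaModel::load_from_file(&backend, "model.gguf", &params)?;

    // Tokenize a prompt, mirroring llama.cpp's llama_tokenize.
    let tokens = model.str_to_token("Hello, world!", AddBos::Always)?;
    println!("prompt is {} tokens", tokens.len());
    Ok(())
}
```

Each step maps almost one-to-one onto a llama.cpp C function, which is the trade-off the paragraph above describes: a thinner, less idiomatic API that is easier to keep in sync with upstream.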
Re-exports§
pub use error::ApplyChatTemplateError;
pub use error::ChatParseError;
pub use error::ChatTemplateError;
pub use error::DecodeError;
pub use error::EmbeddingsError;
pub use error::EncodeError;
pub use error::GrammarError;
pub use error::LlamaContextLoadError;
pub use error::LlamaCppError;
pub use error::LlamaLoraAdapterInitError;
pub use error::LlamaLoraAdapterRemoveError;
pub use error::LlamaLoraAdapterSetError;
pub use error::LlamaModelLoadError;
pub use error::MetaValError;
pub use error::ModelParamsError;
pub use error::NewLlamaChatMessageError;
pub use error::Result;
pub use error::SamplerAcceptError;
pub use error::SamplingError;
pub use error::StringToTokenError;
pub use error::TokenSamplingError;
pub use error::TokenToStringError;
pub use llama_backend_device::LlamaBackendDevice;
pub use llama_backend_device::LlamaBackendDeviceType;
pub use llama_backend_device::list_llama_ggml_backend_devices;
pub use llama_utility_ggml_time_us::ggml_time_us;
pub use llama_utility_json_schema_to_grammar::json_schema_to_grammar;
pub use llama_utility_llama_time_us::llama_time_us;
pub use llama_utility_max_devices::max_devices;
pub use llama_utility_mlock_supported::mlock_supported;
pub use llama_utility_mmap_supported::mmap_supported;
pub use llama_utility_status_is_ok::status_is_ok;
pub use llama_utility_status_to_i32::status_to_i32;
pub use log::send_logs_to_tracing;
pub use log_options::LogOptions;
Modules§
- context — Safe wrapper around `llama_context`.
- error
- llama_backend — Representation of an initialized llama backend
- llama_backend_device
- llama_backend_numa_strategy
- llama_batch — Safe wrapper around `llama_batch`.
- llama_utility_ggml_time_us
- llama_utility_json_schema_to_grammar
- llama_utility_llama_time_us
- llama_utility_max_devices
- llama_utility_mlock_supported
- llama_utility_mmap_supported
- llama_utility_status_is_ok
- llama_utility_status_to_i32
- log
- log_options
- model — A safe wrapper around `llama_model`.
- openai — OpenAI-specific utility methods.
- sampling — Safe wrapper around `llama_sampler`.
- timing — Safe wrapper around `llama_timings`.
- token — Safe wrappers around `llama_token_data` and `llama_token_data_array`.
- token_type — Utilities for working with `llama_token_type` values.