Cuttle - A CPU-based Large Language Model Inference Engine
This crate provides a high-performance inference engine for large language models, optimized for CPU execution.
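As a sketch of how the re-exported types below might fit together — note that the constructor and method names used here (`Model::load`, `Tokenizer::load`, `InferenceEngine::new`, `generate`) and the file paths are assumptions for illustration, not confirmed API:

```rust
// Hypothetical usage sketch. Method names (`Model::load`, `Tokenizer::load`,
// `InferenceEngine::new`, `generate`) and file paths are assumed, not
// confirmed by this crate's documentation.
use cuttle::{InferenceConfig, InferenceEngine, Model, Result, Tokenizer};

fn main() -> Result<()> {
    // Load model weights and tokenizer from local files (paths are illustrative).
    let model = Model::load("model.safetensors")?;
    let tokenizer = Tokenizer::load("tokenizer.json")?;

    // Use the default generation configuration.
    let config = InferenceConfig::default();

    // Run CPU-based text generation for up to 64 new tokens.
    let engine = InferenceEngine::new(model, tokenizer, config);
    let output = engine.generate("Once upon a time", 64)?;
    println!("{output}");
    Ok(())
}
```

Errors from loading or generation surface through the crate's `CuttleError` via the re-exported `Result` alias.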
Re-exports§
- `pub use error::CuttleError;`
- `pub use error::Result;`
- `pub use inference::InferenceConfig;`
- `pub use inference::InferenceEngine;`
- `pub use model::Model;`
- `pub use model::ModelConfig;`
- `pub use tensor::Tensor;`
- `pub use tokenizer::Tokenizer;`
Modules§
- downloader
- Model download module
- error
- Error handling module
- inference
- Inference engine module
- model
- Model definition module
- tensor
- Tensor operations module
- tokenizer
- Tokenizer module
- utils
- Utility module
Structs§
- DefaultConfig
- Default model configuration
Constants§
- VERSION
- Version of the inference engine