Cuttle - A CPU-based Large Language Model Inference Engine
This crate provides a high-performance inference engine for large language models, optimized for CPU execution.
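As a sketch of how the re-exported types below might fit together — note that the constructor and method names used here (`Model::load`, `Tokenizer::load`, `InferenceEngine::new`, `generate`) and the file paths are assumptions for illustration, not confirmed API:

```rust
// Hypothetical usage sketch. Method names (`Model::load`, `Tokenizer::load`,
// `InferenceEngine::new`, `generate`) and file paths are assumed, not
// confirmed by this crate's documentation.
use cuttle::{InferenceConfig, InferenceEngine, Model, Result, Tokenizer};

fn main() -> Result<()> {
    // Load model weights and tokenizer from local files (paths are illustrative).
    let model = Model::load("model.safetensors")?;
    let tokenizer = Tokenizer::load("tokenizer.json")?;

    // Use the default generation configuration.
    let config = InferenceConfig::default();

    // Run CPU-based text generation for up to 64 new tokens.
    let engine = InferenceEngine::new(model, tokenizer, config);
    let output = engine.generate("Once upon a time", 64)?;
    println!("{output}");
    Ok(())
}
```

Errors from loading or generation surface through the crate's `CuttleError` via the re-exported `Result` alias.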
Re-exports§
- `pub use error::CuttleError;`
- `pub use error::Result;`
- `pub use inference::InferenceConfig;`
- `pub use inference::InferenceEngine;`
- `pub use model::Model;`
- `pub use model::ModelConfig;`
- `pub use tensor::Tensor;`
- `pub use tokenizer::Tokenizer;`
Modules§
- downloader
- Model download module
- error
- Error handling module
- inference
- Inference engine module
- model
- Model definition module
- tensor
- Tensor operations module
- tokenizer
- Tokenizer module
- utils
- Utility module
Structs§
- DefaultConfig
- Default model configuration
Constants§
- VERSION
- Version of the inference engine