Module models

Candle implementations for various deep learning models

This crate provides implementations of popular machine learning models and architectures for different modalities.

Some of the models also have quantized variants, e.g. quantized_blip, quantized_llama and quantized_qwen2.
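The quantized variants follow a `quantized_` prefix convention in the module list. As a minimal sketch of that convention, using only the standard library and a small subset of the module names from this page, the following pairs each base module with its quantized counterpart when one exists. The helper `quantized_counterpart` is hypothetical and is not part of the crate.

```rust
/// Hypothetical helper (not part of candle): returns the name of the
/// quantized counterpart of `module` if it appears in `modules`.
fn quantized_counterpart<'a>(module: &str, modules: &[&'a str]) -> Option<&'a str> {
    let wanted = format!("quantized_{module}");
    modules.iter().copied().find(|m| *m == wanted)
}

fn main() {
    // A small subset of the module names listed on this page.
    let modules = [
        "bert",
        "blip",
        "llama",
        "qwen2",
        "quantized_blip",
        "quantized_llama",
        "quantized_qwen2",
    ];

    // Walk the non-quantized modules and report whether a quantized one exists.
    for name in modules.iter().filter(|m| !m.starts_with("quantized_")) {
        match quantized_counterpart(name, &modules) {
            Some(q) => println!("{name}: quantized variant in `{q}`"),
            None => println!("{name}: no quantized variant in this subset"),
        }
    }
}
```

The lookup is purely name-based, so it only reflects the naming pattern; it says nothing about whether the two modules expose the same types or functions.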

The implementations aim to be readable while maintaining good performance. For more information on each model see the model’s module docs in the links below.

Modules

based
Based, a model from the Stanford Hazy Research group.
beit
Based on the BEiT vision model.
bert
BERT (Bidirectional Encoder Representations from Transformers)
bigcode
BigCode implementation in Rust based on the GPT-BigCode model.
blip
Based on the BLIP paper from Salesforce Research.
blip_text
Implementation of BLIP text encoder/decoder.
chatglm
Implementation of the ChatGLM2/3 models from THUDM.
chinese_clip
Chinese Contrastive Language-Image Pre-Training
clip
Contrastive Language-Image Pre-Training
codegeex4_9b
CodeGeeX4 - A multi-language code generation model
colpali
Colpali Model for text/image similarity scoring.
convmixer
ConvMixer implementation.
convnext
ConvNeXt implementation.
csm
Implementation of the Conversational Speech Model (CSM) from Sesame
dac
Implementation of the Descript Audio Codec (DAC) model
debertav2
deepseek2
depth_anything_v2
Implementation of the Depth Anything V2 model.
dinov2
Implementation of the DINOv2 models from Meta Research.
dinov2reg4
Implementation of DINOv2 with registers (4 register tokens)
distilbert
Implementation of DistilBert, a distilled version of BERT.
efficientnet
Implementation of EfficientNet, a family of convolutional networks for computer vision tasks.
efficientvit
EfficientViT (MSRA) inference implementation based on timm.
encodec
EnCodec neural audio codec based on the Encodec implementation.
eva2
EVA-2 inference implementation.
falcon
Falcon language model inference implementation
fastvit
FastViT inference implementation based on timm
flux
Flux Model
gemma
Gemma inference implementation.
gemma2
Gemma 2 LLM architecture (Google) inference implementation.
gemma3
Gemma 3 LLM architecture (Google) inference implementation.
glm4
GLM-4 inference implementation.
glm4_new
granite
Granite is a Long Context Transformer Language Model.
granitemoehybrid
GraniteMoeHybrid is a Long Context Transformer Language Model.
helium
Helium inference implementation.
hiera
Hiera inference implementation based on timm.
jina_bert
JinaBERT inference implementation
llama
Llama inference implementation.
llama2_c
Llama2 inference implementation.
llama2_c_weights
Llama2 inference implementation.
llava
The LLaVA (Large Language and Vision Assistant) model.
mamba
Mamba inference implementation.
mamba2
Mamba2 inference implementation.
marian
Marian Neural Machine Translation
metavoice
MetaVoice Studio ML Models
mimi
Mimi model
mistral
Mistral model implementation
mixformer
MixFormer (Microsoft’s Phi Architecture)
mixtral
Mixtral Model, a sparse mixture of expert model based on the Mistral architecture
mmdit
Multimodal Diffusion Transformer (MMDiT), the architecture used in Stable Diffusion 3
mobileclip
Mobile CLIP model, combining a lightweight vision encoder with a text encoder
mobilenetv4
MobileNet-v4
mobileone
MobileOne
modernbert
ModernBERT
moondream
MoonDream Model vision-to-text
mpt
Module implementing the MPT (MosaicML Pretrained Transformer) model
nomic_bert
NomicBERT
nvembed_v2
NV-Embed-v2
olmo
OLMo (Open Language Model) implementation
olmo2
OLMo 2 (Open Language Model) implementation
openclip
Open Contrastive Language-Image Pre-Training
paddleocr_vl
PaddleOCR-VL Vision-Language Model for OCR.
paligemma
Multimodal multi-purpose model combining Gemma-based language model with SigLIP image understanding
parler_tts
Parler Model implementation for parler_tts text-to-speech synthesis
persimmon
Persimmon Model
phi
Microsoft Phi model implementation
phi3
Microsoft Phi-3 model implementation
pixtral
Pixtral vision-language model
quantized_blip
BLIP model implementation with quantization support.
quantized_blip_text
Quantized BLIP text module implementation.
quantized_gemma3
Gemma 3 model implementation with quantization support.
quantized_glm4
GLM4 implementation with quantization support.
quantized_lfm2
quantized_llama
Quantized llama model implementation.
quantized_llama2_c
Quantized Llama2 model implementation.
quantized_metavoice
Quantized MetaVoice model implementation.
quantized_mistral
Mistral model implementation with quantization support.
quantized_mixformer
Module containing quantized MixFormer model implementation.
quantized_moondream
Implementation of a quantized Moondream vision language model.
quantized_mpt
Quantized MPT model implementation.
quantized_phi
Phi2 model implementation with quantization support.
quantized_phi3
Phi3 model implementation with quantization support.
quantized_qwen2
Qwen2 model implementation with quantization support.
quantized_qwen3
Qwen3 implementation with quantization support.
quantized_qwen3_moe
quantized_recurrent_gemma
Recurrent Gemma model implementation with quantization support.
quantized_rwkv_v5
RWKV v5 model implementation with quantization support.
quantized_rwkv_v6
RWKV v6 model implementation with quantization support.
quantized_stable_lm
Module for quantized StableLM implementation.
quantized_t5
T5 model implementation with quantization support.
qwen2
Qwen2 model implementation.
qwen3
qwen2_moe
Qwen2 model implementation with Mixture of Experts support.
qwen3_moe
qwen3_vl
recurrent_gemma
Recurrent Gemma model implementation
repvgg
RepVGG inference implementation
resnet
ResNet Implementation
rwkv_v5
RWKV v5 model implementation.
rwkv_v6
RWKV v6 model implementation.
rwkv_v7
RWKV v7 “Goose” (x070) model implementation.
segformer
Segformer model implementation for semantic segmentation and image classification.
segment_anything
Segment Anything Model (SAM)
siglip
Siglip model implementation.
smol
SmolLM model family implementations.
snac
Implementation of the Multi-Scale Neural Audio Codec (SNAC)
stable_diffusion
Stable Diffusion
stable_lm
StableLM model implementation.
starcoder2
StarCoder2 model implementation.
stella_en_v5
Stella v5 model implementation.
t5
T5 model implementation.
trocr
TrOCR model implementation.
vgg
VGG-16 model implementation.
vit
Vision Transformer (ViT) implementation.
voxtral
whisper
Whisper Model Implementation
with_tracing
wuerstchen
Würstchen Efficient Diffusion Model
xlm_roberta
yi
Yi model implementation.
z_image
Z-Image Model