Skip to main content

DEFAULT_BATCH_SIZE

Constant DEFAULT_BATCH_SIZE 

Source
pub const DEFAULT_BATCH_SIZE: usize = 8;
Expand description

Default training batch size.

§⚠ Memory warning

The ViT-B sensor encoder produces N = 2 448 patch tokens per sample. Attention score tensors scale as B × H × chunk × N, so even with attn_chunk_size = 64 a batch of 8 samples at fp32 consumes:

8 × 12 × 64 × 2448 × 4 bytes ≈ 60 MB  per chunk (forward only)

The Burn autodiff tape holds ALL chunk intermediates simultaneously during the backward pass — multiply by ceil(N / chunk) chunks and by depth layers. Keep batch_size ≤ 8 for ViT-B on a 16 GB GPU. Use --cpu with a smaller model config for quick experiments.