gllm
There is currently very little structured metadata from which to build this page.
Check the main library docs, the readme, or Cargo.toml
in case the author documented the features there.
This version has 9 feature flags, 1 of them enabled by default.
default
cpu (default)
cuda
flash-attention
This feature flag does not enable additional features.
gpu-quantized
paged-attention
This feature flag does not enable additional features.
quantized
This feature flag does not enable additional features.
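Feature flags like the ones listed above are normally selected in the dependent crate's Cargo.toml (the `features` and `default-features` keys) and checked at compile time with `cfg`. The sketch below shows only the general mechanism, not gllm's actual source: `feature_status` is a hypothetical helper, and `cfg!(feature = "...")` reports the features of whichever crate this code is compiled in, so it is meaningful only in a crate that declares flags with these names.

```rust
/// Hypothetical helper (not part of gllm): pairs each feature name listed on
/// this page with whether that feature is enabled for the crate being compiled.
/// `default` is omitted because it is only an alias for the default set (`cpu`).
#[allow(unexpected_cfgs)] // the flags may not be declared in the current crate
fn feature_status() -> [(&'static str, bool); 6] {
    [
        ("cpu", cfg!(feature = "cpu")),
        ("cuda", cfg!(feature = "cuda")),
        ("flash-attention", cfg!(feature = "flash-attention")),
        ("gpu-quantized", cfg!(feature = "gpu-quantized")),
        ("paged-attention", cfg!(feature = "paged-attention")),
        ("quantized", cfg!(feature = "quantized")),
    ]
}

fn main() {
    // Print one line per flag so the active build configuration is visible.
    for (name, enabled) in feature_status() {
        println!("{name}: {}", if enabled { "enabled" } else { "disabled" });
    }
}
```

`cfg!` evaluates to a plain boolean, so every branch is still type-checked. Code that must be left out of the build entirely when a flag is off (for example, calls into a GPU backend) would use the `#[cfg(feature = "...")]` attribute instead.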