rage-quant 0.1.0

High-performance quantized GEMV kernels for CPU-only LLM inference. Direct dot product on Q8_0/Q6_K/Q4_K GGUF blocks with AVX2+FMA SIMD — 3.0x decode speedup.

Documentation

rage-quant

There is very little structured metadata to build this page from currently. You should check the main library docs, readme, or Cargo.toml in case the author documented the features in them.

This release does not have any feature flags.