pub struct QuantizedTensor {
    pub data: Vec<u8>,
    pub shape: Vec<usize>,
    pub params: QuantizationParams,
    pub device: Device,
}
Quantized tensor representation
Represents a tensor that has been quantized to a lower-precision format. The data is stored as raw bytes with associated quantization parameters that define how to interpret and convert the data back to floating-point.
Fields§
data: Vec<u8>
Quantized data stored as raw bytes
The data layout depends on the quantization type:
- For 8-bit and 16-bit types: one value per element
- For 4-bit types: two values packed per byte
- For binary: eight values packed per byte
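The 4-bit case can be illustrated with a small packing sketch. The helper names pack_nibbles and unpack_nibbles are hypothetical, and the low-nibble-first ordering is an assumption made for illustration; the crate's actual byte layout may differ.

```rust
/// Packs 4-bit values (each expected in 0..=15) two per byte.
/// Assumption: the first value of each pair is stored in the low nibble.
fn pack_nibbles(values: &[u8]) -> Vec<u8> {
    values
        .chunks(2)
        .map(|pair| {
            let lo = pair[0] & 0x0F;
            // An odd trailing value leaves the high nibble as zero.
            let hi = pair.get(1).copied().unwrap_or(0) & 0x0F;
            lo | (hi << 4)
        })
        .collect()
}

/// Recovers `count` 4-bit values from bytes produced by `pack_nibbles`.
fn unpack_nibbles(bytes: &[u8], count: usize) -> Vec<u8> {
    bytes
        .iter()
        .flat_map(|&b| [b & 0x0F, b >> 4])
        .take(count)
        .collect()
}
```

Under these assumptions, five 4-bit values occupy three bytes, and passing the original element count to unpack_nibbles discards the padding nibble.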
shape: Vec<usize>
Original tensor shape
Maintains the logical shape of the tensor for operations. The total number of elements is the product of all dimensions.
params: QuantizationParams
Quantization parameters
Contains all information needed to convert between quantized and floating-point representations, including scale factors, zero points, and metadata about the quantization scheme.
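As a rough illustration of how a scale factor and zero point are conventionally used, the sketch below applies standard affine quantization to unsigned 8-bit codes. AffineParams is a hypothetical stand-in and is not the crate's QuantizationParams, whose fields are not shown here.

```rust
/// Hypothetical stand-in for the scale and zero point carried by the parameters.
#[derive(Clone, Copy)]
struct AffineParams {
    scale: f32,
    zero_point: i32,
}

/// Maps a float to an 8-bit code: q = round(x / scale) + zero_point,
/// clamped to the representable range 0..=255.
fn quantize(x: f32, p: AffineParams) -> u8 {
    let q = (x / p.scale).round() as i32 + p.zero_point;
    q.clamp(0, 255) as u8
}

/// Maps a code back to a float: x is approximately (q - zero_point) * scale.
fn dequantize(q: u8, p: AffineParams) -> f32 {
    (q as i32 - p.zero_point) as f32 * p.scale
}
```

Values outside the representable range saturate at the clamp bounds, so the round trip is only approximate in general.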
device: Device
Device where tensor is stored
Indicates whether the tensor data resides in CPU memory, GPU memory, or other accelerator memory.
Implementations§
impl QuantizedTensor
pub fn new(shape: Vec<usize>, params: QuantizationParams, device: Device) -> Self
Create a new quantized tensor with zero-initialized data
Allocates memory for a quantized tensor with the specified shape and quantization parameters. The data is initialized to zeros.
§Arguments
- shape - Dimensions of the tensor
- params - Quantization parameters defining the format
- device - Target device for tensor storage
§Examples
use torsh_backend::quantization::{QuantizedTensor, QuantizationParams};
use torsh_backend::Device;
let shape = vec![2, 3, 4];
let params = QuantizationParams::int8_symmetric();
let device = Device::cpu().unwrap();
let tensor = QuantizedTensor::new(shape, params, device);
assert_eq!(tensor.num_elements(), 24);
pub fn from_data(data: Vec<u8>, shape: Vec<usize>, params: QuantizationParams, device: Device) -> BackendResult<Self>
Create a quantized tensor from existing data
Creates a quantized tensor using pre-existing quantized data. The data length must match the expected size for the given shape and quantization type.
§Arguments
- data - Pre-quantized data bytes
- shape - Dimensions of the tensor
- params - Quantization parameters
- device - Target device for tensor storage
§Returns
Returns Ok(QuantizedTensor) if the data size matches expectations,
or an error if the sizes are incompatible.
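The size check described above might look like the following sketch. The functions expected_len and check_data_len are hypothetical, the bit width is passed in directly rather than read from QuantizationParams, and rounding up to a whole byte is an assumption.

```rust
/// Bytes needed to store every element of `shape` at `bits` bits each,
/// rounding up so a partially filled final byte is still counted.
fn expected_len(shape: &[usize], bits: usize) -> usize {
    let num_elements: usize = shape.iter().product();
    (num_elements * bits + 7) / 8
}

/// Hypothetical check mirroring the documented contract: the data length
/// must equal the size implied by the shape and the bit width.
fn check_data_len(data: &[u8], shape: &[usize], bits: usize) -> Result<(), String> {
    let expected = expected_len(shape, bits);
    if data.len() == expected {
        Ok(())
    } else {
        Err(format!("expected {} bytes, got {}", expected, data.len()))
    }
}
```

For example, a shape of [8] at 4 bits per element expects 4 bytes, so a 3-byte buffer would be rejected.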
pub fn num_elements(&self) -> usize
Get the number of elements in the tensor
Returns the total number of logical elements in the tensor, which is the product of all dimensions in the shape.
§Examples
use torsh_backend::quantization::{QuantizedTensor, QuantizationParams};
use torsh_backend::Device;
let tensor = QuantizedTensor::new(vec![2, 3, 4], QuantizationParams::default(), Device::cpu().unwrap());
assert_eq!(tensor.num_elements(), 24);
pub fn memory_usage(&self) -> usize
Get the memory usage in bytes
Returns the actual number of bytes used to store the quantized data.
This may be less than num_elements() for sub-byte quantization types.
§Examples
use torsh_backend::quantization::{QuantizedTensor, QuantizationParams};
use torsh_backend::Device;
let params = QuantizationParams::int4_symmetric();
let tensor = QuantizedTensor::new(vec![8], params, Device::cpu().unwrap());
assert_eq!(tensor.memory_usage(), 4); // 8 elements, 2 per byte = 4 bytes
pub fn shape(&self) -> &[usize]
Get the shape of the tensor
Returns a reference to the shape vector. This is the logical shape of the tensor, not the storage layout.
pub fn ndim(&self) -> usize
Get the number of dimensions
Returns the number of dimensions (rank) of the tensor.
pub fn size(&self, dim: usize) -> BackendResult<usize>
Get the size of a specific dimension
Returns the size of the dimension at the given index, or an error if the index is out of bounds.
pub fn reshape(&self, new_shape: Vec<usize>) -> BackendResult<QuantizedTensor>
Reshape the tensor to a new shape
Returns a new tensor with the same data but a different shape. The total number of elements must remain the same.
§Arguments
new_shape- New shape for the tensor
§Returns
Returns Ok(QuantizedTensor) with the new shape, or an error
if the total number of elements doesn’t match.
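The element-count rule can be sketched as a standalone check. The function check_reshape is hypothetical and only models the documented condition; it is not the crate's implementation.

```rust
/// Returns Ok if both shapes describe the same number of logical elements.
fn check_reshape(old_shape: &[usize], new_shape: &[usize]) -> Result<(), String> {
    let old_count: usize = old_shape.iter().product();
    let new_count: usize = new_shape.iter().product();
    if old_count == new_count {
        Ok(())
    } else {
        Err(format!(
            "cannot reshape {} elements into {} elements",
            old_count, new_count
        ))
    }
}
```

Reshaping [2, 3, 4] to [6, 4] passes because both shapes hold 24 elements, while [5, 5] fails.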
pub fn view(&self, new_shape: Vec<usize>) -> BackendResult<QuantizedTensorView<'_>>
Create a view with a new shape (zero-copy reshape)
Similar to reshape, but returns a view that shares the same data. This is more memory-efficient but creates aliasing.
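To show what sharing the same data means in practice, here is a minimal sketch of a borrowed view. Tensor and TensorView are hypothetical stand-ins for QuantizedTensor and QuantizedTensorView, and the sketch assumes one byte per element for simplicity.

```rust
/// Hypothetical owning tensor (one byte per element for simplicity).
struct Tensor {
    data: Vec<u8>,
    shape: Vec<usize>,
}

/// Hypothetical view: borrows the bytes instead of copying them.
struct TensorView<'a> {
    data: &'a [u8],
    shape: Vec<usize>,
}

impl Tensor {
    /// Returns a view with a new shape over the same bytes, or an error
    /// if the element counts differ.
    fn view(&self, new_shape: Vec<usize>) -> Result<TensorView<'_>, String> {
        let old_count: usize = self.shape.iter().product();
        let new_count: usize = new_shape.iter().product();
        if old_count != new_count {
            return Err(format!("{} elements cannot be viewed as {}", old_count, new_count));
        }
        Ok(TensorView { data: &self.data, shape: new_shape })
    }
}
```

Because the view holds a shared borrow, the borrow checker prevents the tensor from being mutated while the view is alive.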
pub fn to_device(&self, device: Device) -> BackendResult<QuantizedTensor>
Move tensor to a different device
Creates a copy of the tensor on the specified device; the original tensor is left unchanged because the method takes &self. If the source and destination devices are the same, it returns a copy without a transfer. Otherwise it transfers the data and creates a new tensor.
pub fn data_slice(&self, start: usize, len: usize) -> BackendResult<&[u8]>
Get a slice of the raw data
Returns a reference to a portion of the underlying byte data. This is useful for low-level operations and custom kernels.
§Arguments
- start - Starting byte index
- len - Number of bytes to include
§Safety
The caller must ensure that the slice boundaries are valid and aligned with the quantization format.
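A bounds-checked byte slice of this kind could be written as in the sketch below. The free function data_slice here is a hypothetical model operating on a plain byte slice; the crate's error type and exact checks are not shown in this page.

```rust
/// Returns `len` bytes starting at `start`, or an error if the range
/// overflows or extends past the end of `data`.
fn data_slice(data: &[u8], start: usize, len: usize) -> Result<&[u8], String> {
    // checked_add guards against start + len wrapping around.
    let end = start
        .checked_add(len)
        .ok_or_else(|| "start + len overflows usize".to_string())?;
    // slice::get returns None instead of panicking on an invalid range.
    data.get(start..end)
        .ok_or_else(|| format!("range {}..{} is out of bounds for {} bytes", start, end, data.len()))
}
```

Note that a byte range can be in bounds yet still split a packed value, for instance an odd element offset in a 4-bit tensor, which is why alignment with the quantization format remains the caller's responsibility.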
pub fn data_slice_mut(&mut self, start: usize, len: usize) -> BackendResult<&mut [u8]>
Get a mutable slice of the raw data
Returns a mutable reference to a portion of the underlying byte data. This allows in-place modifications of the quantized data.
§Arguments
- start - Starting byte index
- len - Number of bytes to include
§Safety
The caller must ensure that any modifications maintain the integrity of the quantized representation.
pub fn storage_efficiency(&self) -> f32
Calculate storage efficiency compared to FP32
Returns the ratio of this tensor’s memory usage to what an equivalent FP32 tensor would require.
pub fn compression_ratio(&self) -> f32
Get compression ratio compared to FP32
Returns how many times smaller this tensor is compared to FP32.
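Read together with storage_efficiency, the two metrics appear to be reciprocals of each other. The sketch below assumes that interpretation, with FP32 taken as 4 bytes per element; the free functions are hypothetical and take the byte and element counts as plain arguments.

```rust
/// Size of one FP32 element in bytes.
const FP32_BYTES: usize = 4;

/// Quantized size divided by the equivalent FP32 size (smaller is better).
fn storage_efficiency(quantized_bytes: usize, num_elements: usize) -> f32 {
    quantized_bytes as f32 / (num_elements * FP32_BYTES) as f32
}

/// Equivalent FP32 size divided by the quantized size (larger is better).
fn compression_ratio(quantized_bytes: usize, num_elements: usize) -> f32 {
    (num_elements * FP32_BYTES) as f32 / quantized_bytes as f32
}
```

Under these assumptions, an 8-element 4-bit tensor stored in 4 bytes has an efficiency of 0.125 and a compression ratio of 8.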
pub fn validate(&self) -> BackendResult<()>
Validate tensor consistency
Checks that the tensor’s data size, shape, and parameters are all consistent with each other.
Trait Implementations§
impl Clone for QuantizedTensor
fn clone(&self) -> QuantizedTensor
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations§
impl Freeze for QuantizedTensor
impl RefUnwindSafe for QuantizedTensor
impl Send for QuantizedTensor
impl Sync for QuantizedTensor
impl Unpin for QuantizedTensor
impl UnsafeUnpin for QuantizedTensor
impl UnwindSafe for QuantizedTensor
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where T: Clone
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.