Struct QuantizedTensor 

pub struct QuantizedTensor {
    pub data: Vec<u8>,
    pub shape: Vec<usize>,
    pub params: QuantizationParams,
    pub device: Device,
}

Quantized tensor representation

Represents a tensor that has been quantized to a lower-precision format. The data is stored as raw bytes with associated quantization parameters that define how to interpret and convert the data back to floating-point.

Fields§

§data: Vec<u8>

Quantized data stored as raw bytes

The data layout depends on the quantization type:

  • For 8-bit and 16-bit types: each element occupies whole bytes (one byte for 8-bit, two bytes for 16-bit)
  • For 4-bit types: two values packed per byte
  • For binary: eight values packed per byte
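
The packing rules above can be sketched as follows. This is a standalone illustration, not the crate's implementation: `packed_len` and `pack_nibbles` are hypothetical helpers, and the nibble order (first value in the low nibble) is an assumption that the library may not share.

```rust
/// Bytes needed to store `num_elements` values of `bits` bits each,
/// assuming values are packed tightly and the last byte is zero-padded.
fn packed_len(num_elements: usize, bits: usize) -> usize {
    (num_elements * bits + 7) / 8
}

/// Pack 4-bit values two per byte.
/// Assumption: the first value of each pair goes in the low nibble.
fn pack_nibbles(values: &[u8]) -> Vec<u8> {
    values
        .chunks(2)
        .map(|pair| {
            let lo = pair[0] & 0x0F;
            // An odd trailing value is paired with a zero nibble.
            let hi = pair.get(1).copied().unwrap_or(0) & 0x0F;
            lo | (hi << 4)
        })
        .collect()
}
```

Under these assumptions, 8 four-bit elements need `packed_len(8, 4) == 4` bytes, which matches the `memory_usage` example for an int4 tensor.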
§shape: Vec<usize>

Original tensor shape

Maintains the logical shape of the tensor for operations. The total number of elements is the product of all dimensions.
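
As a minimal sketch of the element count described here, the product can be computed directly from the shape. The free function `num_elements` is a hypothetical stand-in for the method of the same name; how the crate treats an empty shape is not stated on this page, so the scalar case below only reflects how `Iterator::product` behaves.

```rust
/// Number of logical elements: the product of all dimensions.
/// Note: an empty shape yields 1 here, because the product of an
/// empty iterator is 1. The crate may treat that case differently.
fn num_elements(shape: &[usize]) -> usize {
    shape.iter().product()
}
```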

§params: QuantizationParams

Quantization parameters

Contains all information needed to convert between quantized and floating-point representations, including scale factors, zero points, and metadata about the quantization scheme.
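
To illustrate how a scale factor and zero point relate the two representations, here is a sketch of the common affine scheme for unsigned 8-bit values. It assumes `q = round(x / scale) + zero_point`; the page does not confirm that `QuantizationParams` uses exactly this formula, and `AffineParams`, `quantize`, and `dequantize` are hypothetical names.

```rust
/// Hypothetical stand-in for the scale and zero point held by the params.
struct AffineParams {
    scale: f32,
    zero_point: i32,
}

/// Quantize: q = clamp(round(x / scale) + zero_point, 0, 255).
fn quantize(x: f32, p: &AffineParams) -> u8 {
    let q = (x / p.scale).round() as i32 + p.zero_point;
    q.clamp(0, 255) as u8
}

/// Dequantize: x is approximately scale * (q - zero_point).
fn dequantize(q: u8, p: &AffineParams) -> f32 {
    p.scale * (q as i32 - p.zero_point) as f32
}
```

Values outside the representable range saturate at 0 or 255, so dequantizing them does not recover the original input.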

§device: Device

Device where tensor is stored

Indicates whether the tensor data resides in CPU memory, GPU memory, or other accelerator memory.

Implementations§

impl QuantizedTensor

pub fn new(shape: Vec<usize>, params: QuantizationParams, device: Device) -> Self

Create a new quantized tensor with zero-initialized data

Allocates memory for a quantized tensor with the specified shape and quantization parameters. The data is initialized to zeros.

§Arguments
  • shape - Dimensions of the tensor
  • params - Quantization parameters defining the format
  • device - Target device for tensor storage
§Examples
use torsh_backend::quantization::{QuantizedTensor, QuantizationParams};
use torsh_backend::Device;

let shape = vec![2, 3, 4];
let params = QuantizationParams::int8_symmetric();
let device = Device::cpu().unwrap();
let tensor = QuantizedTensor::new(shape, params, device);
assert_eq!(tensor.num_elements(), 24);

pub fn from_data(data: Vec<u8>, shape: Vec<usize>, params: QuantizationParams, device: Device) -> BackendResult<Self>

Create a quantized tensor from existing data

Creates a quantized tensor using pre-existing quantized data. The data length must match the expected size for the given shape and quantization type.

§Arguments
  • data - Pre-quantized data bytes
  • shape - Dimensions of the tensor
  • params - Quantization parameters
  • device - Target device for tensor storage
§Returns

Returns Ok(QuantizedTensor) if the data size matches expectations, or an error if the sizes are incompatible.
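
The size check described above might look roughly like the sketch below. It is an assumption-laden illustration: `check_data_len` is a hypothetical helper, the bit width is passed in directly rather than read from `QuantizationParams`, and a `String` error stands in for the crate's `BackendResult` error type.

```rust
/// Verify that `data_len` bytes is exactly what `shape` requires at
/// `bits` bits per element, assuming tight packing with a padded last byte.
fn check_data_len(data_len: usize, shape: &[usize], bits: usize) -> Result<(), String> {
    let elements: usize = shape.iter().product();
    let expected = (elements * bits + 7) / 8;
    if data_len == expected {
        Ok(())
    } else {
        Err(format!(
            "expected {expected} bytes for shape {shape:?}, got {data_len}"
        ))
    }
}
```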

pub fn num_elements(&self) -> usize

Get the number of elements in the tensor

Returns the total number of logical elements in the tensor, which is the product of all dimensions in the shape.

§Examples
use torsh_backend::quantization::{QuantizedTensor, QuantizationParams};
use torsh_backend::Device;

let tensor = QuantizedTensor::new(vec![2, 3, 4], QuantizationParams::default(), Device::cpu().unwrap());
assert_eq!(tensor.num_elements(), 24);

pub fn memory_usage(&self) -> usize

Get the memory usage in bytes

Returns the actual number of bytes used to store the quantized data. For sub-byte quantization types this is less than num_elements(); for 16-bit types it is greater.

§Examples
use torsh_backend::quantization::{QuantizedTensor, QuantizationParams};
use torsh_backend::Device;

let params = QuantizationParams::int4_symmetric();
let tensor = QuantizedTensor::new(vec![8], params, Device::cpu().unwrap());
assert_eq!(tensor.memory_usage(), 4); // 8 elements, 2 per byte = 4 bytes

pub fn shape(&self) -> &[usize]

Get the shape of the tensor

Returns the shape as a slice. This is the logical shape of the tensor, not the storage layout.

pub fn ndim(&self) -> usize

Get the number of dimensions

Returns the number of dimensions (rank) of the tensor.

pub fn is_empty(&self) -> bool

Check if the tensor is empty (has zero elements)

pub fn size(&self, dim: usize) -> BackendResult<usize>

Get the size of a specific dimension

Returns the size of the dimension at the given index, or an error if the index is out of bounds.
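
A bounds-checked dimension lookup of this kind can be sketched with `slice::get`. The function `dim_size` is a hypothetical free-function version of the method, and the `String` error is a placeholder for the crate's own error type.

```rust
/// Size of dimension `dim`, or an error if `dim` is not a valid axis.
fn dim_size(shape: &[usize], dim: usize) -> Result<usize, String> {
    shape
        .get(dim)
        .copied()
        .ok_or_else(|| format!("dimension {dim} out of bounds for rank {}", shape.len()))
}
```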

pub fn reshape(&self, new_shape: Vec<usize>) -> BackendResult<QuantizedTensor>

Reshape the tensor to a new shape

Returns a new tensor that holds a copy of the same data with a different shape. The total number of elements must remain the same.

§Arguments
  • new_shape - New shape for the tensor
§Returns

Returns Ok(QuantizedTensor) with the new shape, or an error if the total number of elements doesn’t match.
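
The element-count rule can be sketched as below. `Packed` is a hypothetical, simplified stand-in for `QuantizedTensor` (it omits the params and device), and the error is a plain `String` rather than the crate's error type.

```rust
/// Hypothetical simplified tensor: raw bytes plus a logical shape.
#[derive(Clone, Debug, PartialEq)]
struct Packed {
    data: Vec<u8>,
    shape: Vec<usize>,
}

/// Return a copy of `t` with `new_shape`, provided the element counts agree.
fn reshape(t: &Packed, new_shape: Vec<usize>) -> Result<Packed, String> {
    let old: usize = t.shape.iter().product();
    let new: usize = new_shape.iter().product();
    if old != new {
        return Err(format!("cannot reshape {old} elements into {new}"));
    }
    // The bytes are copied; only the logical shape changes.
    Ok(Packed { data: t.data.clone(), shape: new_shape })
}
```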

pub fn view(&self, new_shape: Vec<usize>) -> BackendResult<QuantizedTensorView<'_>>

Create a view with a new shape (zero-copy reshape)

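Similar to reshape, but returns a view that borrows this tensor's data instead of copying it. This avoids an allocation, but the view aliases the tensor: the borrow checker will not allow the tensor to be mutated or dropped while the view is alive.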
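
A zero-copy view can be sketched with a struct that borrows the bytes and owns only the new shape. `Packed` and `PackedView` are hypothetical simplifications; the fields of the real `QuantizedTensorView` are not shown on this page.

```rust
/// Hypothetical simplified tensor: raw bytes plus a logical shape.
struct Packed {
    data: Vec<u8>,
    shape: Vec<usize>,
}

/// Hypothetical view: borrows the bytes, owns only the new shape.
struct PackedView<'a> {
    data: &'a [u8],
    shape: Vec<usize>,
}

/// Build a view over `t` with `new_shape` without copying any bytes.
fn view(t: &Packed, new_shape: Vec<usize>) -> Result<PackedView<'_>, String> {
    let old: usize = t.shape.iter().product();
    let new: usize = new_shape.iter().product();
    if old != new {
        return Err(format!("cannot view {old} elements as {new}"));
    }
    Ok(PackedView { data: &t.data, shape: new_shape })
}
```

Because the view holds a shared borrow, the compiler rejects any attempt to mutate the source while the view is still in use.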

pub fn to_device(&self, device: Device) -> BackendResult<QuantizedTensor>

Copy tensor to a different device

Creates a copy of the tensor on the specified device; the original tensor is left unchanged. If the source and destination devices are the same, the copy is made without a transfer. Otherwise, the data is transferred and a new tensor is created on the destination device.

pub fn data_slice(&self, start: usize, len: usize) -> BackendResult<&[u8]>

Get a slice of the raw data

Returns a reference to a portion of the underlying byte data. This is useful for low-level operations and custom kernels.

§Arguments
  • start - Starting byte index
  • len - Number of bytes to include
§Safety

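The caller must ensure that the slice boundaries are valid and aligned with the quantization format. For sub-byte formats a single byte holds several elements, so byte offsets do not map one-to-one to element indices.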
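
A bounds-checked byte slice of this shape can be sketched as follows. This is a hypothetical free function over a plain byte slice; whether the crate performs the same checks is an assumption, and the `String` error stands in for its error type.

```rust
/// Return `len` bytes starting at byte offset `start`, or an error if the
/// range overflows or runs past the end of `data`.
fn data_slice(data: &[u8], start: usize, len: usize) -> Result<&[u8], String> {
    let end = start
        .checked_add(len)
        .ok_or_else(|| "start + len overflows usize".to_string())?;
    data.get(start..end)
        .ok_or_else(|| format!("range {start}..{end} exceeds {} bytes", data.len()))
}
```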

pub fn data_slice_mut(&mut self, start: usize, len: usize) -> BackendResult<&mut [u8]>

Get a mutable slice of the raw data

Returns a mutable reference to a portion of the underlying byte data. This allows in-place modifications of the quantized data.

§Arguments
  • start - Starting byte index
  • len - Number of bytes to include
§Safety

The caller must ensure that any modifications maintain the integrity of the quantized representation.
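
As an illustration of an in-place edit that keeps a packed representation intact, the sketch below overwrites one 4-bit element without disturbing its neighbor in the same byte. `set_nibble` is a hypothetical helper, and the layout (even indices in the low nibble) is an assumption about the format.

```rust
/// Overwrite the 4-bit element at `index` in packed data.
/// Assumption: even indices occupy the low nibble of each byte.
fn set_nibble(data: &mut [u8], index: usize, value: u8) -> Result<(), String> {
    let byte = data
        .get_mut(index / 2)
        .ok_or_else(|| format!("element {index} out of range"))?;
    let v = value & 0x0F;
    if index % 2 == 0 {
        // Keep the high nibble, replace the low one.
        *byte = (*byte & 0xF0) | v;
    } else {
        // Keep the low nibble, replace the high one.
        *byte = (*byte & 0x0F) | (v << 4);
    }
    Ok(())
}
```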

pub fn storage_efficiency(&self) -> f32

Calculate storage efficiency compared to FP32

Returns the ratio of this tensor’s memory usage to what an equivalent FP32 tensor would require.

pub fn compression_ratio(&self) -> f32

Get compression ratio compared to FP32

Returns how many times smaller this tensor is compared to FP32, that is, the FP32 size divided by this tensor's size (the reciprocal of storage_efficiency()).
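
The two ratios described for storage_efficiency and compression_ratio can be sketched as below, assuming an FP32 baseline of 4 bytes per element. These are hypothetical free functions that take the byte and element counts directly; the crate's handling of empty tensors is not documented here, and this sketch would return NaN for zero elements.

```rust
/// Quantized bytes divided by the bytes an FP32 tensor would need.
fn storage_efficiency(quantized_bytes: usize, num_elements: usize) -> f32 {
    quantized_bytes as f32 / (num_elements * 4) as f32
}

/// FP32 bytes divided by quantized bytes: how many times smaller.
fn compression_ratio(quantized_bytes: usize, num_elements: usize) -> f32 {
    (num_elements * 4) as f32 / quantized_bytes as f32
}
```

For an int8 tensor with 24 elements stored in 24 bytes, the efficiency is 0.25 and the compression ratio is 4.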

pub fn validate(&self) -> BackendResult<()>

Validate tensor consistency

Checks that the tensor’s data size, shape, and parameters are all consistent with each other.

Trait Implementations§

impl Clone for QuantizedTensor

fn clone(&self) -> QuantizedTensor

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for QuantizedTensor

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V