Struct TransposedTable 

Source
pub struct TransposedTable { /* private fields */ }

A PQ table that stores the pivots for each chunk in a miniature block-transposed layout to facilitate faster compression. Unlike the layout of the BasicTable, the exact layout is not documented because it may be subject to change.

The advantage of this table over the BasicTable is a more hardware-friendly layout for the pivots, which makes compression (particularly batch compression) much faster than with the BasicTable, at the cost of a slightly higher memory footprint.

§Invariants (Dev Docs)

  • pivots.len() == schema.len(): The number of PQ chunks must agree.
  • pivots[i].dimension() == schema.at(i).len() for all in-bounds i.
  • pivots[i].total() == ncenters for all in-bounds i.
  • largest = max(schema.at(i).len() for i).

Implementations§

Source§

impl TransposedTable

Source

pub fn from_parts( pivots: MatrixView<'_, f32>, offsets: ChunkOffsets, ) -> Result<Self, TransposedTableError>

Construct a new TransposedTable from raw parts.

§Error

Returns an error if

  • pivots.ncols() != offsets.dim(): Pivots must have the dimensionality expected by the offsets.

  • pivots.nrows() == 0: The pivot table cannot be empty.

Source

pub fn ncenters(&self) -> usize

Return the number of pivots in each PQ chunk.

Source

pub fn nchunks(&self) -> usize

Return the number of PQ chunks.

Source

pub fn dim(&self) -> usize

Return the dimensionality of the full-precision vectors associated with this table.

Source

pub fn view_offsets(&self) -> ChunkOffsetsView<'_>

Return a view over the schema offsets.

Source

pub fn compress_batch<T, F, DelegateError>( &self, data: MatrixView<'_, T>, compression_delegate: F, ) -> Result<(), CompressError<DelegateError>>
where T: Copy + Into<f32>, F: FnMut(RowChunk, CompressionResult) -> Result<(), DelegateError>, DelegateError: Error,

Perform PQ compression on the dataset by mapping each chunk of each row in data to its nearest neighbor among the pivots of the corresponding PQ chunk.

The index of the nearest neighbor is provided to compression_delegate along with its corresponding row in data and chunk index.

Calls to compression_delegate may occur in any order.

compression_delegate will be invoked for all rows in 0..data.nrows() and all chunks in 0..schema.len().
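The delegate contract described above can be illustrated with a minimal, self-contained sketch. It does not use this crate's types: the flat row-major slices, the function name compress_batch_sketch, and the (row, chunk, index) delegate arguments are all hypothetical stand-ins for MatrixView, RowChunk and CompressionResult, whose fields are not documented here. The loop is deliberately chunk-major to show why callers should not rely on the order of delegate calls.

```rust
/// Hypothetical sketch of batch PQ compression with a delegate.
/// `pivots` is row-major `ncenters x dim`, `data` is row-major `nrows x dim`,
/// and `offsets` has `nchunks + 1` increasing entries ending at `dim`.
fn compress_batch_sketch<F, E>(
    pivots: &[f32],
    ncenters: usize,
    offsets: &[usize],
    data: &[f32],
    nrows: usize,
    mut delegate: F,
) -> Result<(), E>
where
    F: FnMut(usize, usize, usize) -> Result<(), E>,
{
    let dim = *offsets.last().expect("offsets must not be empty");
    assert_eq!(pivots.len(), ncenters * dim, "pivots must be ncenters x dim");
    assert_eq!(data.len(), nrows * dim, "data must be nrows x dim");

    // Chunk-major traversal: the delegate sees (row, chunk) pairs in an
    // order that differs from a simple row-by-row scan.
    for chunk in 0..offsets.len() - 1 {
        let (start, stop) = (offsets[chunk], offsets[chunk + 1]);
        for row in 0..nrows {
            let sub = &data[row * dim + start..row * dim + stop];
            let mut best = (f32::INFINITY, 0usize);
            for center in 0..ncenters {
                let pivot = &pivots[center * dim + start..center * dim + stop];
                let dist = sub
                    .iter()
                    .zip(pivot)
                    .map(|(a, b)| (a - b) * (a - b))
                    .sum::<f32>();
                if dist < best.0 {
                    best = (dist, center);
                }
            }
            // Hand the nearest pivot index to the caller; a delegate error
            // stops the compression early.
            delegate(row, chunk, best.1)?;
        }
    }
    Ok(())
}
```

A caller would typically use the delegate to write each index into its own code buffer at position (row, chunk), which works regardless of the call order.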

§Panics

Panics under the following conditions:

  • data.ncols() != self.dim(): The number of columns in the source dataset must match the number of dimensions expected by the schema.
Source

pub fn process_into<T>(&self, query: &[f32], partials: MutMatrixView<'_, f32>)
where T: ProcessInto,

Compute the operation defined by T for each chunk in the query on all corresponding pivots for that chunk, storing the result in the output matrix.

For example, this can be used to compute squared L2 distances between chunks of the query and the pivot table to create a fast run-time lookup table for these distances.
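As a rough illustration of the lookup-table idea, the sketch below builds a table of squared L2 partial distances and then uses it to estimate the distance from the query to a PQ-compressed vector. It is written against plain slices rather than this crate's types; squared_l2_table, table_distance, and the row-major pivot layout are assumptions made for the example and say nothing about the table's actual internal layout.

```rust
/// Hypothetical sketch: build an `nchunks x ncenters` row-major table where
/// entry (i, j) is the squared L2 distance between chunk i of the query and
/// pivot j of that chunk. `pivots` is row-major `ncenters x dim`.
fn squared_l2_table(
    pivots: &[f32],
    ncenters: usize,
    offsets: &[usize],
    query: &[f32],
) -> Vec<f32> {
    let dim = *offsets.last().expect("offsets must not be empty");
    assert_eq!(query.len(), dim, "query must have length dim");
    assert_eq!(pivots.len(), ncenters * dim, "pivots must be ncenters x dim");

    let nchunks = offsets.len() - 1;
    let mut partials = vec![0.0f32; nchunks * ncenters];
    for chunk in 0..nchunks {
        let (start, stop) = (offsets[chunk], offsets[chunk + 1]);
        for center in 0..ncenters {
            let pivot = &pivots[center * dim + start..center * dim + stop];
            partials[chunk * ncenters + center] = query[start..stop]
                .iter()
                .zip(pivot)
                .map(|(q, p)| (q - p) * (q - p))
                .sum::<f32>();
        }
    }
    partials
}

/// Approximate squared L2 distance between the query and a compressed vector:
/// one table lookup per chunk, summed over all chunks.
fn table_distance(partials: &[f32], ncenters: usize, codes: &[u8]) -> f32 {
    codes
        .iter()
        .enumerate()
        .map(|(chunk, &code)| partials[chunk * ncenters + code as usize])
        .sum()
}
```

The table is computed once per query, after which each compressed vector costs only nchunks lookups and additions instead of a full-dimensional distance computation.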

This is currently implemented for the following operation types T:

  • quantization::distances::SquaredL2
  • quantization::distances::InnerProduct
§Arguments
  • query: The query slice to process. Must have length self.dim().

  • partials: Output matrix for the partial results. The result of the computation of chunk i against pivot j will be stored into partials[(i, j)].

    Must have nrows = self.nchunks() and ncols = self.ncenters().

§Panics

Panics if:

  • query.len() != self.dim().
  • partials.nrows() != self.nchunks().
  • partials.ncols() != self.ncenters().

Trait Implementations§

Source§

impl CompressInto<&[f32], &mut [u8]> for TransposedTable

Source§

fn compress_into(&self, from: &[f32], to: &mut [u8]) -> Result<(), Self::Error>

Compress the full-precision vector from into the PQ byte buffer to.

Compression is performed by partitioning from into chunks according to the offsets schema in the table and then finding the closest pivot according to the L2 distance.

The final compressed value is the index of the closest pivot.
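The procedure described above can be sketched without any of this crate's types. In the example below, compress_vector, the String error type, and the row-major ncenters x dim pivot slice are all hypothetical; the real implementation uses a transposed layout and its own error type. The sketch only mirrors the documented behaviour: split the vector by the offsets, pick the nearest pivot per chunk by squared L2 distance, and store its index as a byte.

```rust
/// Hypothetical sketch of single-vector PQ compression.
/// `offsets` has `nchunks + 1` increasing entries ending at `dim`.
fn compress_vector(
    pivots: &[f32],
    ncenters: usize,
    offsets: &[usize],
    from: &[f32],
    to: &mut [u8],
) -> Result<(), String> {
    let dim = match offsets.last() {
        Some(&d) => d,
        None => return Err("offsets must not be empty".to_string()),
    };
    let nchunks = offsets.len() - 1;
    if ncenters == 0 || ncenters > 256 {
        return Err(format!("{ncenters} centers cannot be indexed by a u8"));
    }
    if pivots.len() != ncenters * dim {
        return Err("pivots must be ncenters x dim".to_string());
    }
    if from.len() != dim {
        return Err(format!("expected {dim} dimensions, got {}", from.len()));
    }
    if to.len() != nchunks {
        return Err(format!("expected {nchunks} code bytes, got {}", to.len()));
    }

    for (chunk, code) in to.iter_mut().enumerate() {
        let (start, stop) = (offsets[chunk], offsets[chunk + 1]);
        let mut best_dist = f32::INFINITY;
        let mut best_idx = 0usize;
        for center in 0..ncenters {
            let pivot = &pivots[center * dim + start..center * dim + stop];
            let dist = from[start..stop]
                .iter()
                .zip(pivot)
                .map(|(a, b)| (a - b) * (a - b))
                .sum::<f32>();
            if dist < best_dist {
                best_dist = dist;
                best_idx = center;
            }
        }
        // No center improved on infinity: every distance was infinite or NaN.
        if !best_dist.is_finite() {
            return Err(format!("chunk {chunk} is infinitely far from all centers"));
        }
        *code = best_idx as u8; // lossless because ncenters <= 256
    }
    Ok(())
}
```

All validation happens before any output byte is written, and the loop itself does not allocate on the success path.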

§Errors

Returns errors under the following conditions:

  • self.ncenters() > 256: If the number of centers exceeds 256, then it cannot be guaranteed that the index of the closest pivot for a chunk will fit losslessly in an 8-bit integer.

  • from.len() != self.dim(): The full precision vector must have the dimensionality expected by the compression.

  • to.len() != self.nchunks(): The PQ buffer must be sized appropriately.

  • Any chunk is so far from every center that its computed distance to all of them is infinite.

§Allocates

This function should not allocate when successful.

§Parallelism

This function is single-threaded.

Source§

type Error = TableCompressionError

Errors that may occur during compression.
Source§

type Output = ()

An output type resulting from compression.
Source§

impl<T> CompressInto<MatrixBase<&[T]>, MatrixBase<&mut [u8]>> for TransposedTable
where T: Copy + Into<f32>,

Source§

fn compress_into( &self, from: MatrixView<'_, T>, to: MutMatrixView<'_, u8>, ) -> Result<(), Self::Error>

Compress each full-precision row in from into the corresponding row in to.

Compression is performed by partitioning from into chunks according to the offsets schema in the table and then finding the closest pivot according to the L2 distance.

The final compressed value is the index of the closest pivot.

§Errors

Returns errors under the following conditions:

  • self.ncenters() > 256: If the number of centers exceeds 256, then it cannot be guaranteed that the index of the closest pivot for a chunk will fit losslessly in an 8-bit integer.

  • from.ncols() != self.dim(): The full precision data must have the dimensionality expected by the compression.

  • to.ncols() != self.nchunks(): The PQ buffer must be sized appropriately.

  • from.nrows() != to.nrows(): The input and output buffers must have the same number of rows.

  • Any chunk is so far from every center that its computed distance to all of them is infinite.

§Allocates

Allocates scratch memory proportional to the length of the largest chunk.

§Parallelism

This function is single-threaded.

Source§

type Error = TableBatchCompressionError

Errors that may occur during compression.
Source§

type Output = ()

An output type resulting from compression.
Source§

impl Debug for TransposedTable

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> ByRef<T> for T

Source§

fn by_ref(&self) -> &T

Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> AsyncFriendly for T
where T: Send + Sync + 'static,