Struct DictionaryCompressor 

pub struct DictionaryCompressor;

Dictionary compression for categorical data

Replaces each value with an index into a dictionary of distinct values. Most effective for low-cardinality categorical columns, where a small set of distinct values repeats many times.

§Examples

let values = vec!["red", "blue", "red", "green", "blue", "red"];
let (compressed, dict) = DictionaryCompressor::compress_strings(&values)?;
// Dictionary: {"red": 0, "blue": 1, "green": 2}
// Indices encoded in `compressed`: [0, 1, 0, 2, 1, 0]

Implementations§

impl DictionaryCompressor

pub fn compress_strings(values: &[&str]) -> Result<(Vec<u8>, HashMap<String, u32>), BinaryFormatError>

Compresses string values using dictionary encoding.

Returns a tuple of (compressed indices as bytes, dictionary mapping each distinct value to its index).
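The encoding step can be illustrated with a minimal sketch. It is not the crate's implementation: `encode_sketch` is a hypothetical helper, it assumes indices are assigned in order of first appearance (as in the example above), and it returns the logical indices as `Vec<u32>` because the byte layout of the `Vec<u8>` returned by `compress_strings` is not documented here.

```rust
use std::collections::HashMap;

/// Hypothetical helper for illustration only; not part of the documented API.
/// Assigns each distinct string the next free index, in order of first
/// appearance, and records one index per input value.
fn encode_sketch(values: &[&str]) -> (Vec<u32>, HashMap<String, u32>) {
    let mut dictionary: HashMap<String, u32> = HashMap::new();
    let mut indices: Vec<u32> = Vec::with_capacity(values.len());

    for &value in values {
        // Copy the index out so the lookup borrow ends before any insert.
        let existing = dictionary.get(value).copied();
        let index = match existing {
            Some(index) => index,
            None => {
                // Assumption: the dictionary never exceeds u32::MAX entries.
                let next = dictionary.len() as u32;
                dictionary.insert(value.to_string(), next);
                next
            }
        };
        indices.push(index);
    }

    (indices, dictionary)
}
```

Each distinct string is stored once in the dictionary, so the saving grows with the number of repeats per distinct value.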

pub fn decompress_strings(bytes: &[u8], dictionary: &HashMap<String, u32>) -> Result<Vec<String>, BinaryFormatError>

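Decompresses dictionary-encoded bytes back into the original string values, using a dictionary of the form returned by compress_strings.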
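Because the dictionary maps value to index, decoding has to invert it first. The sketch below shows that idea under stated assumptions: `decode_sketch` is a hypothetical helper, it takes already-decoded `u32` indices rather than the undocumented byte format, and it reports an unknown index as `None` rather than a `BinaryFormatError`, whose variants are not shown on this page.

```rust
use std::collections::HashMap;

/// Hypothetical helper for illustration only; not part of the documented API.
/// Returns `None` if any index has no entry in the dictionary.
fn decode_sketch(indices: &[u32], dictionary: &HashMap<String, u32>) -> Option<Vec<String>> {
    // Invert the value -> index map into an index -> value lookup.
    let reverse: HashMap<u32, &str> = dictionary
        .iter()
        .map(|(value, &index)| (index, value.as_str()))
        .collect();

    // Collecting into Option<Vec<_>> stops at the first missing index.
    indices
        .iter()
        .map(|index| reverse.get(index).map(|&value| value.to_owned()))
        .collect()
}
```

Building the reverse map once keeps each lookup constant time on average; if the indices are known to be dense (0..n), a `Vec<&str>` indexed by position would serve the same purpose.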

Trait Implementations§

impl Debug for DictionaryCompressor

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.