pub struct DictionaryEncoding { /* private fields */ }
Stores repeated strings efficiently by referencing them with integer codes.
Each unique string appears once in the dictionary. Values are stored as
little-endian u32 indices into that dictionary. The index buffer is a
reference-counted bytes::Bytes, so heap-owned and mmap-backed columns share
the same type (revised D7).
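As a rough illustration of the scheme described above, the following sketch builds a dictionary and a code list from plain string slices. `dictionary_encode` is a hypothetical helper written for this example, not part of this type's API, and it uses `Vec<String>` rather than the `Arc<[Arc<str>]>` the real constructors take.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of this crate): returns the unique strings
/// in first-seen order plus one u32 code per input value.
fn dictionary_encode(values: &[&str]) -> (Vec<String>, Vec<u32>) {
    let mut dictionary: Vec<String> = Vec::new();
    let mut index_of: HashMap<&str, u32> = HashMap::new();
    let mut codes = Vec::with_capacity(values.len());

    for &value in values {
        let code = match index_of.get(value).copied() {
            // Already seen: reuse the existing dictionary slot.
            Some(code) => code,
            // First occurrence: append to the dictionary and remember its slot.
            None => {
                let code = dictionary.len() as u32;
                dictionary.push(value.to_string());
                index_of.insert(value, code);
                code
            }
        };
        codes.push(code);
    }
    (dictionary, codes)
}
```

For the input `["red", "blue", "red"]` this yields a two-entry dictionary and the codes `[0, 1, 0]`: each repeated string costs one 4-byte code instead of another copy of the text.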
Implementations
impl DictionaryEncoding
pub fn new(dictionary: Arc<[Arc<str>]>, codes: Vec<u32>) -> Self
Creates a new dictionary encoding from a dictionary and codes (legacy
Vec<u32> input).
pub fn from_bytes_storage(
    dictionary: Arc<[Arc<str>]>,
    codes_bytes: Bytes,
    code_count: usize,
) -> Self
Constructs a dictionary encoding from pre-encoded bytes (Phase 3c entry point).
codes_bytes must be code_count * 4 bytes of LE u32 values.
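The byte layout required here can be produced with the standard library alone. The sketch below is an illustration under that assumption; `codes_to_le_bytes` is a hypothetical helper, not a function of this crate.

```rust
/// Hypothetical helper: serializes codes as consecutive little-endian u32
/// values, 4 bytes per code.
fn codes_to_le_bytes(codes: &[u32]) -> Vec<u8> {
    let mut out = Vec::with_capacity(codes.len() * 4);
    for code in codes {
        // to_le_bytes emits the least significant byte first.
        out.extend_from_slice(&code.to_le_bytes());
    }
    out
}
```

The resulting `Vec<u8>` can be turned into a `Bytes` with `Bytes::from(vec)` and passed as `codes_bytes`, with `code_count` set to the number of codes that were serialized.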
pub fn with_nulls(self, null_bitmap: Vec<u64>) -> Self
Adds a null bitmap to this encoding (legacy Vec<u64> input).
pub fn with_null_bytes(self, null_bitmap: Bytes) -> Self
Adds a pre-encoded null bitmap (Phase 3c entry point).
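This page does not specify the bit layout of the null bitmap. The sketch below shows one common convention purely as an assumption: row `i` maps to bit `i % 64` of word `i / 64`, and a set bit means the row is null. Both functions are hypothetical helpers; check the crate's own source for the actual polarity and bit order before relying on this.

```rust
/// ASSUMPTION (not confirmed by the docs): a set bit marks a null row, and
/// row `i` lives at bit `i % 64` of word `i / 64`.
fn build_null_words(nulls: &[bool]) -> Vec<u64> {
    // Round up so a partial final word is still allocated.
    let mut words = vec![0u64; (nulls.len() + 63) / 64];
    for (row, &null) in nulls.iter().enumerate() {
        if null {
            words[row / 64] |= 1u64 << (row % 64);
        }
    }
    words
}

/// Reads one row back under the same assumed layout. Rows past the end of
/// the bitmap are treated as non-null here.
fn is_null(null_words: &[u64], row: usize) -> bool {
    match null_words.get(row / 64) {
        Some(&word) => (word >> (row % 64)) & 1 == 1,
        None => false,
    }
}
```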
pub fn dictionary_size(&self) -> usize
Returns the number of unique strings in the dictionary.
pub fn codes_bytes(&self) -> Bytes
Returns the encoded codes as raw LE u32 bytes (always materialised).
Phase 3b: codes storage is bytes::Bytes. Use Self::code_at for
indexed access; this returns the raw byte storage for serializers
that write the storage out directly.
pub fn as_codes_slice(&self) -> Option<&[u32]>
Returns a direct &[u32] slice when the codes live in RAM.
pub fn as_null_words_slice(&self) -> Option<&[u64]>
Returns a direct &[u64] view of the null bitmap when it lives in
RAM. None when there is no null bitmap or the bitmap is mmap-backed.
pub fn code_count(&self) -> usize
Number of u32 codes stored.
pub fn code_at(&self, idx: usize) -> Option<u32>
Returns the code at idx, or None if out of range.
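A bounds-checked read of this kind over raw little-endian storage might look like the sketch below. `read_code` is a hypothetical free function over a plain byte slice, not the method's actual implementation.

```rust
/// Hypothetical stand-in: decodes the idx-th little-endian u32 from a raw
/// byte buffer, returning None when the 4-byte window is out of range.
fn read_code(codes_bytes: &[u8], idx: usize) -> Option<u32> {
    // checked arithmetic avoids overflow for very large indices.
    let start = idx.checked_mul(4)?;
    let end = start.checked_add(4)?;
    let chunk = codes_bytes.get(start..end)?;

    let mut buf = [0u8; 4];
    buf.copy_from_slice(chunk);
    Some(u32::from_le_bytes(buf))
}
```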
pub fn codes(&self) -> Vec<u32>
Returns the codes as a materialized Vec<u32> (allocates).
Prefer Self::code_at or Self::code_count for reads. This exists
for callers that need a contiguous slice and accept the allocation
(e.g., legacy serialization paths).
pub fn get(&self, index: usize) -> Option<&str>
Returns the string value at the given index.
Returns None if the value is null.
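Conceptually, a lookup checks the null flag, reads the code for the row, and then resolves that code in the dictionary. The sketch below models this with plain slices. `lookup` is a hypothetical function, the null bitmap is simplified to a `&[bool]`, and returning `None` for an out-of-range index or code is an assumption of this example (the text above only documents the null case).

```rust
/// Hypothetical model of the lookup: null check, then code, then dictionary.
fn lookup<'a>(
    dictionary: &'a [&'a str],
    codes: &[u32],
    nulls: &[bool],
    index: usize,
) -> Option<&'a str> {
    // A null row yields None regardless of the code stored for it.
    if nulls.get(index).copied().unwrap_or(false) {
        return None;
    }
    // ASSUMPTION: out-of-range rows and codes also yield None.
    let code = *codes.get(index)?;
    dictionary.get(code as usize).copied()
}
```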
pub fn iter(&self) -> impl Iterator<Item = Option<&str>>
Iterates over all values, yielding Option<&str>.
pub fn compression_ratio(&self) -> f64
Returns the compression ratio (original size / compressed size).
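How the two sizes are measured is not stated here. As a worked illustration only, the sketch below assumes the original size is the sum of the UTF-8 byte lengths of all values and the compressed size is the dictionary's string bytes plus 4 bytes per code. `estimated_ratio` is a hypothetical function, and the method's real accounting may differ (for example by including null bitmap or pointer overhead).

```rust
/// Hypothetical estimate under the assumptions stated above.
fn estimated_ratio(values: &[&str], dictionary: &[&str]) -> f64 {
    // Original: every value stored as its full text.
    let original: usize = values.iter().map(|v| v.len()).sum();
    // Compressed: each unique string once, plus one u32 code per value.
    let compressed: usize =
        dictionary.iter().map(|s| s.len()).sum::<usize>() + values.len() * 4;

    if compressed == 0 {
        // ASSUMPTION: report a neutral ratio for an empty column.
        return 1.0;
    }
    original as f64 / compressed as f64
}
```

With 1,000 copies of the 11-byte string "engineering", the original size is 11,000 bytes and the compressed size is 11 + 4,000 = 4,011 bytes, a ratio of roughly 2.74. Short strings with few repeats can push the ratio below 1, since each value still costs 4 bytes for its code.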
Trait Implementations
impl Clone for DictionaryEncoding
fn clone(&self) -> DictionaryEncoding
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
Auto Trait Implementations
impl !Freeze for DictionaryEncoding
impl RefUnwindSafe for DictionaryEncoding
impl Send for DictionaryEncoding
impl Sync for DictionaryEncoding
impl Unpin for DictionaryEncoding
impl UnsafeUnpin for DictionaryEncoding
impl UnwindSafe for DictionaryEncoding
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.