pub struct Codebook {
pub version: u32,
pub dimensionality: usize,
pub basis_vectors: Vec<BasisVector>,
pub semantic_markers: Vec<SparseVec>,
pub statistics: CodebookStatistics,
pub salt: Option<[u8; 32]>,
}
The codebook. It acts as the private key for reconstruction.
Fields§
version: u32
Version for compatibility.
dimensionality: usize
Dimensionality of the basis vectors.
basis_vectors: Vec<BasisVector>
The basis vectors forming the encoding dictionary. Data is projected onto these bases.
semantic_markers: Vec<SparseVec>
Semantic marker vectors for outlier detection.
statistics: CodebookStatistics
Statistics for adaptive encoding.
salt: Option<[u8; 32]>
Cryptographic salt for key derivation (optional).
Implementations§
impl Codebook
pub fn with_salt(dimensionality: usize, salt: [u8; 32]) -> Self
Create a codebook with cryptographic salt for key derivation
pub fn initialize_standard_basis(&mut self)
Initialize with common basis vectors for text/binary data
pub fn initialize_byte_basis(&mut self)
Initialize with byte-level basis vectors (256 basis vectors, one per byte value)
This creates a complete basis that can represent any byte data. Each byte value 0-255 gets its own basis vector.
Position basis vectors (64 vectors for positions 0-63) are also added
by default. Use initialize_byte_basis_with_config to control this.
pub fn initialize_byte_basis_with_config(
    &mut self,
    include_position_basis: bool,
)
Initialize with byte-level basis vectors with optional position basis
§Arguments
- include_position_basis - Whether to add position-aware basis vectors (64 vectors)
pub fn train(
    &mut self,
    training_data: &[&[u8]],
    config: &CodebookTrainingConfig,
) -> usize
Train the codebook on representative data
This learns basis vectors by analyzing patterns in the training data. The algorithm:
- Chunk the data into blocks
- Find frequently occurring patterns (n-grams)
- Create basis vectors for the most common patterns
- Optionally add byte-level basis as fallback
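The pattern-discovery steps above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the crate's training code: the function name `most_common_ngrams` and the parameters `n` and `max_patterns` are hypothetical, and the chunking step and the byte-level fallback are left out.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of `Codebook`): returns up to `max_patterns`
/// of the most frequent `n`-byte patterns across all samples, most frequent
/// first. Ties are broken by byte order so the result is deterministic.
fn most_common_ngrams(
    samples: &[&[u8]],
    n: usize,
    max_patterns: usize,
) -> Vec<(Vec<u8>, usize)> {
    // `windows(0)` would panic, so treat a zero-length pattern as "no patterns".
    if n == 0 {
        return Vec::new();
    }

    // Count every overlapping n-byte window in every sample.
    let mut counts: HashMap<&[u8], usize> = HashMap::new();
    for sample in samples {
        // `windows` yields nothing when the sample is shorter than `n`.
        for gram in sample.windows(n) {
            *counts.entry(gram).or_insert(0) += 1;
        }
    }

    // Rank by descending count, then by pattern bytes.
    let mut ranked: Vec<(Vec<u8>, usize)> = counts
        .into_iter()
        .map(|(gram, count)| (gram.to_vec(), count))
        .collect();
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(max_patterns);
    ranked
}
```

Under these assumptions, each returned pattern would become one learned basis vector, with the count available as a frequency statistic.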
§Arguments
- training_data - Slice of training samples
- config - Training configuration
§Returns
Number of basis vectors learned
§ID Allocation
Basis vector IDs are allocated in non-overlapping ranges:
- Byte basis: 0-255
- Position basis: 256-319
- Learned patterns: 1000+
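As an illustration of the ranges listed above, the sketch below maps an ID back to the range it falls in. The enum `BasisKind`, the function `classify_basis_id`, and the use of `u32` for IDs are assumptions made for this example; the page does not show the actual ID type. IDs 320-999 are not assigned to any range in the list, so the sketch reports them as unassigned.

```rust
/// Which documented ID range a basis vector ID falls into.
/// Hypothetical type, introduced only for this example.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum BasisKind {
    /// IDs 0-255: one per byte value.
    Byte(u8),
    /// IDs 256-319: one per position 0-63.
    Position(u8),
    /// IDs 1000 and above: learned patterns, carrying a zero-based index.
    Learned(u32),
    /// IDs 320-999: not covered by any documented range.
    Unassigned,
}

/// Classify an ID according to the non-overlapping ranges.
fn classify_basis_id(id: u32) -> BasisKind {
    match id {
        0..=255 => BasisKind::Byte(id as u8),
        256..=319 => BasisKind::Position((id - 256) as u8),
        320..=999 => BasisKind::Unassigned,
        _ => BasisKind::Learned(id - 1000),
    }
}
```

Because the ranges do not overlap, a single `match` is enough to tell the kinds apart without storing a separate tag per vector.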
pub fn train_from_files(
    &mut self,
    paths: &[&Path],
    config: &CodebookTrainingConfig,
) -> Result<usize>
Train codebook from files on disk
Convenience method that reads files and trains on their content.
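A convenience method of this shape could plausibly be built as sketched below. This is an assumption about the general approach, not the crate's code: `train_from_files_sketch` is a hypothetical free function, the training step is passed in as a closure standing in for `train`, and `std::io::Result` is used because the page does not say which `Result` alias the real method returns.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical sketch: read every file fully into memory, then hand the
/// borrowed byte slices to a training callback and return its result.
fn train_from_files_sketch<F>(paths: &[&Path], mut train: F) -> io::Result<usize>
where
    F: FnMut(&[&[u8]]) -> usize,
{
    // Own the file contents first so the borrowed slices below stay valid.
    let mut buffers: Vec<Vec<u8>> = Vec::with_capacity(paths.len());
    for path in paths {
        // Any read error aborts the whole call before training starts.
        buffers.push(fs::read(path)?);
    }

    // Build the `&[&[u8]]` view that a `train`-style function expects.
    let samples: Vec<&[u8]> = buffers.iter().map(|b| b.as_slice()).collect();
    Ok(train(samples.as_slice()))
}
```

The two-step structure (owned buffers, then borrowed slices) is what makes the `&[&[u8]]` argument type workable: the slices cannot outlive the buffers they point into.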
pub fn project(&self, data: &[u8]) -> ProjectionResult
Project data onto the codebook basis.
Returns coefficients, residual, and detected outliers.
pub fn project_with_config(
    &self,
    data: &[u8],
    config: &ProjectionConfig,
) -> ProjectionResult
Project data onto the codebook using custom configuration
pub fn reconstruct(
    &self,
    projection: &ProjectionResult,
    expected_size: usize,
) -> Vec<u8>
Reconstruct original data from projection result
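To make the project/reconstruct contract concrete, here is a toy round trip. Everything in it is a stand-in: `ToyProjection`, `toy_project`, and `toy_reconstruct` are hypothetical, the "basis" is a plain list of byte patterns, outlier detection is omitted, and the real `ProjectionResult` layout is not shown on this page. The sketch only illustrates the idea that coefficients plus a residual, together with `expected_size`, are enough to rebuild the input.

```rust
/// Stand-in for a projection result: matched patterns plus leftover bytes.
#[derive(Debug, Default, PartialEq)]
struct ToyProjection {
    /// (offset, index into the pattern list)
    coefficients: Vec<(usize, usize)>,
    /// (offset, literal byte) for data that no pattern covered
    residual: Vec<(usize, u8)>,
}

/// Greedy left-to-right matching; the first pattern that matches wins.
fn toy_project(patterns: &[&[u8]], data: &[u8]) -> ToyProjection {
    let mut out = ToyProjection::default();
    let mut pos = 0;
    while pos < data.len() {
        let hit = patterns
            .iter()
            .position(|p| !p.is_empty() && data[pos..].starts_with(p));
        match hit {
            Some(idx) => {
                out.coefficients.push((pos, idx));
                pos += patterns[idx].len();
            }
            None => {
                // No pattern applies here: keep the byte in the residual.
                out.residual.push((pos, data[pos]));
                pos += 1;
            }
        }
    }
    out
}

/// Rebuild the bytes; anything past `expected_size` is dropped.
fn toy_reconstruct(patterns: &[&[u8]], proj: &ToyProjection, expected_size: usize) -> Vec<u8> {
    let mut out = vec![0u8; expected_size];
    for &(offset, idx) in &proj.coefficients {
        for (i, &byte) in patterns[idx].iter().enumerate() {
            if let Some(slot) = out.get_mut(offset + i) {
                *slot = byte;
            }
        }
    }
    for &(offset, byte) in &proj.residual {
        if let Some(slot) = out.get_mut(offset) {
            *slot = byte;
        }
    }
    out
}
```

In this toy model, projecting `b"the thing"` onto the patterns `b"the "` and `b"ing"` leaves only the bytes `t` and `h` in the residual, and reconstructing with `expected_size` set to the input length returns the original bytes. Without the pattern list, the coefficients alone do not identify the data, which is the sense in which a codebook can act as a key.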
Trait Implementations§
impl<'de> Deserialize<'de> for Codebook
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where
    __D: Deserializer<'de>,
Auto Trait Implementations§
impl Freeze for Codebook
impl RefUnwindSafe for Codebook
impl Send for Codebook
impl Sync for Codebook
impl Unpin for Codebook
impl UnwindSafe for Codebook
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.