Trait hash_roll::Chunk

pub trait Chunk {
    type SearchState;
    fn to_search_state(&self) -> Self::SearchState;
    fn find_chunk_edge(
        &self,
        state: &mut Self::SearchState,
        data: &[u8]
    ) -> (Option<usize>, usize);
}

Implemented by algorithms that define methods of chunking data.

This is the lowest-level (but somewhat restrictive) trait for chunking algorithms. It assumes that the input is provided to it in a contiguous slice. If you don't have your input as a contiguous slice, ChunkIncr may be a better choice (it allows non-contiguous input, but may be slower for some chunking algorithms).

Associated Types

type SearchState

SearchState allows searching for the chunk edge to resume without duplicating work already done.
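As a minimal sketch of how a search state can save work, the block below defines a local stand-in with the same signature as the trait shown above (so it compiles without the crate) and a hypothetical toy chunker, `NewlineChunker`, that ends a chunk after every newline byte. Neither `NewlineChunker` nor `NewlineState` is part of hash_roll, and the choice to return a discard count of 0 when no edge is found is an assumption of this toy, not a rule of the trait. The `inspected` counter exists only to make the skipped work visible.

```rust
/// Local stand-in mirroring the signature of the trait documented here.
pub trait Chunk {
    type SearchState;
    fn to_search_state(&self) -> Self::SearchState;
    fn find_chunk_edge(
        &self,
        state: &mut Self::SearchState,
        data: &[u8],
    ) -> (Option<usize>, usize);
}

/// Hypothetical toy chunker: ends a chunk after every newline byte.
pub struct NewlineChunker;

/// Hypothetical search state for `NewlineChunker`.
#[derive(Debug, Default)]
pub struct NewlineState {
    /// Leading bytes of the pending data already scanned without finding an edge.
    scanned: usize,
    /// Total bytes inspected so far (bookkeeping for illustration only).
    pub inspected: usize,
}

impl Chunk for NewlineChunker {
    type SearchState = NewlineState;

    fn to_search_state(&self) -> NewlineState {
        NewlineState::default()
    }

    fn find_chunk_edge(&self, state: &mut NewlineState, data: &[u8]) -> (Option<usize>, usize) {
        // Resume after the prefix that earlier calls already examined.
        let start = state.scanned.min(data.len());
        match data[start..].iter().position(|&b| b == b'\n') {
            Some(i) => {
                let cut = start + i + 1; // one past the newline
                state.inspected += i + 1;
                state.scanned = 0; // the next chunk starts fresh
                (Some(cut), cut)
            }
            None => {
                state.inspected += data.len() - start;
                state.scanned = data.len();
                (None, 0) // this toy asks the caller to keep every byte
            }
        }
    }
}
```

Calling this toy first with `b"ab"` and then with the extended buffer `b"abcd\n"` inspects five bytes in total rather than seven, because the second call skips the two bytes recorded in the state.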


Required methods

fn to_search_state(&self) -> Self::SearchState

Provide an initial SearchState for use with find_chunk_edge(). Generally, one should generate a new SearchState for each input.
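The "one state per input" guidance can be sketched as a helper that creates its search state internally, so a state is never shared between inputs. The block is self-contained: it declares a local stand-in with the same signature as the trait above, and `FixedSize` and `split_input` are hypothetical names introduced for illustration, not items from hash_roll.

```rust
/// Local stand-in mirroring the signature of the trait documented here.
pub trait Chunk {
    type SearchState;
    fn to_search_state(&self) -> Self::SearchState;
    fn find_chunk_edge(
        &self,
        state: &mut Self::SearchState,
        data: &[u8],
    ) -> (Option<usize>, usize);
}

/// Hypothetical toy chunker: cuts every `self.0` bytes (at least 1).
pub struct FixedSize(pub usize);

impl Chunk for FixedSize {
    // This toy needs no memory between calls.
    type SearchState = ();

    fn to_search_state(&self) -> Self::SearchState {}

    fn find_chunk_edge(&self, _state: &mut Self::SearchState, data: &[u8]) -> (Option<usize>, usize) {
        let size = self.0.max(1);
        if data.len() >= size {
            (Some(size), size)
        } else {
            (None, 0)
        }
    }
}

/// Split one input into chunks, returning the chunks and the trailing bytes
/// that did not form a complete chunk.
pub fn split_input<'d, C: Chunk>(chunker: &C, input: &'d [u8]) -> (Vec<&'d [u8]>, &'d [u8]) {
    // A fresh state for every input.
    let mut state = chunker.to_search_state();
    let mut chunks = Vec::new();
    let mut offset = 0; // where the current sub-slice starts within `input`
    let mut prev_cut = 0; // end of the previous chunk within `input`
    loop {
        let (cut, discard_ct) = chunker.find_chunk_edge(&mut state, &input[offset..]);
        match cut {
            Some(end) => {
                // Map the cut point back into `input` coordinates.
                let g_cut = offset + end;
                chunks.push(&input[prev_cut..g_cut]);
                prev_cut = g_cut;
                offset += discard_ct;
            }
            None => return (chunks, &input[prev_cut..]),
        }
    }
}
```

With `FixedSize(2)`, splitting `b"hello"` yields the chunks `he` and `ll` with `o` left over. The same chunker value can then be reused for another input, since each call to `split_input` builds its own state.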

fn find_chunk_edge(
    &self,
    state: &mut Self::SearchState,
    data: &[u8]
) -> (Option<usize>, usize)

Find the next "chunk" in data to emit.

The return value is a pair: an optional cut point, which is the offset in data at which the emitted chunk ends (None if no chunk edge was found), and discard_ct, the offset from which subsequent data should be passed to the next call to find_chunk_edge.

state is mutated so that it does not re-examine previously examined data, even when a chunk is not emitted.

data may be extended with additional data between calls to find_chunk_edge(). The bytes that were previously in data and are not covered by discard_ct must be preserved in the data buffer passed to the next call.

use hash_roll::Chunk;

fn some_chunk() -> impl Chunk {
    hash_roll::mii::Mii::default()
}

let chunk = some_chunk();
let orig_data = b"hello";
let mut data = &orig_data[..];
let mut ss = chunk.to_search_state();
let mut prev_cut = 0;

loop {
   let (cut, discard_ct) = chunk.find_chunk_edge(&mut ss, data);

   match cut {
       Some(cut_point) => {
           // map `cut_point` from the current slice back into the original slice so we can
           // have consistent indexes
           let g_cut = cut_point + orig_data.len() - data.len();
           println!("chunk: {:?}", &orig_data[prev_cut..g_cut]);
           // the next chunk starts where this one ended
           prev_cut = g_cut;
       },
       None => {
           println!("no chunk, done with data we have");
           println!("remain: {:?}", &data[discard_ct..]);
           break;
       }
   }

   data = &data[discard_ct..];
}

Note: call additional times with the same SearchState and the required data to obtain subsequent chunks in the same input data. To handle a separate input, use a new SearchState.

Note: calling with a previous state and new data that isn't an extension of the previous data will result in split points that may not follow the design of the underlying algorithm. Avoid relying on consistent cut points to reason about memory safety.
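The buffer-extension contract described for find_chunk_edge (drop the bytes before discard_ct, keep the rest, append new bytes, and call again with the same state) can be sketched as follows. This is an illustration under stated assumptions, not code from hash_roll: the trait is a local stand-in with the same signature, and `CutAfter` and `cut_offsets` are hypothetical names. The helper reports cut points as offsets into the whole stream, so it does not depend on how a particular chunker chooses discard_ct.

```rust
/// Local stand-in mirroring the signature of the trait documented here.
pub trait Chunk {
    type SearchState;
    fn to_search_state(&self) -> Self::SearchState;
    fn find_chunk_edge(
        &self,
        state: &mut Self::SearchState,
        data: &[u8],
    ) -> (Option<usize>, usize);
}

/// Hypothetical toy chunker: ends a chunk after each occurrence of the byte `self.0`.
pub struct CutAfter(pub u8);

impl Chunk for CutAfter {
    /// Number of leading bytes of the buffer already scanned.
    type SearchState = usize;

    fn to_search_state(&self) -> usize {
        0
    }

    fn find_chunk_edge(&self, scanned: &mut usize, data: &[u8]) -> (Option<usize>, usize) {
        let start = (*scanned).min(data.len());
        match data[start..].iter().position(|&b| b == self.0) {
            Some(i) => {
                *scanned = 0;
                (Some(start + i + 1), start + i + 1)
            }
            None => {
                *scanned = data.len();
                (None, 0)
            }
        }
    }
}

/// Feed `pieces` one at a time, as if they arrived from a stream, and return
/// every cut point as an offset into the concatenated stream.
pub fn cut_offsets<C: Chunk>(chunker: &C, pieces: &[&[u8]]) -> Vec<usize> {
    let mut state = chunker.to_search_state();
    let mut buf: Vec<u8> = Vec::new();
    let mut base = 0; // stream offset of buf[0]
    let mut cuts = Vec::new();
    for piece in pieces {
        // Extend the buffer: preserved bytes stay in front, new bytes follow.
        buf.extend_from_slice(piece);
        loop {
            let (cut, discard_ct) = chunker.find_chunk_edge(&mut state, &buf);
            if let Some(end) = cut {
                cuts.push(base + end);
            }
            // Drop only what the chunker said is no longer needed.
            buf.drain(..discard_ct);
            base += discard_ct;
            if cut.is_none() {
                break; // wait for more data
            }
        }
    }
    cuts
}
```

Feeding the pieces `ab`, `c\nde`, and `\n` to `CutAfter(b'\n')` gives the same cut offsets as passing the whole input `abc\nde\n` in a single piece, which is the property the extension rule is meant to preserve.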


Implementors

impl Chunk for RollSum

type SearchState = RollSumSearchState

impl Chunk for GzipRsyncable

type SearchState = GzipRsyncableSearchState

impl Chunk for Mii

type SearchState = MiiSearchState

impl Chunk for PigzRsyncable

type SearchState = PigzRsyncableSearchState

impl Chunk for Ram

type SearchState = RamState

impl Chunk for Zpaq

type SearchState = ZpaqSearchState

impl Chunk for Zstd

type SearchState = ZstdSearchState

impl<'a> Chunk for FastCdc<'a>

type SearchState = FastCdcState

impl<'a> Chunk for Gear32<'a>

type SearchState = GearState32

impl<H: BuzHashHash + Clone> Chunk for BuzHash<H>

type SearchState = BuzHashSearchState
