pub trait CompactionStrategy {
    type S: Debug + Send;

    // Required methods
    fn init<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = Self::S> + Send + 'async_trait>>
       where Self: 'async_trait,
             'life0: 'async_trait;
    fn reduce(
        state: &mut Self::S,
        key: Key,
        record: ConsumerRecord,
    ) -> MaxKeysReached;
    fn collect(state: Self::S) -> CompactionMap;

    // Provided method
    fn key(r: &ConsumerRecord) -> Key { ... }
}

A compaction strategy’s role is to be fed consumer records for a single topic and ultimately to determine, for each record key, the earliest offset that may be retained. Once the consumer completes, logged removes the unwanted records from the commit log.
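As an illustration, here is a minimal sketch of a last-write-wins strategy in which only the most recent record per key needs to survive compaction. It makes several assumptions that should be checked against the crate’s actual definitions: that Key is u64-like, that MaxKeysReached is a plain bool, that CompactionMap is a map from keys to offsets, that ConsumerRecord exposes an offset field, and that implementations can use the async_trait attribute to match the boxed-future signature of init.

use std::collections::HashMap;

use async_trait::async_trait;

// Hypothetical strategy: for each key, only the latest record needs to be
// kept, so the earliest offset to retain for a key is the offset of the
// most recent record observed with that key.
struct LastWriteWins {
    max_keys: usize,
}

#[derive(Debug)]
struct LastWriteWinsState {
    max_keys: usize,
    min_offsets: HashMap<Key, u64>,
}

#[async_trait]
impl CompactionStrategy for LastWriteWins {
    type S = LastWriteWinsState;

    async fn init(&self) -> Self::S {
        LastWriteWinsState {
            max_keys: self.max_keys,
            min_offsets: HashMap::with_capacity(self.max_keys),
        }
    }

    fn reduce(state: &mut Self::S, key: Key, record: ConsumerRecord) -> MaxKeysReached {
        // Track a key only while there is capacity for it; keys beyond the
        // cap are left for a subsequent compaction pass.
        if state.min_offsets.len() < state.max_keys || state.min_offsets.contains_key(&key) {
            // A later record supersedes earlier ones with the same key, so
            // the earliest offset that must be retained moves forward.
            state.min_offsets.insert(key, record.offset);
        }
        // Report whether the distinct-key cap has been reached.
        state.min_offsets.len() == state.max_keys
    }

    fn collect(state: Self::S) -> CompactionMap {
        // Assuming CompactionMap is a map of keys to minimum offsets.
        state.min_offsets
    }
}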

Required Associated Types

type S: Debug + Send

The state to manage throughout a compaction run.

Required Methods

fn init<'life0, 'async_trait>( &'life0 self, ) -> Pin<Box<dyn Future<Output = Self::S> + Send + 'async_trait>>
where Self: 'async_trait, 'life0: 'async_trait,

Produce the initial state for the reducer function.

fn reduce( state: &mut Self::S, key: Key, record: ConsumerRecord, ) -> MaxKeysReached

The reducer function receives a mutable state reference, a key and a consumer record, and returns a MaxKeysReached flag indicating whether the function has reached its maximum number of distinct keys. If it has, compaction may occur again once the current consumption of records has finished. The goal is to avoid processing every topic/partition/key combination in one go: a subset of keys can be processed, then another subset, and so on. This strategy helps to manage memory.
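Conceptually, the compactor feeds each record of a pass through Self::reduce and notes whether the cap was hit. The following sketch is illustrative only, not logged’s actual internals; it reuses the hypothetical LastWriteWins strategy from above and again assumes MaxKeysReached is a plain bool.

// Illustrative only: feed one pass of records through the strategy.
async fn feed_records(
    strategy: &LastWriteWins,
    records: Vec<ConsumerRecord>,
) -> (LastWriteWinsState, bool) {
    let mut state = strategy.init().await;
    let mut more_passes_needed = false;
    for record in records {
        let key = LastWriteWins::key(&record);
        // A true result means the distinct-key cap has been hit: records
        // for untracked keys must wait for a subsequent pass.
        more_passes_needed |= LastWriteWins::reduce(&mut state, key, record);
    }
    (state, more_passes_needed)
}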

fn collect(state: Self::S) -> CompactionMap

The collect function is responsible for mapping the state into a map of keys and their minimum offsets. The compactor will filter out the first n keys in a subsequent run, where n is the number of keys found on the first run. This permits memory to be controlled given a large number of distinct keys. The cost of the re-run strategy is that additional compaction scans are required, resulting in more I/O.
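Continuing the illustrative sketch above, a pass ends by collecting the state into the compaction map that drives record removal:

// Illustrative only: run a full pass and produce the compaction map.
async fn run_pass(
    strategy: &LastWriteWins,
    records: Vec<ConsumerRecord>,
) -> (CompactionMap, bool) {
    let (state, more_passes_needed) = feed_records(strategy, records).await;
    // Each key maps to the minimum offset that must be retained; earlier
    // records for that key are candidates for removal.
    (LastWriteWins::collect(state), more_passes_needed)
}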

Provided Methods

fn key(r: &ConsumerRecord) -> Key

The key function computes the key that the compactor will use for subsequent processing. In simple scenarios, this can be the key field from the topic’s record itself.

BEWARE!!! It is good practice to encrypt the record’s data, and a key constructed from record data will expose that data in unencrypted form. If this is a problem, override this method to decrypt the record data each time the Self::reduce function is called and derive the key from the decrypted content. The advantage of simply using the record’s key directly, though, is speed, as a decryption stage is avoided.
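For example, an implementation might override key to derive the compaction key from decrypted record data instead of the record’s key field. In the sketch below, decrypt is a hypothetical stand-in for the application’s own decryption, r.value is assumed to hold the encrypted payload, and Key is again assumed to be u64-like:

use std::hash::{Hash, Hasher};

// Inside the CompactionStrategy implementation:
fn key(r: &ConsumerRecord) -> Key {
    // `decrypt` is hypothetical: substitute the application's own scheme.
    let plaintext: Vec<u8> = decrypt(&r.value);
    // Derive a stable integer key by hashing the decrypted payload rather
    // than storing sensitive data in the record's clear-text key field.
    let mut hasher = std::collections::hash_map::DefaultHasher::new();
    plaintext.hash(&mut hasher);
    hasher.finish()
}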

Object Safety

This trait is not object safe.

Implementors