pub trait CompactionStrategy {
    type S: Debug + Send;

    // Required methods
    fn init<'life0, 'async_trait>(
        &'life0 self,
    ) -> Pin<Box<dyn Future<Output = Self::S> + Send + 'async_trait>>
    where
        Self: 'async_trait,
        'life0: 'async_trait;
    fn reduce(
        state: &mut Self::S,
        key: Key,
        record: ConsumerRecord,
    ) -> MaxKeysReached;
    fn collect(state: Self::S) -> CompactionMap;

    // Provided method
    fn key(r: &ConsumerRecord) -> Key { ... }
}
A compaction strategy’s role is to be fed the consumer records of a single topic and ultimately determine, for each record key, the earliest offset that may be retained. Once the consumer completes, logged then proceeds to remove the unwanted records from the commit log.
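As an illustration, here is a minimal sketch of a last-record-wins strategy. The types Key, MaxKeysReached, CompactionMap and ConsumerRecord are stand-ins assumed for the example (the crate’s real definitions will differ), and init is written in the expanded Pin<Box<...>> form to match the signature above:

    use std::collections::HashMap;
    use std::future::Future;
    use std::pin::Pin;

    // Stand-ins for illustration only; the crate defines the real
    // Key, ConsumerRecord, MaxKeysReached and CompactionMap types.
    type Key = u64;
    type MaxKeysReached = bool;
    type CompactionMap = HashMap<Key, u64>;

    struct ConsumerRecord {
        key: Key,
        offset: u64,
    }

    // Retains only the most recent record per key, considering at
    // most `max_keys` distinct keys per compaction pass.
    struct KeyBasedRetention {
        max_keys: usize,
    }

    #[derive(Debug)]
    struct KeyBasedState {
        offsets: HashMap<Key, u64>,
        max_keys: usize,
    }

    impl CompactionStrategy for KeyBasedRetention {
        type S = KeyBasedState;

        fn init<'life0, 'async_trait>(
            &'life0 self,
        ) -> Pin<Box<dyn Future<Output = Self::S> + Send + 'async_trait>>
        where
            Self: 'async_trait,
            'life0: 'async_trait,
        {
            let max_keys = self.max_keys;
            Box::pin(async move {
                KeyBasedState { offsets: HashMap::new(), max_keys }
            })
        }

        fn reduce(state: &mut Self::S, key: Key, record: ConsumerRecord) -> MaxKeysReached {
            let known = state.offsets.contains_key(&key);
            if known || state.offsets.len() < state.max_keys {
                // The latest record per key wins: anything at a lower
                // offset for the same key may be discarded.
                state.offsets.insert(key, record.offset);
                false
            } else {
                // The distinct-key budget for this pass is spent; the
                // remaining keys await a subsequent compaction run.
                true
            }
        }

        fn collect(state: Self::S) -> CompactionMap {
            // For last-record-wins retention, the minimum offset to
            // keep for a key is the offset of its most recent record.
            state.offsets
        }
    }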
Required Associated Types

type S: Debug + Send

The state produced by init, threaded mutably through reduce, and finally consumed by collect.

Required Methods
fn init<'life0, 'async_trait>(
    &'life0 self,
) -> Pin<Box<dyn Future<Output = Self::S> + Send + 'async_trait>>
where
    Self: 'async_trait,
    'life0: 'async_trait,
Produce the initial state for the reducer function.
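The 'life0/'async_trait lifetimes above are the shape produced by the async_trait macro. Assuming the trait is declared with #[async_trait] (which the expanded signature suggests but this page does not confirm), an implementation can write init as a plain async fn rather than spelling out the boxed future, reusing the types from the sketch above:

    use async_trait::async_trait;

    #[async_trait]
    impl CompactionStrategy for KeyBasedRetention {
        type S = KeyBasedState;

        async fn init(&self) -> Self::S {
            KeyBasedState {
                offsets: HashMap::new(),
                max_keys: self.max_keys,
            }
        }

        // reduce and collect as in the earlier sketch
    }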
fn reduce(
    state: &mut Self::S,
    key: Key,
    record: ConsumerRecord,
) -> MaxKeysReached
The reducer function receives a mutable state reference, a key and a consumer record, and returns a MaxKeysReached bool indicating whether the function has reached its maximum number of distinct keys. If it has, compaction may occur again once the current consumption of records has finished. The goal is to avoid processing every topic/partition/key combination in one go: one subset of keys can be processed, then another, and so on. This strategy helps to keep memory usage bounded.
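To make the returned flag concrete, here is how the sketch above behaves with a budget of two distinct keys (constructing the state directly to sidestep the async init, e.g. inside a test):

    let mut state = KeyBasedState { offsets: HashMap::new(), max_keys: 2 };

    assert!(!KeyBasedRetention::reduce(&mut state, 1, ConsumerRecord { key: 1, offset: 0 }));
    assert!(!KeyBasedRetention::reduce(&mut state, 2, ConsumerRecord { key: 2, offset: 1 }));
    // Re-seeing a known key is fine and refreshes its offset...
    assert!(!KeyBasedRetention::reduce(&mut state, 1, ConsumerRecord { key: 1, offset: 2 }));
    // ...but a third distinct key exceeds the budget, signalling that
    // another compaction pass is needed for the remaining keys.
    assert!(KeyBasedRetention::reduce(&mut state, 3, ConsumerRecord { key: 3, offset: 3 }));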
fn collect(state: Self::S) -> CompactionMap
The collect function is responsible for mapping the final state into a map of keys and their minimum retained offsets. In a subsequent run, the compactor will filter out the first n keys, where n is the number of keys found on the first run. This permits memory to be controlled given a large number of distinct keys. The cost of the re-run strategy is that additional compaction scans are required, resulting in more I/O.
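The map itself is applied by the compactor, not the strategy, but as an interpretation of the description above (this predicate is not part of the trait), a record survives compaction when its offset is at or beyond its key’s minimum retained offset:

    // Keys absent from the map were not considered in this pass and
    // so are left untouched.
    fn is_retained(map: &CompactionMap, key: Key, offset: u64) -> bool {
        match map.get(&key) {
            Some(&min_offset) => offset >= min_offset,
            None => true,
        }
    }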
Provided Methods
fn key(r: &ConsumerRecord) -> Key
The key function computes the key that the compactor will use in its subsequent processing. In simple scenarios, this can be the key field from the topic’s record itself.
BEWARE!!! It is good practice to encrypt a record’s data, and a key constructed from record data would expose what should remain encrypted. If this is a problem then override this method to decrypt the record data, noting that this happens each time the Self::reduce function is called. The advantage of using the record’s key directly, though, is speed, as a decryption stage is avoided.
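For instance, a sketch of such an override, assuming a hypothetical application-specific decrypt helper and a record that exposes its encrypted payload as a value field (the stand-in record in the earlier sketch omits it); the remaining trait items are elided:

    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    // Hypothetical helper: how records are decrypted is entirely
    // application specific.
    fn decrypt(encrypted_value: &[u8]) -> Vec<u8> {
        unimplemented!()
    }

    impl CompactionStrategy for EncryptedKeyRetention {
        // ... type S, init, reduce and collect as in the earlier sketch ...

        fn key(r: &ConsumerRecord) -> Key {
            // Derive the compaction key from the decrypted payload so
            // that record data is never exposed through the key itself.
            // Note that this runs for every record consumed.
            let plaintext = decrypt(&r.value);
            let mut hasher = DefaultHasher::new();
            plaintext.hash(&mut hasher);
            hasher.finish()
        }
    }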