Struct extsort::sorter::ExternalSorter
pub struct ExternalSorter { /* private fields */ }
Provides external (on-disk) sorting for iterators of arbitrary size, even when the items they yield do not fit in memory.
Items are collected in an in-memory buffer. When the buffer is full, it is sorted and flushed to disk as a segment file. Once all input has been consumed, the sorter returns a new iterator that yields every item in sorted order. To remain efficient for all implementations, the crate does not handle serialization and leaves it to the user.
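Because serialization is left to the user, each item type needs some way to be written to and read back from a segment. The following is a minimal sketch of that idea using only the standard library. The `SegmentCodec` trait and the `round_trip` helper are hypothetical names invented for illustration and are not the crate's API.

```rust
use std::io::{self, Cursor, Read, Write};

/// Hypothetical trait for this sketch: how an item is written to and read from a segment.
trait SegmentCodec: Sized {
    fn encode<W: Write>(&self, writer: &mut W) -> io::Result<()>;
    /// Returns `Ok(None)` once the input is exhausted.
    fn decode<R: Read>(reader: &mut R) -> io::Result<Option<Self>>;
}

impl SegmentCodec for u32 {
    fn encode<W: Write>(&self, writer: &mut W) -> io::Result<()> {
        // Fixed-width little-endian encoding: 4 bytes per item.
        writer.write_all(&self.to_le_bytes())
    }

    fn decode<R: Read>(reader: &mut R) -> io::Result<Option<Self>> {
        let mut bytes = [0u8; 4];
        match reader.read_exact(&mut bytes) {
            Ok(()) => Ok(Some(u32::from_le_bytes(bytes))),
            // End of input (a truncated trailing record is treated the same way here).
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(None),
            Err(e) => Err(e),
        }
    }
}

/// Encodes the items into a byte buffer, then decodes them back.
fn round_trip(items: &[u32]) -> io::Result<Vec<u32>> {
    let mut bytes = Vec::new();
    for item in items {
        item.encode(&mut bytes)?;
    }
    let mut reader = Cursor::new(bytes);
    let mut decoded = Vec::new();
    while let Some(item) = u32::decode(&mut reader)? {
        decoded.push(item);
    }
    Ok(decoded)
}
```

A fixed-width encoding keeps decoding trivial. Variable-length items would need a length prefix or a delimiter so the reader knows where each record ends.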
Implementations
impl ExternalSorter
pub fn new() -> ExternalSorter
pub fn with_segment_size(self, size: usize) -> Self
Sets the maximum size of each segment, expressed as a number of items.
This number of items needs to fit in memory. While sorting, an in-memory buffer collects the items to be sorted. Once the buffer reaches the maximum size, it is sorted and then written to disk.
A larger segment size makes sorting faster because more of the work is done by in-memory operations, at the cost of higher memory use.
Default is 10000.
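To make the role of the segment size concrete, here is a small sketch of the buffering step described above. It is not the crate's implementation: `sorted_segments` is a hypothetical helper, and the segments are kept in memory as `Vec`s where a real external sort would write each one to disk.

```rust
/// Splits `input` into sorted segments of at most `segment_size` items each.
/// Segments stay in memory in this sketch; an external sort would flush them to disk.
fn sorted_segments<T, I>(input: I, segment_size: usize) -> Vec<Vec<T>>
where
    T: Ord,
    I: IntoIterator<Item = T>,
{
    assert!(segment_size > 0, "segment size must be positive");
    let mut segments = Vec::new();
    let mut buffer = Vec::with_capacity(segment_size);
    for item in input {
        buffer.push(item);
        if buffer.len() == segment_size {
            buffer.sort();
            // Hand the full buffer over as a segment and start a fresh one.
            segments.push(std::mem::replace(
                &mut buffer,
                Vec::with_capacity(segment_size),
            ));
        }
    }
    // The last, partially filled buffer becomes the final segment.
    if !buffer.is_empty() {
        buffer.sort();
        segments.push(buffer);
    }
    segments
}
```

With seven items and a segment size of 3, this produces three segments. A larger segment size produces fewer, longer segments, which leaves less work for the later merge.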
pub fn with_sort_dir(self, path: PathBuf) -> Self
Sets the directory in which sorted segments will be written (if they don’t fit in memory).
Default is to use the system’s temporary directory.
pub fn with_parallel_sort(self) -> Self
Uses Rayon to sort the in-memory buffer.
This may not help if the buffer is too small for the gains from parallelism to outweigh the overhead of multithreading.
Default is false.
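The trade-off can be illustrated without Rayon. The sketch below is a stand-in that uses `std::thread::scope` from the standard library: it sorts the two halves of a buffer on separate threads and then merges them. The `sort_buffer` function and its `parallel` flag are hypothetical, and Rayon's own parallel sort works differently.

```rust
use std::thread;

/// Sorts `buffer`, optionally sorting its two halves on separate threads first.
fn sort_buffer(buffer: &mut Vec<u64>, parallel: bool) {
    if !parallel || buffer.len() < 2 {
        buffer.sort();
        return;
    }
    let mid = buffer.len() / 2;
    {
        let (left, right) = buffer.split_at_mut(mid);
        // Scoped threads may borrow the two disjoint halves of the buffer.
        thread::scope(|scope| {
            scope.spawn(|| left.sort());
            scope.spawn(|| right.sort());
        });
    }
    // Merge the two sorted halves into a new vector.
    let mut merged = Vec::with_capacity(buffer.len());
    let (mut i, mut j) = (0, mid);
    while i < mid && j < buffer.len() {
        if buffer[i] <= buffer[j] {
            merged.push(buffer[i]);
            i += 1;
        } else {
            merged.push(buffer[j]);
            j += 1;
        }
    }
    merged.extend_from_slice(&buffer[i..mid]);
    merged.extend_from_slice(&buffer[j..]);
    *buffer = merged;
}
```

For a handful of items, spawning threads and merging costs more than a plain `sort`, which is the overhead the note above warns about.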
pub fn with_heap_iter_segment_count(self, count: usize) -> Self
Sets the number of on-disk segments from which the iterator switches to using a binary heap to keep track of the smallest item from each segment.
For a small number of segments, it is faster to peek at every segment on each iteration than to maintain a binary heap.
Default is 20.
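The two merge strategies can be sketched as follows, using only the standard library. This is an illustration rather than the crate's code: the function names are hypothetical, the segments are in-memory `Vec`s, and treating the threshold as inclusive (`>=`) is an assumption.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merges sorted segments by scanning the head of every segment on each step.
fn merge_by_peeking(segments: &[Vec<u32>]) -> Vec<u32> {
    let mut cursors = vec![0usize; segments.len()];
    let mut output = Vec::new();
    loop {
        // Find the segment whose current head is the smallest.
        let mut best: Option<usize> = None;
        for (i, segment) in segments.iter().enumerate() {
            if cursors[i] >= segment.len() {
                continue; // this segment is exhausted
            }
            let is_smaller = match best {
                Some(b) => segment[cursors[i]] < segments[b][cursors[b]],
                None => true,
            };
            if is_smaller {
                best = Some(i);
            }
        }
        match best {
            Some(i) => {
                output.push(segments[i][cursors[i]]);
                cursors[i] += 1;
            }
            None => return output,
        }
    }
}

/// Merges sorted segments with a min-heap holding one head per segment.
fn merge_with_heap(segments: &[Vec<u32>]) -> Vec<u32> {
    let mut heap = BinaryHeap::new();
    for (i, segment) in segments.iter().enumerate() {
        if let Some(&head) = segment.first() {
            // `Reverse` turns the max-heap into a min-heap.
            heap.push(Reverse((head, i, 0usize)));
        }
    }
    let mut output = Vec::new();
    while let Some(Reverse((value, i, pos))) = heap.pop() {
        output.push(value);
        if let Some(&next) = segments[i].get(pos + 1) {
            heap.push(Reverse((next, i, pos + 1)));
        }
    }
    output
}

/// Picks a strategy from the segment count (inclusive threshold is an assumption).
fn merge(segments: &[Vec<u32>], heap_threshold: usize) -> Vec<u32> {
    if segments.len() >= heap_threshold {
        merge_with_heap(segments)
    } else {
        merge_by_peeking(segments)
    }
}
```

Peeking costs one comparison per segment for every item produced, while the heap costs a logarithmic number of comparisons plus bookkeeping. With few segments the linear scan tends to win, which is why a threshold is useful.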
pub fn sort<T, I>( self, iterator: I ) -> Result<SortedIterator<T, impl Fn(&T, &T) -> Ordering + Send + Sync + Clone>, Error>
Sorts a given iterator, returning a new iterator with the sorted items.
pub fn sort_by_key<T, I, F, K>( self, iterator: I, f: F ) -> Result<SortedIterator<T, impl Fn(&T, &T) -> Ordering + Send + Sync + Clone>, Error>
Sorts a given iterator with a key extraction function, returning a new iterator with the sorted items.
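Sorting by key can be thought of as sorting with a comparator derived from the key extraction function. The sketch below shows one way to build such a comparator with the standard library. `comparator_from_key` and `sort_words_by_length` are hypothetical helpers for illustration, not part of the crate.

```rust
use std::cmp::Ordering;

/// Builds a comparator that orders items by the key `f` extracts from them.
fn comparator_from_key<T, K, F>(f: F) -> impl Fn(&T, &T) -> Ordering
where
    K: Ord,
    F: Fn(&T) -> K,
{
    move |a: &T, b: &T| f(a).cmp(&f(b))
}

/// Usage example: sorts words by length using the derived comparator.
fn sort_words_by_length(mut words: Vec<&str>) -> Vec<&str> {
    let by_length = comparator_from_key(|w: &&str| w.len());
    // `sort_by` is stable, so words of equal length keep their input order.
    words.sort_by(|a, b| by_length(a, b));
    words
}
```

Note that the key function runs twice per comparison, so an expensive key extraction is paid for many times over the course of a sort.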
pub fn sort_by<T, I, F>( self, iterator: I, cmp: F ) -> Result<SortedIterator<T, F>, Error>
Sorts a given iterator with a comparator function, returning a new iterator with the sorted items.
pub fn pushed<T>( self ) -> PushExternalSorter<T, impl Fn(&T, &T) -> Ordering + Send + Sync + Clone>
Creates a pushed external sorter, which will consume items in a push pattern and compare them using the default comparator.
pub fn pushed_by<T, F>(self, cmp: F) -> PushExternalSorter<T, F>
Creates a pushed external sorter, which will consume items in a push pattern and compare them using the given comparator function.
pub fn pushed_by_key<T, F, K>( self, f: F ) -> PushExternalSorter<T, impl Fn(&T, &T) -> Ordering + Send + Sync + Clone>
Creates a pushed external sorter, which will consume items in a push pattern and compare them using the given key extraction function.
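In a push pattern, the caller feeds items to the sorter one at a time instead of handing over an iterator. The following sketch shows the general shape of such a sorter using only the standard library. `PushSorterSketch` and its methods are hypothetical and do not reflect the API of `PushExternalSorter`. Segments stay in memory, and the final merge is simplified to re-sorting the concatenated segments where a real implementation would stream a merge from disk.

```rust
use std::cmp::Ordering;

/// Hypothetical push-style sorter for this sketch; segments stay in memory.
struct PushSorterSketch<T, F> {
    segment_size: usize,
    buffer: Vec<T>,
    segments: Vec<Vec<T>>,
    cmp: F,
}

impl<T, F: Fn(&T, &T) -> Ordering> PushSorterSketch<T, F> {
    fn new(segment_size: usize, cmp: F) -> Self {
        assert!(segment_size > 0, "segment size must be positive");
        Self {
            segment_size,
            buffer: Vec::new(),
            segments: Vec::new(),
            cmp,
        }
    }

    /// Accepts one item; seals the buffer into a sorted segment once it is full.
    fn push(&mut self, item: T) {
        self.buffer.push(item);
        if self.buffer.len() == self.segment_size {
            self.seal();
        }
    }

    /// Sorts the current buffer and stores it as a segment.
    fn seal(&mut self) {
        let mut segment = std::mem::take(&mut self.buffer);
        segment.sort_by(&self.cmp);
        self.segments.push(segment);
    }

    /// Finishes the sort and returns all pushed items in order.
    fn finish(mut self) -> Vec<T> {
        if !self.buffer.is_empty() {
            self.seal();
        }
        // Simplification: concatenate the segments and sort again.
        let mut all: Vec<T> = self.segments.into_iter().flatten().collect();
        all.sort_by(&self.cmp);
        all
    }
}
```

A caller would create the sorter with a comparator, call `push` for each item as it becomes available, and call `finish` once no more items are expected.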