Expand description
FoldList is a list-like data structure, stored as a tree, where the following operations are all O(log(n)) in the worst case:
- Folding (in a specific pre-defined way)
- View Slicing
- Insertion
- Deletion
- Remove all
- Insert all
A FoldList<T,D,Settings> is parameterized by three types:
T, the type of the elements of the list.D, the type of folds of ranges of the elements. Can be called the “delta” type because it represents how much an element, or range of elements, changes a fold.Settings, a struct of closures which tells theFoldListhow to perform the folds. Will ideally be a ZST.
§Example
Suppose you have a Vec of Strings and you want to find the sum of their lengths. Normally you would write something like this:
strings_vec.iter().fold(0, |sum,str| sum + str.len())To make a new FoldList that does this same fold, you would write:
let fold_list = FoldList::new(|a,b| a+b, |str: &String| str.len(),|| 0);Thereafter, fold_list.fold() will return the fold of the contents of this list, including subsequent modifications. It will currently return 0, because the list is empty.
The type of fold_list is FoldList<String,usize,_>, where _ is some FoldSettings type that uses the three closures’ types.
§Fold Settings
The three closures that make up the Settings type for this list are used as follows:
OP, of typeimpl Fn(D,D)->D, tells the list how to combine two deltas. The “combiningOPeration”- In this example, the total length of two adjacent ranges of strings is just the sum of the two ranges’ total lengths, hence
|a,b| a+b - It is required that this operation be associative, i.e. that, for any
a,b,c:op(a,op(b,c)) = op(op(a,b),c). If this isn’t the case, then folds may be inconsistent. - The operation does NOT need to be commutative (though it can be, as in the example). The order of elements will be correctly kept track of in all folds.
- In this example, the total length of two adjacent ranges of strings is just the sum of the two ranges’ total lengths, hence
DeltaOf, of typeimpl Fn(&T)->D, tells the list how to get the delta of an element- In this example, the total length of a single string is just its length, hence
|str: &String| str.len()
- In this example, the total length of a single string is just its length, hence
Empty, of typeimpl Fn()->D, tells the list what the delta of an empty list should be- In this example, the total length of a list of zero strings is 0, hence
|| 0 - It’s a closure instead of just a value so that Settings can be a ZST more often.
- It is required that the empty delta be both a left-identity and a right-identity with respect to
OP: Let’s call its output0. Then, for anya, it must be the case thatop(0,a) = op(a,0) = a.- If this isn’t the case, then folds may be inconsistent.
- If you can’t find an element with this property for type
D, consider usingOption<D>instead ofD. Then your empty element can beNone. You can reworkOPfor this yourself, or usenew_with_opt.
- In this example, the total length of a list of zero strings is 0, hence
The three closures should behave consistently, and must all be Copy.
It’s possible for the closures to have nameable types; see Nameable Type.
The claim that all the operations on this data structure are O(log(n)) relies on the closures all being O(1).
If they’re instead some O(M) then the operations will be O(M*log(n)).
For example, if D is im::HashSet<T> and OP is set union, which is O(log(n)), then operations which depend on folds (or update them) will be O(log(n)2).
The real guarantee is that each closure will only be called O(log(n))-many times.
§Views
FoldLists are interacted with using views. A view into a FoldList, also called a slice, is anything that implements FoldListSlice.
A mutable slice is one that also implements MutFoldListSlice. Immutable and mutable views work similarly to rust’s &[T] and &mut [T] types.
Every &FoldList is an immutable slice, and every &mut FoldList is a mutable one, though views obtained via view operations will have more complicated types.
Functions which start with view_ or mut_view_ will always only affect your view into the list; never the actual underlying data.
For example, mut_view_drop_left won’t actually remove any elements from the list. It will only consume the current view and return a new one that includes fewer elements.
To actually remove a range of elements from the list, you would instead use take_all, Or for just one element, pop_left.
All functions that return new views will consume the previous ones.
If you want to use the previous view after being done with the new one, consider borrowing it first, using borrow or borrow_mut1. These are the only exceptions to that rule.
Changes to borrowed views will always be reflected in their originals, including insertions and deletions, even if the original or the borrowed view was empty.
If an element of the list is not included in a view, no combination of operations can make it included again. So a borrowed view can never outgrow its original. It will always be the same size, or smaller.
Only one mutable slice can be used at once per underlying list; FoldLists don’t have an analogue of slice::split_at_mut.
If you want to modify two different parts of a FoldList at once, you should instead take_all of one of them into a different FoldList, and then append_all_left (or _right) it back later.
These two operations are both O(log(n)).
You can convert a mutable slice into an immutable one with as_imm, or immutably borrow from one with borrow, but you can never get a mutable slice from an immutable one.
Immutable slices are Clone.
§Slicing
Slicing can be done based on amounts or on folds.
For amounts, it works like one would expect: view.mut_view_drop_left(3) will return a new view that includes all but the 3 leftmost elements of the previous one.
Slicing can also be done by performing a binary search on successive folds. For example:
let mut fold_list = FoldList::from_iter(
|a,b| a+b,
|str| str.len(),
|| 0,
["a","hi","wxyz","bye","syzygy"].into_iter(),
);
let list_view = &mut fold_list;
println!("{:?}",list_view); // ["a","hi","xyz","l","syzygy"]
let list_slice = list_view.mut_view_take_left_until(|length: &usize| *length > 5);
println!("{:?}",list_slice); // ["a","hi"]This looks at the folds that start on the left, and finds the longest one whose contents’ total length doesn’t exceed 5. This results in ["a","hi"] because its total length is 3, but ["a","hi","wxyz"]’s total length is 7. Then it takes this range, that is, restricts the view to be only this range (as opposed to dropping it, which would restrict the view to everything but the range).
Notice how the search criteria was a caller-defined closure. The caller can use any predicate they want, as long as it behaves well under binary search.
The following properties of a predicate p(d: &D) -> bool are all equivalent to it being well-behaved:
- Once a
Dsatisfiesp, there is no otherDthat you can combine it with usingopthat will make it go back to not satisfyingp. - If you mapped
pover the successive folds, the result would look something like[false,false,true,true,true]. That is, there is some way you can cut the list in half such that the first half is all false and the other half is all true. - There is some quantity that
D’s have that can never be decreased throughop, andpis checking whether that quantity meets a certain threshold.
If the predicate isn’t well behaved, then the result will be a cut at an arbitrary one of the many points at which the fold goes from false to true (if there aren’t multiple such points, then the predicate is well-behaved, if not in general then at least in this instance2).
If the predicate always returns false, the range in question will be the entire list. If it always returns true, the range will be the empty range on the side that the folds started from.
A slicing performed this way can either start from the left, or from the right, and it can either take all the elements in question, or it can drop them, which altogether is 4 different possibilities.
Here’s a table of how they all behave for the predicate |length: &usize| *length > 6 on the earlier example of ["a","hi","wxyz","l","syzygy"]:
| Name | Take/Drop | Start | Result |
|---|---|---|---|
mut_view_take_left_until | Take | Left | ["a","hi"] |
mut_view_take_right_until | Take | Right | ["syzygy"] |
mut_view_drop_left_until | Drop | Left | ["wxyz","l","syzygy"] |
mut_view_drop_right_until | Drop | Right | ["a","hi","wxyz","l"] |
You can also use range syntax to more ergonomically specify where to start from, with mut_view_take and mut_view_drop:
| Expression | Result |
|---|---|
fold_list.mut_view_take(..|l: &usize| *l > 6)) | ["a","hi"] |
fold_list.mut_view_take( (|l: &usize| *l > 6).. ) | ["syzygy"] |
fold_list.mut_view_drop(..|l: &usize| *l > 6)) | ["wxyz","l","syzygy"] |
fold_list.mut_view_drop( (|l: &usize| *l > 6).. ) | ["a","hi","wxyz","l"] |
This works with either RangeFrom<P> or RangeTo<P>, where P is your predicate.
Each pair of corresponding Left and Right slicing functions are exact mirror images of each other. That is, running one is equivalent to reversing, running the other, then reversing the result.
For example, for any predicate p, list.mut_view_drop_right_until(p) will always be the same as list.mut_view_reversed().mut_view_drop_left_until(p).mut_view_reversed().
§Insertion
Elements can be inserted at specific indices:
list.insert_at(3,"hello");
//inserts "hello" between the elements formerly at indicies 2 and 3, so that its new index is 3
//and all elements after it have their indices increased by 1list.append_left("left");
list.append_right("right");Inserting at the ends is somewhat faster, because under the hood, inserting in the middle is just taking a slice, and then inserting at the end of the slice.
§Deletion
Similarly, elements can be removed at specific indices:
let removed : T = list.remove_at(3);
//panics if the index was out of boundslet left_unless_list_was_empty : Option<T> = list.pop_left();let right_unless_list_was_empty : Option<T> = list.pop_right(); Deleting at the ends is also faster, for the same reason.
§Mutation / Indexing
Unlike Vec, FoldList doesn’t allow you to borrow direct mutable references to its elements, because if you could, then you could change its folds without it noticing3.
So instead of being able to do this:
let x = my_vec[10].fn_that_mutates();
//secretly makes a mutable reference
//you can't do this with a foldlistYou have to do this:
let x = my_foldlist.update_at(10, |t: &mut T| t.fn_that_mutates());But if you don’t need a mutable reference to the element, you can still do this:
let x = my_foldlist[10].fn_that_does_not_mutate();You can also set/replace elements:
let former_value = my_foldlist.set_at(3, "yeah"); //works like mem::replaceVersions of all of these exist for mutating just the leftmost and rightmost elements as well.
§Reverse
Views can be reversed, which works like you would expect:
let mut fold_list = FoldList::from_iter(
|a: String,b: String | a + &b,
|str| str.clone(),
|| String::new(),
["a","hi","wxyz","l","syzygy"].into_iter().map(|str|String::from(str)),
);
let list_view = &mut fold_list;
println!("{:?}",list_view); //["a", "hi", "wxyz", "l", "syzygy"]
let rev_view = list_view.mut_view_reversed();
println!("{:?}",rev_view); //["syzygy", "l", "wxyz", "hi", "a"]The resulting reversed view works in all the expected ways, with the exception of:
- Folding: reversing a view doesn’t affect its folds
- For example, the above view’s fold (via string concatenation, which isn’t commutative) is still
"ahiwxyzlsyzygy", and not"syzygylwxyzhia", as one might expect. - Another way of looking at it is that reversing a view also reverses its operation, so
op_reversed(a,b) = op(b,a).
- For example, the above view’s fold (via string concatenation, which isn’t commutative) is still
take_all: the resulting list will contain the elements in their original order, not the reversed order.- For example, in the above case,
rev_view.mut_view_take_left(3).take_all()will be["wxyz", "l", "syzygy"], and not["syzygy", "l", "wxyz"].
- For example, in the above case,
append_all_*: the list is appended in its original order- For example, if
list2is["one","two","three"], then after callingrev_view.append_all_left(list2),rev_viewwill be["three", "two", "one", "syzygy", "l", "wxyz", "hi", "a"]. - From the base
FoldList’s point of view, other lists are never appended in a backwards order, but from a reversed view’s perspective, they always are. - Note that the other list is still appended on the correct side, from the reversed view’s perspective.
- For example, if
Those last two sadly mean that FoldLists can’t be used to efficiently implement pancake sort.
§Simplification
A view can have a “simplification” applied to it, which changes the fold that the view performs, to a “simpler” version of the same fold.
For example, suppose that D, the type of folds, is (String,f64), where OP independently concatenates the Strings, and adds the f64s.
Suppose that you want to know just the sum of the f64s in a view:
fn my_f64_sum<'a,T: 'a>(view: impl FoldListSlice<'a,T,(String,f64)>) -> f64 {
todo!()
}One possible solution would be this:
fn my_f64_sum<'a,T: 'a>(view: impl FoldListSlice<'a,T,(String,f64)>) -> f64 {
let (_string_ignored, sum) = view.fold();
sum
}And this does give the correct answer, but it’s way slower than it needs to be, since it involves performing a bunch of string concatenations and possibly making new Strings for each one, and we just throw the result of that away at the end.
We want to only perform the part of the fold involving f64s, and not the part involving Strings.
The solution is to apply a simplification:
fn my_f64_sum<'a,T: 'a>(view: impl FoldListSlice<'a,T,(String,f64)>) -> f64 {
//Assume that the view's operation adds the f64s (This function's type doesn't require this. To do so would overcomplicate this example).
let view_with_simplification = view.view_simplify(
|(_string, float): &(String,f64)| *float,
|(a,b): (f64,f64)| a+b
);
//simplify the view to just the f64s
view_with_simplification.fold()
}This works, and is already much faster. But there’s still a way to make it slightly faster.
Currently, when the simplified view wants to know what the identity element is, it makes the unsimplified version of it, then simplifies that.
So it creates (String::new(),0.0 as f64), and then immediately throws the String part away.
A similar thing happens when it wants to know what the delta of some T is.
To avoid this, we need to provide a faster way to do these two things; a “shortcut”:
fn my_f64_sum<'a,T: 'a>(view: impl FoldListSlice<'a,T,(String,f64)>) -> f64 {
//Assume same as before.
let view_with_simplification = view.view_simplify_with_shortcut(
|(_string, float): &(String,f64)| *float,
|(a,b): (f64,f64)| a+b,
//shortcuts below
|()| 0.0, //fast identity element
|elem: &T| f64_of_element(elem) //fast delta of an element. In a real-world example this function would know what type T is.
);
view_with_simplification.fold()
}In general, the necessary closures for a simplification are as follows:
simplifier, of typeimpl Fn(&D)->D2. Converts aDto its simplified form.- In the above example, we take just the
f64, hence|(_string, float): &(String,f64)| *float
- In the above example, we take just the
simplified_op, of typeimpl Fn((D2,D2)) -> D2. Describes how the original view’s operation behaves under the simplifier.- In the above example, the original view added the
f64s, hence|(a,b): (f64,f64)| a+b. - Needs to act the same way as the original
op, i.e. it must be the case that, for alla,b:
simplifier(&op(a,b)) = simplified_op(simplifier(&a),simplifier(&b)). Otherwise, folds may be inconsistent. - Note that the input is a tuple instead of multiple arguments. This is because its type is actually
impl Fun<(D2,D2),D2>. All of these closures are, in order to allow a simplified slice to have a nameable type.
- In the above example, the original view added the
And if the simplification includes shortcuts:
empty_shortcut, of typeimpl Fn(())->D2. A faster way to make a new empty delta.- In the above example, the
f64part of the empty delta was0.0, hence|()| 0.0. - It must always have the same result as simplifying the previous empty delta. Otherwise, folds may be inconsistent.
- It has a single input of
(), for the same reason as before.
- In the above example, the
delta_shortcut, of typeimpl Fn(&T)->D2. A faster way to get the delta of an element.- In the above example, this would do whatever the previous version did, but without making a new string (the example doesn’t specify).
- It must always have the same result as simplifying the previous delta of the element. Otherwise, folds may be inconsistent.
To keep the simplified view mutable, you can use the mutable versions of the two functions; mut_view_simplify and mut_view_simplify_with_shortcut.
A simplification is actually its own type, similarly to Settings, which implements FoldSimplification.
A view’s currently-applied simplification can be queried with get_current_simplification,
removed/cleared with view_unsimplify,
or explicitly composed/added with view_with_simplification. Mutable versions of all of these also exist.
If a view hasn’t been simplified, its current simplification will be (), which is a FoldSimplification that does nothing.
After a simplification is applied to a view, it will implement FoldListSlice<T,D2>, where D2 is the new delta type.
§Nameable type
Suppose you want to use a FoldList as part of a struct, or return it from a function. What should its type be?
The type of a FoldList or FoldListSlice will always involve its FoldSettings, which will involve its closures,
and in rust, each closure has its own type that cannot be named.
One solution might be to define your own types for each of the closures, and then implement Fn for each of them, but this is currently unstable.
This crate has a workaround: Fun, a trait which works very similarly to Fn, with notable differences:
- The closure must have only one input (but you can use a tuple instead).
- The output type is part of the trait’s type parameters, which means that the same type can implement it it multiple ways, if each way has different output/input types (Whereas
Fncan theoretically only be implemented once per input type).
All of the closure types inside FoldSettingsStruct can also be Fun types4.
Here’s an example of how you would use Fun to make a nameable FoldList type, which can then be used in a struct:
#[derive(Clone,Copy)]
struct MyAddingClosure;
impl foldlist::misc::Fun<(usize,usize),usize> for MyAddingClosure {
fn apply(&self,a: (usize,usize)) -> usize {
a.0 + a.1
}
}
#[derive(Clone,Copy)]
struct MyStringLengthClosure;
impl foldlist::misc::Fun<&String,usize> for MyStringLengthClosure {
fn apply(&self,a: &String) -> usize {
a.len()
}
}
#[derive(Clone,Copy)]
struct MyZeroClosure;
impl foldlist::misc::Fun<(),usize> for MyZeroClosure {
fn apply(&self,_a: ()) -> usize {
0
}
}
pub struct MyStructThatUsesFoldList {
foldlist: FoldList<String,usize,FoldSettingsStruct<String,usize,MyAddingClosure,MyStringLengthClosure,MyZeroClosure>>,
is_cool: bool
}
impl MyStructThatUsesFoldList {
fn new() -> Self {
let fold_list_settings = FoldSettingsStruct {
op_closure: MyAddingClosure,
t2d_closure: MyStringLengthClosure,
empty_closure: MyZeroClosure,
_m: std::marker::PhantomData,
};
Self {
foldlist: FoldList::from_settings(fold_list_settings),
is_cool: true,
}
}
}§FoldChain
A FoldChain is like a FoldList, except it doesn’t keep track of sizes or indices.
A FoldChain can do anything that a FoldList can do, as long as it doesn’t need to know the size (element count) of any folds.
In fact, every FoldList is secretly just a FoldChain that happens to also keep track of size.
Every FoldList with delta type D is built on top of a FoldChain with delta type (usize,D), where the usize in a fold is the number of elements.
This underlying FoldChain can be accessed via as_sized_chain, for a slice, or the public underlying field of FoldList, for the base list.
Conversely, a FoldChain of the proper form can be converted back to a FoldList via as_fold_list.
If you never use a FoldList’s size information, consider using a FoldChain instead.
This can’t be done automatically with
core::ops::Deref, or withcore::borrow::Borrow, since both of them require the result to be a reference, whereas not allFoldListSlices are treated by rust as references (though they should be, conceptually). ↩With the expection of patterns like
[true,true,false,true,true], where there is only one such point, but the predicate is still not well-behaved. You can think of the empty fold as never satisfying any predicate, so there’s always secretly a false at the beginning, i.e.[false,true,true,false,true,true](It is not actually required that your predicate returnfalsefor the empty fold, though. This is true even if it doesn’t). So in this case the returned slice point might also be at the beginning. ↩You are still technically able to do this, by using a type that lets you mutate it via an immutable reference, such as
RefCellor an atomic type. But you shouldn’t, or else folds might be inaccurate. ↩This is also true of
SimplificationWithoutShortcutandSimplificationWithShortcut, so simplified slices can also have named types, though they will be much more complicated. ↩
Modules§
- fold_
chain - Like
fold_list, but without size information; see FoldChain - fold_
list FoldLists and slices thereof; See FoldList.- fold_
settings - The type which tells a
FoldChainorFoldListhow to do its fold; see Fold Settings. - fold_
simplification - The type which tells a
FoldChainorFoldListhow to simplify its fold; see Simplification. - misc
- Miscellaneous things used by
FoldChainandFoldList