Struct differential_dataflow::collection::Collection

pub struct Collection<G: Scope, D, R: Semigroup = isize> {
    pub inner: Stream<G, (D, G::Timestamp, R)>,
}

A mutable collection of values of type D

The Collection type is the core abstraction in differential dataflow programs. As you write your differential dataflow computation, you write as if the collection is a static dataset to which you apply functional transformations, creating new collections. Once your computation is written, you are able to mutate the collection (by inserting and removing elements); differential dataflow will propagate changes through your functional computation and report the corresponding changes to the output collections.

Each collection has three generic parameters. The parameter G is for the scope in which the collection exists; as you write more complicated programs you may wish to introduce nested scopes (e.g. for iteration) and this parameter tracks the scope (for timely dataflow's benefit). The D parameter is the type of data in your collection, for example String, or (u32, Vec<Option<()>>). The R parameter represents the types of changes that the data undergo, and is most commonly (and defaults to) isize, representing changes to the occurrence count of each record.

Fields

inner: Stream<G, (D, G::Timestamp, R)>

The underlying timely dataflow stream.

This field is exposed to support direct timely dataflow manipulation when required, but it is not intended to be the idiomatic way to work with the collection.

Methods

impl<G: Scope, D: Data, R: Semigroup> Collection<G, D, R> where
    G::Timestamp: Data

pub fn new(stream: Stream<G, (D, G::Timestamp, R)>) -> Collection<G, D, R>

Creates a new Collection from a timely dataflow stream.

This method is rarely needed: the as_collection method on streams is the more idiomatic way to convert a timely stream into a collection. Alternatively, the input::Input trait provides a new_collection method that creates a new collection for you without exposing the underlying timely stream at all.

pub fn map<D2, L>(&self, logic: L) -> Collection<G, D2, R> where
    D2: Data,
    L: FnMut(D) -> D2 + 'static, 

Creates a new collection by applying the supplied function to each input element.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {
        scope.new_collection_from(1 .. 10).1
             .map(|x| x * 2)
             .filter(|x| x % 2 == 1)
             .assert_empty();
    });
}

pub fn map_in_place<L>(&self, logic: L) -> Collection<G, D, R> where
    L: FnMut(&mut D) + 'static, 

Creates a new collection by applying the supplied function to each input element.

Although the name suggests in-place mutation, this function does not change the source collection, but rather re-uses the underlying allocations in its implementation. The method is semantically equivalent to map, but can be more efficient.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {
        scope.new_collection_from(1 .. 10).1
             .map_in_place(|x| *x *= 2)
             .filter(|x| x % 2 == 1)
             .assert_empty();
    });
}

pub fn flat_map<I, L>(&self, logic: L) -> Collection<G, I::Item, R> where
    G::Timestamp: Clone,
    I: IntoIterator,
    I::Item: Data,
    L: FnMut(D) -> I + 'static, 

Creates a new collection by applying the supplied function to each input element and accumulating the results.

This method extracts an iterator from each input element, and extracts the full contents of the iterator. Be warned that if the iterators produce substantial amounts of data, they are currently fully drained before attempting to consolidate the results.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {
        scope.new_collection_from(1 .. 10).1
             .flat_map(|x| 0 .. x);
    });
}

pub fn filter<L>(&self, logic: L) -> Collection<G, D, R> where
    L: FnMut(&D) -> bool + 'static, 

Creates a new collection containing those input records satisfying the supplied predicate.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {
        scope.new_collection_from(1 .. 10).1
             .map(|x| x * 2)
             .filter(|x| x % 2 == 1)
             .assert_empty();
    });
}

pub fn concat(&self, other: &Collection<G, D, R>) -> Collection<G, D, R>

Creates a new collection accumulating the contents of the two collections.

Despite the name, differential dataflow collections are unordered. This method is so named because the implementation is the concatenation of the stream of updates, but it corresponds to the addition of the two collections.
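To make the "addition" reading concrete, here is a minimal standard-library sketch. It is not differential dataflow's implementation: the slices of (record, diff) pairs and the helper names concat and accumulate are hypothetical stand-ins for a collection observed at a single time. Appending the update lists and then accumulating gives per-record counts that are the sums of the counts in the two inputs, regardless of order.

```rust
use std::collections::BTreeMap;

/// Models `concat` (hypothetical stand-in): append one list of
/// (record, diff) updates to the other, without combining anything.
fn concat(a: &[(u32, isize)], b: &[(u32, isize)]) -> Vec<(u32, isize)> {
    let mut out = a.to_vec();
    out.extend_from_slice(b);
    out
}

/// Accumulates updates into per-record counts, dropping records
/// whose count sums to zero.
fn accumulate(updates: &[(u32, isize)]) -> BTreeMap<u32, isize> {
    let mut counts = BTreeMap::new();
    for &(record, diff) in updates {
        *counts.entry(record).or_insert(0) += diff;
    }
    counts.retain(|_, count| *count != 0);
    counts
}
```

For instance, concatenating [(1, 1), (2, 1)] with [(2, 1), (3, 2)] and accumulating gives record 2 a count of 2, the sum of its counts in the two inputs.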

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let data = scope.new_collection_from(1 .. 10).1;

        let odds = data.filter(|x| x % 2 == 1);
        let evens = data.filter(|x| x % 2 == 0);

        odds.concat(&evens)
            .assert_eq(&data);
    });
}

pub fn concatenate<I>(&self, sources: I) -> Collection<G, D, R> where
    I: IntoIterator<Item = Collection<G, D, R>>, 

Creates a new collection accumulating the contents of this collection and all of the supplied collections.

Despite the name, differential dataflow collections are unordered. This method is so named because the implementation is the concatenation of the streams of updates, but it corresponds to the addition of the collections.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let data = scope.new_collection_from(1 .. 10).1;

        let odds = data.filter(|x| x % 2 == 1);
        let evens = data.filter(|x| x % 2 == 0);

        odds.concatenate(Some(evens))
            .assert_eq(&data);
    });
}

pub fn explode<D2, R2, I, L>(
    &self,
    logic: L
) -> Collection<G, D2, <R2 as Mul<R>>::Output> where
    D2: Data,
    R2: Semigroup + Mul<R>,
    <R2 as Mul<R>>::Output: Data + Semigroup,
    I: IntoIterator<Item = (D2, R2)>,
    L: FnMut(D) -> I + 'static, 

Replaces each record with another, with a new difference type.

This method is most commonly used to take records containing aggregatable data (e.g. numbers to be summed) and move the data into the difference component. This will allow differential dataflow to update in-place.
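The idea of moving data into the difference component can be sketched with the standard library alone. This is an illustrative model, not the library's code: the (key, amount) record shape and the helper names explode and accumulate are assumptions made for the example. Each record's amount is multiplied by its multiplicity and becomes the difference of the bare key, so accumulating the differences yields a per-key sum.

```rust
use std::collections::BTreeMap;

/// Models `explode` for one common case (hypothetical stand-in): a
/// (key, amount) record with multiplicity `diff` becomes the record
/// `key` with difference `amount * diff`.
fn explode(updates: &[((&'static str, isize), isize)]) -> Vec<(&'static str, isize)> {
    updates
        .iter()
        .map(|&((key, amount), diff)| (key, amount * diff))
        .collect()
}

/// Accumulates differences per key; after `explode`, each total is
/// the sum of the amounts currently present for that key.
fn accumulate(updates: &[(&'static str, isize)]) -> BTreeMap<&'static str, isize> {
    let mut totals = BTreeMap::new();
    for &(key, diff) in updates {
        *totals.entry(key).or_insert(0) += diff;
    }
    totals.retain(|_, total| *total != 0);
    totals
}
```

In this model, retracting a record such as (("a", 3), -1) subtracts 3 from the total for "a", so the sum is maintained by adjusting one number rather than by re-reading every record for the key.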

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let nums = scope.new_collection_from(0 .. 10).1;
        let x1 = nums.flat_map(|x| 0 .. x);
        let x2 = nums.map(|x| (x, 9 - x))
                     .explode(|(x,y)| Some((x,y)));

        x1.assert_eq(&x2);
    });
}

pub fn enter<'a, T>(
    &self,
    child: &Child<'a, G, T>
) -> Collection<Child<'a, G, T>, D, R> where
    T: Refines<<G as ScopeParent>::Timestamp>, 

Brings a Collection into a nested scope.

Examples

extern crate timely;
extern crate differential_dataflow;

use timely::dataflow::Scope;
use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let data = scope.new_collection_from(1 .. 10).1;

        let result = scope.region(|child| {
            data.enter(child)
                .leave()
        });

        data.assert_eq(&result);
    });
}

pub fn enter_at<'a, T, F>(
    &self,
    child: &Iterative<'a, G, T>,
    initial: F
) -> Collection<Iterative<'a, G, T>, D, R> where
    T: Timestamp + Hash,
    F: FnMut(&D) -> T + Clone + 'static,
    G::Timestamp: Hash

Brings a Collection into a nested scope, at varying times.

The initial function indicates the time at which each element of the Collection should appear.

Examples

extern crate timely;
extern crate differential_dataflow;

use timely::dataflow::Scope;
use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let data = scope.new_collection_from(1 .. 10).1;

        let result = scope.iterative::<u64,_,_>(|child| {
            data.enter_at(child, |x| *x)
                .leave()
        });

        data.assert_eq(&result);
    });
}

pub fn delay<F>(&self, func: F) -> Collection<G, D, R> where
    F: FnMut(&G::Timestamp) -> G::Timestamp + Clone + 'static, 

Delays each difference by a supplied function.

It is assumed that func only advances timestamps; this is not verified, and things may go horribly wrong if that assumption is incorrect. It is also critical that func be monotonic: if two times are ordered, they should have the same order once func is applied to them (this is because we advance the timely capability with the same logic, and it must remain less_equal to all of the data timestamps).
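The two requirements on func can be illustrated with a small standard-library sketch over u64 timestamps. The helpers round_up, bad_delay, and is_valid_delay are hypothetical names introduced here, and the check only samples a finite range of a totally ordered time, so it is an illustration rather than a proof. Rounding a time up to the next multiple of a period both advances and preserves order, whereas a function that advances only even times reorders 2 and 3.

```rust
/// Rounds a timestamp up to the next multiple of `period`
/// (hypothetical helper; `period` must be non-zero).
fn round_up(time: u64, period: u64) -> u64 {
    ((time + period - 1) / period) * period
}

/// Advances every time, but is NOT monotonic: 2 maps to 12 while
/// 3 stays at 3, so the order of the two times is reversed.
fn bad_delay(time: u64) -> u64 {
    if time % 2 == 0 { time + 10 } else { time }
}

/// Checks both requirements over the sample times `0 .. limit`:
/// the function never moves a time backwards, and it preserves order.
fn is_valid_delay<F: Fn(u64) -> u64>(func: F, limit: u64) -> bool {
    let advances = (0..limit).all(|t| func(t) >= t);
    // For a totally ordered time, comparing neighbours is sufficient.
    let monotonic = (1..limit).all(|t| func(t - 1) <= func(t));
    advances && monotonic
}
```

Under these assumptions, a closure along the lines of |t| round_up(*t, 10) would satisfy the stated contract for a u64 timestamp, while bad_delay would not.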

pub fn inspect<F>(&self, func: F) -> Collection<G, D, R> where
    F: FnMut(&(D, G::Timestamp, R)) + 'static, 

Applies a supplied function to each update.

This method is most commonly used to report information back to the user, often for debugging purposes. Any function can be used here, but be warned that the incremental nature of differential dataflow does not guarantee that it will be called as many times as you might expect.

The (data, time, diff) triples indicate a change diff to the frequency of data which takes effect at the logical time time. When times are totally ordered (for example, usize), these updates reflect the changes along the sequence of collections. For partially ordered times, the mathematics are more interesting and less intuitive, unfortunately.
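For totally ordered times, the relationship between update triples and the sequence of collections can be sketched as follows. This is a standard-library model and not the library's internals; the function name contents_at and the u64 time type are assumptions for illustration. The contents of the collection at a given time are obtained by summing the diffs of all updates whose time is less than or equal to it.

```rust
use std::collections::BTreeMap;

/// Accumulates every update with `time <= as_of` (hypothetical helper),
/// giving the frequency of each record in the collection at that time.
fn contents_at(
    updates: &[(&'static str, u64, isize)],
    as_of: u64,
) -> BTreeMap<&'static str, isize> {
    let mut counts = BTreeMap::new();
    for &(data, time, diff) in updates {
        if time <= as_of {
            *counts.entry(data).or_insert(0) += diff;
        }
    }
    // Records whose frequency has returned to zero are absent.
    counts.retain(|_, count| *count != 0);
    counts
}
```

With the updates ("apple", 0, 1), ("pear", 0, 2), ("pear", 1, -1), and ("apple", 2, -1), the collection holds one apple and two pears at time 0, one of each at time 1, and a single pear at time 2.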

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {
        scope.new_collection_from(1 .. 10).1
             .map_in_place(|x| *x *= 2)
             .filter(|x| x % 2 == 1)
             .inspect(|x| println!("error: {:?}", x));
    });
}

pub fn inspect_batch<F>(&self, func: F) -> Collection<G, D, R> where
    F: FnMut(&G::Timestamp, &[(D, G::Timestamp, R)]) + 'static, 

Applies a supplied function to each batch of updates.

This method is analogous to inspect, but operates on batches and reveals the timestamp of the timely dataflow capability associated with the batch of updates. The observed batching depends on how the system executes, and may vary run to run.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {
        scope.new_collection_from(1 .. 10).1
             .map_in_place(|x| *x *= 2)
             .filter(|x| x % 2 == 1)
             .inspect_batch(|t,xs| println!("errors @ {:?}: {:?}", t, xs));
    });
}

pub fn probe(&self) -> Handle<G::Timestamp>

Attaches a timely dataflow probe to the output of a Collection.

This probe is used to determine when the state of the Collection has stabilized and can be read out.

pub fn probe_with(
    &self,
    handle: &mut Handle<G::Timestamp>
) -> Collection<G, D, R>

Attaches a timely dataflow probe to the output of a Collection.

This probe is used to determine when the state of the Collection has stabilized and all updates have been observed. In addition, a probe is often used to limit the number of rounds of input in flight at any moment; a computation can wait until the probe has caught up to the input before introducing more rounds of data, to avoid swamping the system.

pub fn assert_empty(&self) where
    D: ExchangeData + Hashable,
    R: ExchangeData + Hashable,
    G::Timestamp: Lattice + Ord

Asserts that the collection is empty at all times.

Because this is a dataflow fragment, the test is only applied as the computation is run. If the computation is not run, or not run to completion, there may be un-exercised times at which the collection could be non-empty. Typically, a timely dataflow computation runs to completion on drop, and so clean exit from a program should indicate that this assertion never found cause to complain.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {
        scope.new_collection_from(1 .. 10).1
             .map(|x| x * 2)
             .filter(|x| x % 2 == 1)
             .assert_empty();
    });
}

pub fn scope(&self) -> G

The scope containing the underlying timely dataflow stream.

impl<'a, G: Scope, T: Timestamp, D: Data, R: Semigroup> Collection<Child<'a, G, T>, D, R> where
    T: Refines<<G as ScopeParent>::Timestamp>, 

pub fn leave(&self) -> Collection<G, D, R>

Returns the final value of a Collection from a nested scope to its containing scope.

Examples

extern crate timely;
extern crate differential_dataflow;

use timely::dataflow::Scope;
use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let data = scope.new_collection_from(1 .. 10).1;

        let result = scope.region(|child| {
            data.enter(child)
                .leave()
        });

        data.assert_eq(&result);
    });
}

impl<G: Scope, D: Data, R: Abelian> Collection<G, D, R> where
    G::Timestamp: Data

pub fn negate(&self) -> Collection<G, D, R>

Creates a new collection whose counts are the negation of those in the input.

This method is most commonly used with concat to get those elements in one collection but not another. However, differential dataflow computations are still defined for all values of the difference type R, including negative counts.
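A small standard-library model may help show both halves of this remark. It is a sketch only, and the helper names negate and minus are hypothetical. Negating one list of (record, diff) updates and accumulating it together with another subtracts counts record by record: when the negated collection is contained in the other, the result is the ordinary set difference, and when it is not, the surplus shows up as a negative count rather than an error.

```rust
use std::collections::BTreeMap;

/// Models `negate` (hypothetical stand-in): flip the sign of every diff.
fn negate(updates: &[(u32, isize)]) -> Vec<(u32, isize)> {
    updates.iter().map(|&(record, diff)| (record, -diff)).collect()
}

/// Accumulates `a` together with the negation of `b`, i.e. the
/// per-record counts of `a` minus those of `b`.
fn minus(a: &[(u32, isize)], b: &[(u32, isize)]) -> BTreeMap<u32, isize> {
    let mut counts = BTreeMap::new();
    for &(record, diff) in a.iter().chain(negate(b).iter()) {
        *counts.entry(record).or_insert(0) += diff;
    }
    // Zero counts vanish; negative counts are kept.
    counts.retain(|_, count| *count != 0);
    counts
}
```

Subtracting the odd numbers from 1 .. 10 leaves the even numbers with count 1, while subtracting a record that was never present, such as 42, leaves it with count -1.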

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let data = scope.new_collection_from(1 .. 10).1;

        let odds = data.filter(|x| x % 2 == 1);
        let evens = data.filter(|x| x % 2 == 0);

        odds.negate()
            .concat(&data)
            .assert_eq(&evens);
    });
}

pub fn assert_eq(&self, other: &Self) where
    D: ExchangeData + Hashable,
    R: ExchangeData + Hashable,
    G::Timestamp: Lattice + Ord

Asserts that the two collections are equal at all times.

Because this is a dataflow fragment, the test is only applied as the computation is run. If the computation is not run, or not run to completion, there may be un-exercised times at which the collections could vary. Typically, a timely dataflow computation runs to completion on drop, and so clean exit from a program should indicate that this assertion never found cause to complain.

Examples

extern crate timely;
extern crate differential_dataflow;

use differential_dataflow::input::Input;

fn main() {
    ::timely::example(|scope| {

        let data = scope.new_collection_from(1 .. 10).1;

        let odds = data.filter(|x| x % 2 == 1);
        let evens = data.filter(|x| x % 2 == 0);

        odds.concat(&evens)
            .assert_eq(&data);
    });
}

Trait Implementations

impl<G, K, V, R> Arrange<G, K, V, R> for Collection<G, (K, V), R> where
    G: Scope,
    G::Timestamp: Lattice + Ord,
    K: ExchangeData + Hashable,
    V: ExchangeData,
    R: Semigroup + ExchangeData

impl<G: Scope, K: ExchangeData + Hashable, R: ExchangeData + Semigroup> Arrange<G, K, (), R> for Collection<G, K, R> where
    G::Timestamp: Lattice + Ord

impl<G: Scope, K: ExchangeData + Hashable, V: ExchangeData, R: ExchangeData + Semigroup> ArrangeByKey<G, K, V, R> for Collection<G, (K, V), R> where
    G::Timestamp: Lattice + Ord

impl<G: Scope, K: ExchangeData + Hashable, R: ExchangeData + Semigroup> ArrangeBySelf<G, K, R> for Collection<G, K, R> where
    G::Timestamp: Lattice + Ord

impl<G, K, V, R> Reduce<G, K, V, R> for Collection<G, (K, V), R> where
    G: Scope,
    G::Timestamp: Lattice + Ord,
    K: ExchangeData + Hashable,
    V: ExchangeData,
    R: ExchangeData + Semigroup

impl<G: Scope, K: ExchangeData + Hashable, R1: ExchangeData + Semigroup> Threshold<G, K, R1> for Collection<G, K, R1> where
    G::Timestamp: Lattice + Ord

impl<G: Scope, K: ExchangeData + Hashable, R: ExchangeData + Semigroup> Count<G, K, R> for Collection<G, K, R> where
    G::Timestamp: Lattice + Ord

impl<G, K, V, R> ReduceCore<G, K, V, R> for Collection<G, (K, V), R> where
    G: Scope,
    G::Timestamp: Lattice + Ord,
    K: ExchangeData + Hashable,
    V: ExchangeData,
    R: ExchangeData + Semigroup

impl<G: Scope, D, R> Consolidate<D> for Collection<G, D, R> where
    D: ExchangeData + Hashable,
    R: ExchangeData + Semigroup,
    G::Timestamp: Lattice + Ord

impl<G: Scope, D, R> ConsolidateStream<D> for Collection<G, D, R> where
    D: ExchangeData + Hashable,
    R: ExchangeData + Semigroup,
    G::Timestamp: Lattice + Ord

impl<G: Scope, D: Ord + Data + Debug, R: Abelian> Iterate<G, D, R> for Collection<G, D, R>

impl<G, K, V, R> Join<G, K, V, R> for Collection<G, (K, V), R> where
    G: Scope,
    K: ExchangeData + Hashable,
    V: ExchangeData,
    R: ExchangeData + Semigroup,
    G::Timestamp: Lattice + Ord

impl<G, K, V, R> JoinCore<G, K, V, R> for Collection<G, (K, V), R> where
    G: Scope,
    K: ExchangeData + Hashable,
    V: ExchangeData,
    R: ExchangeData + Semigroup,
    G::Timestamp: Lattice + Ord

impl<G: Scope, K: ExchangeData + Hashable, R: ExchangeData + Semigroup> CountTotal<G, K, R> for Collection<G, K, R> where
    G::Timestamp: TotalOrder + Lattice + Ord

impl<G: Scope, K: ExchangeData + Hashable, R: ExchangeData + Semigroup> ThresholdTotal<G, K, R> for Collection<G, K, R> where
    G::Timestamp: TotalOrder + Lattice + Ord

impl<G, D, R> Identifiers<G, D, R> for Collection<G, D, R> where
    G: Scope,
    G::Timestamp: Lattice,
    D: ExchangeData + Hash,
    R: ExchangeData + Abelian

impl<G, K, D> PrefixSum<G, K, D> for Collection<G, ((usize, K), D)> where
    G: Scope,
    G::Timestamp: Lattice,
    K: ExchangeData + Hash,
    D: ExchangeData + Hash

impl<G: Clone + Scope, D: Clone, R: Clone + Semigroup> Clone for Collection<G, D, R> where
    G::Timestamp: Clone

Auto Trait Implementations

impl<G, D, R = isize> !Send for Collection<G, D, R>

impl<G, D, R = isize> !Sync for Collection<G, D, R>

impl<G, D, R> Unpin for Collection<G, D, R> where
    G: Unpin

impl<G, D, R = isize> !UnwindSafe for Collection<G, D, R>

impl<G, D, R = isize> !RefUnwindSafe for Collection<G, D, R>

Blanket Implementations

impl<T, U> Into<U> for T where
    U: From<T>, 

impl<T> From<T> for T

impl<T> ToOwned for T where
    T: Clone

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Data for T where
    T: 'static + Clone