Struct markov_generator::Chain

source ·
pub struct Chain<T> { /* private fields */ }
Expand description

A Markov chain.

This type implements Serialize and Deserialize when the serde feature is enabled (which it is by default).

Implementations§

Creates a new chain.

See Self::depth for an explanation of the depth.

Examples found in repository?
examples/example.rs (line 23)
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
fn main() {
    const DEPTH: usize = 6;
    // Maps each sequence of 6 items to a list of possible next items.
    let mut chain = Chain::new(DEPTH);

    // In this case, corpus.txt contains one paragraph per line.
    let file = File::open("examples/corpus.txt").unwrap();
    let mut reader = BufReader::new(file);
    let mut line = String::new();
    let mut prev_line = String::new();

    while let Ok(1..) = reader.read_line(&mut line) {
        // `Both` means that the generated random output could start with the
        // beginning of `line`, and that the generated output could end after
        // the end of `line`.
        chain.add_all(line.chars(), AddEdges::Both);

        // Starting index of last `DEPTH` chars in `prev_line`.
        let prev_tail =
            prev_line.char_indices().nth_back(DEPTH - 1).map_or(0, |(i, _)| i);

        // This makes sure there's a chance that the end of the previous line
        // could be followed by the start of the current line when generating
        // random output.
        chain.add_all(
            prev_line[prev_tail..].chars().chain(line.chars().take(DEPTH)),
            AddEdges::Neither,
        );

        std::mem::swap(&mut line, &mut prev_line);
        line.clear();
    }

    // Generate and print random Markov data.
    let output: String = chain
        .generate()
        .flat_map(|c| iter::repeat(c).take(1 + (*c == '\n') as usize))
        .collect();
    print!("{}", &output[..output.len() - 1]);
}

Gets the chain’s depth.

A depth of n means the chain maps sequences of n items of type T to a list of possible next items.

Examples found in repository?
src/lib.rs (line 660)
651
652
653
654
655
656
657
658
659
660
661
662
    pub fn set_state<I>(&mut self, state: I)
    where
        I: IntoIterator<Item = Option<&'a T>>,
    {
        self.buf.clear();
        self.buf.extend(
            state
                .into_iter()
                .chain(iter::repeat(None))
                .take(self.chain.depth()),
        );
    }

Adds all items in an iterator to the chain.

This essentially calls Self::add_next on every overlapping window of self.depth() + 1 elements.

edges controls whether the start or end of items can be used as start or end data for the chain. See the documentation for AddEdges for more information.

Examples found in repository?
examples/example.rs (line 35)
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
fn main() {
    const DEPTH: usize = 6;
    // Maps each sequence of 6 items to a list of possible next items.
    let mut chain = Chain::new(DEPTH);

    // In this case, corpus.txt contains one paragraph per line.
    let file = File::open("examples/corpus.txt").unwrap();
    let mut reader = BufReader::new(file);
    let mut line = String::new();
    let mut prev_line = String::new();

    while let Ok(1..) = reader.read_line(&mut line) {
        // `Both` means that the generated random output could start with the
        // beginning of `line`, and that the generated output could end after
        // the end of `line`.
        chain.add_all(line.chars(), AddEdges::Both);

        // Starting index of last `DEPTH` chars in `prev_line`.
        let prev_tail =
            prev_line.char_indices().nth_back(DEPTH - 1).map_or(0, |(i, _)| i);

        // This makes sure there's a chance that the end of the previous line
        // could be followed by the start of the current line when generating
        // random output.
        chain.add_all(
            prev_line[prev_tail..].chars().chain(line.chars().take(DEPTH)),
            AddEdges::Neither,
        );

        std::mem::swap(&mut line, &mut prev_line);
        line.clear();
    }

    // Generate and print random Markov data.
    let output: String = chain
        .generate()
        .flat_map(|c| iter::repeat(c).take(1 + (*c == '\n') as usize))
        .collect();
    print!("{}", &output[..output.len() - 1]);
}

Adds items to the chain.

The first self.depth() + 1 items are added, increasing the chance that the first self.depth() items will be followed by the remaining item.

If items.into_iter() yields fewer than self.depth() items, this function is a no-op. If it yields exactly self.depth() items, the remaining item is treated as None.

Examples found in repository?
src/lib.rs (line 448)
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
    pub fn add_start<I>(&mut self, items: I)
    where
        I: IntoIterator<Item = T>,
        I::IntoIter: Clone,
    {
        let iter = items.into_iter().map(Some);
        for i in (2..self.depth).rev() {
            self.add(iter::repeat_with(|| None).take(i).chain(iter.clone()));
        }
        if self.depth > 0 {
            self.add(iter::once(None).chain(iter));
        }
    }

    /// Convenience function that wraps each item in [`Some`] and calls
    /// [`Self::add`].
    ///
    /// Note that this function alone does not increase the chance that
    /// `items` will be returned by [`Self::get_start`]; [`Self::add_start`]
    /// (or manually passing [`None`]-prefixed sequences to [`Self::add`]) must
    /// be used.
    pub fn add_next<I: IntoIterator<Item = T>>(&mut self, items: I) {
        self.add(items.into_iter().map(Some));
    }

Adds items preceded by various amounts of Nones so that Self::get_start has a chance of returning those items.

Specifically, this function calls Self::add with i Nones followed by the items in items for every i from 1 to self.depth() (inclusive). This increases the chance that the first self.depth() items of items will be returned by Self::get_start.

Note that this function does not increase the chance that the first self.depth() items of items will be followed by the self.depth() + 1st item; Self::add_next or Self::add must be called.

If this function’s trait bounds (I::IntoIter: Clone) are a problem, you can use Self::add_all instead if T is Clone:

chain.add_all(items.into_iter().take(chain.depth()), AddEdges::Start);

Convenience function that wraps each item in Some and calls Self::add.

Note that this function alone does not increase the chance that items will be returned by Self::get_start; Self::add_start (or manually passing None-prefixed sequences to Self::add) must be used.

Available on crate feature std only.

Generates random Markov chain data.

Returns an iterator that yields the elements by reference. If you want them by value, simply use Iterator::cloned (as long as T is Clone).

Examples found in repository?
examples/example.rs (line 55)
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
fn main() {
    const DEPTH: usize = 6;
    // Maps each sequence of 6 items to a list of possible next items.
    let mut chain = Chain::new(DEPTH);

    // In this case, corpus.txt contains one paragraph per line.
    let file = File::open("examples/corpus.txt").unwrap();
    let mut reader = BufReader::new(file);
    let mut line = String::new();
    let mut prev_line = String::new();

    while let Ok(1..) = reader.read_line(&mut line) {
        // `Both` means that the generated random output could start with the
        // beginning of `line`, and that the generated output could end after
        // the end of `line`.
        chain.add_all(line.chars(), AddEdges::Both);

        // Starting index of last `DEPTH` chars in `prev_line`.
        let prev_tail =
            prev_line.char_indices().nth_back(DEPTH - 1).map_or(0, |(i, _)| i);

        // This makes sure there's a chance that the end of the previous line
        // could be followed by the start of the current line when generating
        // random output.
        chain.add_all(
            prev_line[prev_tail..].chars().chain(line.chars().take(DEPTH)),
            AddEdges::Neither,
        );

        std::mem::swap(&mut line, &mut prev_line);
        line.clear();
    }

    // Generate and print random Markov data.
    let output: String = chain
        .generate()
        .flat_map(|c| iter::repeat(c).take(1 + (*c == '\n') as usize))
        .collect();
    print!("{}", &output[..output.len() - 1]);
}

Like Self::generate, but takes a custom random number generator.

Examples found in repository?
src/lib.rs (line 474)
473
474
475
    pub fn generate(&self) -> Generator<'_, T, rand::rngs::ThreadRng> {
        self.generate_with_rng(rand::thread_rng())
    }
Available on crate feature std only.

Gets a random item that has followed items in the added data.

Only the first self.depth() items are considered. A return value of None either means those items were never followed by anything in the data passed to Self::add, or that None sometimes followed those items and that possibility happened to be picked by the random number generator.

Given iter as items.into_iter(), if iter.next() returns None, it is treated as if it returned Some(None).

Like Self::get, but takes a custom random number generator.

Examples found in repository?
src/lib.rs (line 499)
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
    pub fn get<'a, I>(&'a self, items: I) -> Option<&'a T>
    where
        I: IntoIterator<Item = Option<&'a T>>,
    {
        self.get_with_rng(items, rand::thread_rng())
    }

    /// Like [`Self::get`], but takes a custom random number generator.
    pub fn get_with_rng<'a, I, R>(
        &'a self,
        items: I,
        mut rng: R,
    ) -> Option<&'a T>
    where
        I: IntoIterator<Item = Option<&'a T>>,
        R: Rng,
    {
        let mut iter = items.into_iter();
        let mut node = &self.root;

        for _ in 0..self.depth {
            let internal = if let Node::Internal(internal) = node {
                internal
            } else {
                panic!("expected internal node");
            };

            node = if let Some(item) = iter.next().flatten() {
                internal.map.get(item)?
            } else {
                internal.null_next.as_ref()?
            };
        }

        let leaf = if let Node::Leaf(leaf) = node {
            leaf
        } else {
            panic!("expected leaf node");
        };

        if leaf.total == 0 {
            return None;
        }

        let mut n = rng.gen_range(0..leaf.total);
        for (item, count) in &leaf.map {
            n = if let Some(n) = n.checked_sub(*count) {
                n
            } else {
                return Some(item);
            }
        }
        None
    }

    #[cfg(feature = "std")]
    #[cfg_attr(feature = "doc_cfg", doc(cfg(feature = "std")))]
    /// Gets some initial items that have appeared at the start of a sequence
    /// (see [`Self::add_start`]).
    ///
    /// The returned iterator will yield up to [`self.depth()`](Self::depth)
    /// items.
    pub fn get_start(&self) -> impl Iterator<Item = &T> {
        self.get_start_with_rng(rand::thread_rng())
    }

    /// Like [`Self::get_start`], but takes a custom random number generator.
    pub fn get_start_with_rng<R: Rng>(
        &self,
        mut rng: R,
    ) -> impl Iterator<Item = &T> {
        let mut items = Vec::with_capacity(self.depth);
        for i in (1..=self.depth).rev() {
            if let Some(item) = self.get_with_rng(
                iter::repeat_with(|| None)
                    .take(i)
                    .chain(items.iter().copied().map(Some)),
                &mut rng,
            ) {
                items.push(item);
            } else {
                break;
            }
        }
        items.into_iter()
    }

    #[cfg(feature = "std")]
    #[cfg_attr(feature = "doc_cfg", doc(cfg(feature = "std")))]
    /// Convenience function that wraps each item in [`Some`] and calls
    /// [`Self::get`].
    pub fn get_next<'a, I>(&'a self, items: I) -> Option<&'a T>
    where
        I: IntoIterator<Item = &'a T>,
    {
        self.get_next_with_rng(items, rand::thread_rng())
    }

    /// Like [`Self::get_next`], but takes a custom random number generator.
    pub fn get_next_with_rng<'a, I, R>(
        &'a self,
        items: I,
        rng: R,
    ) -> Option<&'a T>
    where
        I: IntoIterator<Item = &'a T>,
        R: Rng,
    {
        self.get_with_rng(items.into_iter().map(Some), rng)
    }
}

/// Iterator returned by [`Chain::generate`].
pub struct Generator<'a, T, R> {
    chain: &'a Chain<T>,
    rng: R,
    buf: VecDeque<Option<&'a T>>,
}

impl<'a, T, R> Generator<'a, T, R> {
    /// Creates a new random Markov data generator. See [`Chain::generate`].
    pub fn new(chain: &'a Chain<T>, rng: R) -> Self {
        let mut buf = VecDeque::new();
        buf.resize(chain.depth, None);
        Self {
            chain,
            rng,
            buf,
        }
    }

    /// Gets the generator's state.
    ///
    /// This is a sequence of exactly [`self.chain.depth()`](Chain::depth)
    /// items. [`Self::next`] will pass the state to [`Chain::get`] and then
    /// update the state accordingly by popping the front item and pushing the
    /// result of [`Chain::get`] to the back.
    ///
    /// The initial state consists entirely of [`None`]s.
    #[rustfmt::skip]
    pub fn state(
        &self,
    ) -> impl '_
    + Clone
    + DoubleEndedIterator<Item = Option<&T>>
    + ExactSizeIterator {
        self.buf.iter().copied()
    }

    /// Sets the generator's state.
    ///
    /// This method sets the generator's state to the first
    /// [`self.chain.depth()`](Chain::depth) items in `state`, filling in with
    /// [`None`] if `state` yields fewer items.
    ///
    /// See [`Self::state`] for an explanation of how the state is used.
    pub fn set_state<I>(&mut self, state: I)
    where
        I: IntoIterator<Item = Option<&'a T>>,
    {
        self.buf.clear();
        self.buf.extend(
            state
                .into_iter()
                .chain(iter::repeat(None))
                .take(self.chain.depth()),
        );
    }
}

impl<'a, T: Clone + Item, R: Rng> Iterator for Generator<'a, T, R> {
    type Item = &'a T;

    fn next(&mut self) -> Option<&'a T> {
        let result =
            self.chain.get_with_rng(self.buf.iter().copied(), &mut self.rng);
        self.buf.pop_front();
        self.buf.push_back(result);
        result
    }
Available on crate feature std only.

Gets some initial items that have appeared at the start of a sequence (see Self::add_start).

The returned iterator will yield up to self.depth() items.

Like Self::get_start, but takes a custom random number generator.

Examples found in repository?
src/lib.rs (line 558)
557
558
559
    pub fn get_start(&self) -> impl Iterator<Item = &T> {
        self.get_start_with_rng(rand::thread_rng())
    }
Available on crate feature std only.

Convenience function that wraps each item in Some and calls Self::get.

Like Self::get_next, but takes a custom random number generator.

Examples found in repository?
src/lib.rs (line 590)
586
587
588
589
590
591
    pub fn get_next<'a, I>(&'a self, items: I) -> Option<&'a T>
    where
        I: IntoIterator<Item = &'a T>,
    {
        self.get_next_with_rng(items, rand::thread_rng())
    }

Trait Implementations§

Deserialize this value from the given Serde deserializer. Read more
Serialize this value into the given Serde serializer. Read more

Auto Trait Implementations§

Blanket Implementations§

Gets the TypeId of self. Read more
Immutably borrows from an owned value. Read more
Mutably borrows from an owned value. Read more

Returns the argument unchanged.

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

The type returned in the event of a conversion error.
Performs the conversion.
The type returned in the event of a conversion error.
Performs the conversion.