pub struct Chain { /* private fields */ }
Simple second order Markov chain. This chain might behave in ways you do not expect, since it looks at Tokens rather than words. If this is not desired, you can split the text into tokens yourself and pass them to ChainBuilder::feed_tokens().
use markovish::{Chain, IntoChainBuilder};
use rand::thread_rng;

// You can use `.into_cb()` on the result of the `feed_*` methods. This way, you can
// ignore whether the feed was successful (enough tokens were provided) or not.
let chain = Chain::builder().feed_str("I am &str").into_cb().build().unwrap();
// You would expect this to be "&str", but no!
assert_eq!(
    chain.generate_next_token(&mut thread_rng(), &("I", "am")).as_deref(),
    None
);
// The space between the words is a token too!
assert_eq!(
    chain.generate_next_token(&mut thread_rng(), &("I", " ")).as_deref(),
    Some("am")
);
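To make the behaviour above concrete, here is a minimal sketch of a second order table keyed on token pairs, using only the standard library. It is not how markovish is implemented: tokenize() and build_table() are hypothetical helpers, and the splitting rule (alphanumeric runs form one token, every other character is its own token) is an assumption chosen to reproduce the example.

```rust
use std::collections::HashMap;

/// Hypothetical tokenizer (an assumption, not markovish's splitting rule):
/// runs of alphanumeric characters form one token, and every other
/// character (spaces, punctuation) is a token of its own.
fn tokenize(text: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    for ch in text.chars() {
        if ch.is_alphanumeric() {
            word.push(ch);
        } else {
            if !word.is_empty() {
                // Flush the word collected so far, leaving `word` empty.
                tokens.push(std::mem::take(&mut word));
            }
            tokens.push(ch.to_string());
        }
    }
    if !word.is_empty() {
        tokens.push(word);
    }
    tokens
}

/// Second order table: each pair of consecutive tokens maps to the tokens
/// that were observed directly after that pair.
fn build_table(tokens: &[String]) -> HashMap<(String, String), Vec<String>> {
    let mut table: HashMap<(String, String), Vec<String>> = HashMap::new();
    for w in tokens.windows(3) {
        table
            .entry((w[0].clone(), w[1].clone()))
            .or_default()
            .push(w[2].clone());
    }
    table
}
```

With this sketch, "I am &str" splits into "I", " ", "am", " ", "&", "str". The pair ("I", "am") never occurs as two consecutive tokens, so it has no entry, while ("I", " ") maps to "am".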
Implementations§
impl Chain
pub fn from_text(content: &str) -> Result<Self, ChainBuilder>
Creates a new second order Markov chain from a string.
If the provided text is not long enough to create a Chain, an empty ChainBuilder is returned instead.
Examples found in repository?
fn main() {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 3 {
        println!("{USAGE}");
        exit(1);
    }
    let text =
        std::fs::read_to_string(PathBuf::from(args[1].clone())).expect("could not read file");
    let chain = Chain::from_text(&text).unwrap();
    let gen_text = chain
        .generate_str(
            &mut thread_rng(),
            args[2]
                .parse()
                .expect("did not provide a valid token number"),
        )
        .expect("failed to generate text");
    println!("{}", gen_text.join(""));
}
pub fn builder() -> ChainBuilder
pub fn pairs(&self) -> impl Iterator<Item = &TokenPair>
Returns an iterator over all pairs that have been found in the source text(s). When calling Chain::start_tokens(), a TokenPair is randomly chosen from this list.
This can be used together with Chain::generate_max_n_tokens() to get more fine-grained control of how the chain is restarted if it stumbles on a token pair with no possible next token. You can filter the pairs so that they are more likely to start a sentence.
§Examples
let chain = Chain::from_text("I am but a tiny example! I have three sentences. U?").unwrap();
let good_starting_points: Vec<_> = chain.pairs()
    .filter(|tp| tp.0.as_str() == "." || tp.0.as_str() == "!")
    .collect();
assert_eq!(good_starting_points.len(), 2);
pub fn start_tokens(&self, rng: &mut impl Rng) -> Option<&TokenPair>
Randomly chooses two tokens that are known to be able to generate a new token. If no start tokens exist, None is returned.
While this is the easy way, the returned value can be any pair of tokens in the source text. If you need more control, you can first filter Chain::pairs() and then randomly choose starting tokens from that subset.
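The filter-then-choose approach can be sketched without the crate as follows. This is only an illustration under stated assumptions: pairs are modelled as plain (String, String) tuples rather than TokenPair, XorShift64 is a hypothetical stand-in for a real random number generator, and pick_sentence_start() is a made-up helper name.

```rust
/// Hypothetical stand-in for a random number generator, so that the sketch
/// needs no external crate. Xorshift64; the seed must be non-zero.
struct XorShift64(u64);

impl XorShift64 {
    /// Returns an index in `0..len`. `len` must be greater than zero.
    fn next_index(&mut self, len: usize) -> usize {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 % len as u64) as usize
    }
}

/// Keeps only the pairs whose first token ends a sentence, then picks one
/// of them at random. Returns None if no pair qualifies.
fn pick_sentence_start<'a>(
    pairs: &'a [(String, String)],
    rng: &mut XorShift64,
) -> Option<&'a (String, String)> {
    let candidates: Vec<&(String, String)> = pairs
        .iter()
        .filter(|pair| matches!(pair.0.as_str(), "." | "!" | "?"))
        .collect();
    if candidates.is_empty() {
        return None;
    }
    Some(candidates[rng.next_index(candidates.len())])
}
```

The same shape applies when the candidates come from Chain::pairs(): collect the filtered iterator, then index into it with whatever RNG you already pass to the chain.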
pub fn generate_str(&self, rng: &mut impl Rng, n: usize) -> Option<Vec<&str>>
pub fn generate_next_token( &self, rng: &mut impl Rng, prev: &TokenPairRef<'_> ) -> Option<TokenRef<'_>>
Generates a random new token using the previous tokens.
If the chain has never seen the prev tokens together, None is returned.
pub fn generate_n_tokens( &self, rng: &mut impl Rng, prev: &TokenPairRef<'_>, n: usize ) -> Option<Vec<TokenRef<'_>>>
Generates n tokens, using the previously generated tokens to generate new ones. If the chain reaches a pair of tokens that it has never seen together, two new starting tokens are chosen using Chain::start_tokens().
If the chain has never seen the prev tokens together, None is returned.
§Panics
Will panic if n is so big that no vector can hold that many elements.
pub fn generate_max_n_tokens( &self, rng: &mut impl Rng, prev: &TokenPairRef<'_>, n: usize ) -> Option<Vec<TokenRef<'_>>>
Generates up to n tokens, using the previously generated tokens to generate new ones. Fewer tokens may be generated if the chain reaches a pair of tokens that it has never seen together.
If the chain has never seen the prev tokens together, None is returned.
§Panics
Will panic if n is so big that no vector can hold that many elements.
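The difference between Chain::generate_n_tokens() and Chain::generate_max_n_tokens() can be illustrated with a small standard-library sketch. It is not the crate's implementation: Table, build_table() and generate() are hypothetical names, the pick closure stands in for the RNG, and the sketch assumes every table entry has at least one successor (which build_table() guarantees).

```rust
use std::collections::BTreeMap;

type Pair = (String, String);
// BTreeMap keeps its keys ordered, which makes the sketch deterministic.
type Table = BTreeMap<Pair, Vec<String>>;

/// Hypothetical helper: builds the pair -> successors table from tokens.
fn build_table(tokens: &[&str]) -> Table {
    let mut table = Table::new();
    for w in tokens.windows(3) {
        table
            .entry((w[0].to_string(), w[1].to_string()))
            .or_default()
            .push(w[2].to_string());
    }
    table
}

/// Walks the table for up to `n` tokens. `pick(len)` stands in for the RNG
/// and must return an index below `len`. When the current pair has no known
/// successor, the walk either restarts from a newly picked known pair
/// (`restart == true`) or stops early (`restart == false`).
fn generate(
    table: &Table,
    start: &Pair,
    n: usize,
    restart: bool,
    mut pick: impl FnMut(usize) -> usize,
) -> Option<Vec<String>> {
    // A starting pair that was never seen yields None.
    if !table.contains_key(start) {
        return None;
    }
    let starts: Vec<&Pair> = table.keys().collect();
    let mut prev = start.clone();
    let mut out = Vec::new();
    while out.len() < n {
        let successors = table.get(&prev);
        match successors {
            Some(nexts) => {
                let next = nexts[pick(nexts.len())].clone();
                // Slide the window: the old second token becomes the first.
                let second = std::mem::take(&mut prev.1);
                prev = (second, next.clone());
                out.push(next);
            }
            // Dead end: pick a new known pair and keep going.
            None if restart => prev = starts[pick(starts.len())].clone(),
            // Dead end: return what has been generated so far.
            None => break,
        }
    }
    Some(out)
}
```

For the tokens "a", "b", "c", "d" and the start pair ("a", "b"), the stop-early variant produces only "c" and "d" even when five tokens are requested, while the restarting variant keeps going until it has produced all five.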