Struct llama_cpp_2::token::data_array::LlamaTokenDataArray
pub struct LlamaTokenDataArray {
pub data: Vec<LlamaTokenData>,
pub sorted: bool,
}
A safe wrapper around llama_token_data_array.
Fields§
data: Vec<LlamaTokenData>
The underlying data.
sorted: bool
Whether the data is sorted.
Implementations§
impl LlamaTokenDataArray
pub fn new(data: Vec<LlamaTokenData>, sorted: bool) -> Self
Create a new LlamaTokenDataArray from a vector and whether or not the data is sorted.
// Import paths for LlamaToken and LlamaTokenData assumed from the crate's token module.
use llama_cpp_2::token::LlamaToken;
use llama_cpp_2::token::data::LlamaTokenData;
use llama_cpp_2::token::data_array::LlamaTokenDataArray;

let array = LlamaTokenDataArray::new(vec![
    LlamaTokenData::new(LlamaToken(0), 0.0, 0.0),
    LlamaTokenData::new(LlamaToken(1), 0.1, 0.1),
], false);
assert_eq!(array.data.len(), 2);
assert_eq!(array.sorted, false);
pub fn from_iter<T>(data: T, sorted: bool) -> LlamaTokenDataArray
where
    T: IntoIterator<Item = LlamaTokenData>,
Create a new LlamaTokenDataArray from an iterator and whether or not the data is sorted.
let array = LlamaTokenDataArray::from_iter([
    LlamaTokenData::new(LlamaToken(0), 0.0, 0.0),
    LlamaTokenData::new(LlamaToken(1), 0.1, 0.1),
], false);
assert_eq!(array.data.len(), 2);
assert_eq!(array.sorted, false);
impl LlamaTokenDataArray
pub fn sample_repetition_penalty(
    &mut self,
    ctx: Option<&mut LlamaContext<'_>>,
    last_tokens: &[LlamaToken],
    penalty_last_n: usize,
    penalty_repeat: f32,
    penalty_freq: f32,
    penalty_present: f32,
)
Repetition penalty described in the CTRL academic paper, with negative logit fix. Frequency and presence penalties described in the OpenAI API.
§Parameters
- ctx - the context to use. May be None if you do not care to record the sample timings.
- last_tokens - the last tokens in the context.
- penalty_last_n - the number of tokens back to consider for the repetition penalty. (0 for no penalty)
- penalty_repeat - the repetition penalty. (1.0 for no penalty)
- penalty_freq - the frequency penalty. (0.0 for no penalty)
- penalty_present - the presence penalty. (0.0 for no penalty)
§Example
// Import paths for LlamaToken and LlamaTokenData assumed from the crate's token module.
use std::collections::BTreeMap;
use llama_cpp_2::token::LlamaToken;
use llama_cpp_2::token::data::LlamaTokenData;
use llama_cpp_2::token::data_array::LlamaTokenDataArray;

let history = vec![
    LlamaToken::new(2),
    LlamaToken::new(1),
    LlamaToken::new(0),
];
let candidates = vec![
    LlamaToken::new(0),
    LlamaToken::new(1),
    LlamaToken::new(2),
    LlamaToken::new(3),
];
let mut candidates = LlamaTokenDataArray::from_iter(candidates.iter().map(|&token| LlamaTokenData::new(token, 0.0, 0.0)), false);
candidates.sample_repetition_penalty(None, &history, 2, 1.1, 0.1, 0.1);

let token_logits = candidates.data.into_iter().map(|token_data| (token_data.id(), token_data.logit())).collect::<BTreeMap<_, _>>();
assert_eq!(token_logits[&LlamaToken(0)], 0.0, "expected no penalty as it is out of `penalty_last_n`");
assert!(token_logits[&LlamaToken(1)] < 0.0, "expected penalty as it is in `penalty_last_n`");
assert!(token_logits[&LlamaToken(2)] < 0.0, "expected penalty as it is in `penalty_last_n`");
assert_eq!(token_logits[&LlamaToken(3)], 0.0, "expected no penalty as it is not in `history`");
pub fn sample_softmax(&mut self, ctx: Option<&mut LlamaContext<'_>>)
Sorts candidate tokens by their logits in descending order and calculates probabilities based on the logits.
§Example
let lowest = LlamaTokenData::new(LlamaToken::new(0), 0.1, 0.0);
let middle = LlamaTokenData::new(LlamaToken::new(1), 0.2, 0.0);
let highest = LlamaTokenData::new(LlamaToken::new(2), 0.7, 0.0);
let candidates = vec![lowest, middle, highest];
let mut candidates = LlamaTokenDataArray::from_iter(candidates, false);
candidates.sample_softmax(None);
assert!(candidates.sorted);
assert_eq!(candidates.data[0].id(), highest.id());
assert_eq!(candidates.data[0].logit(), highest.logit());
assert!(candidates.data[0].p() > candidates.data[1].p());
assert_eq!(candidates.data[1].id(), middle.id());
assert_eq!(candidates.data[1].logit(), middle.logit());
assert!(candidates.data[1].p() > candidates.data[2].p());
assert_eq!(candidates.data[2].id(), lowest.id());
assert_eq!(candidates.data[2].logit(), lowest.logit());
pub fn sample_temp(
    &mut self,
    ctx: Option<&mut LlamaContext<'_>>,
    temperature: f32,
)
Modifies the logits of Self in place using temperature sampling.
§Example
let candidates = vec![
    LlamaTokenData::new(LlamaToken::new(0), 0.1, 0.0),
    LlamaTokenData::new(LlamaToken::new(1), 0.2, 0.0),
    LlamaTokenData::new(LlamaToken::new(2), 0.7, 0.0),
];
let mut candidates = LlamaTokenDataArray::from_iter(candidates, false);
candidates.sample_temp(None, 0.5);
assert_ne!(candidates.data[0].logit(), 0.1);
assert_ne!(candidates.data[1].logit(), 0.2);
assert_ne!(candidates.data[2].logit(), 0.7);
pub fn sample_token(&mut self, ctx: &mut LlamaContext<'_>) -> LlamaToken
Randomly selects a token from the candidates based on their probabilities.
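§Example
A minimal sketch, not from the crate's documented examples: given a LlamaContext obtained elsewhere, normalize the candidates and then draw a token. The helper name pick_token and the module path for LlamaContext are assumptions.

use llama_cpp_2::context::LlamaContext;
use llama_cpp_2::token::LlamaToken;
use llama_cpp_2::token::data_array::LlamaTokenDataArray;

// Hypothetical helper: draw one token from the candidate set.
fn pick_token(ctx: &mut LlamaContext<'_>, candidates: &mut LlamaTokenDataArray) -> LlamaToken {
    // Convert logits into a probability distribution first
    // (reborrow ctx so it can be used again below).
    candidates.sample_softmax(Some(&mut *ctx));
    // Randomly select a token in proportion to its probability.
    candidates.sample_token(ctx)
}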
pub fn sample_top_k(
    &mut self,
    ctx: Option<&mut LlamaContext<'_>>,
    k: i32,
    min_keep: usize,
)
Top-K sampling described in the academic paper The Curious Case of Neural Text Degeneration.
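§Example
A minimal sketch, not from the crate's docs, assuming top-k retains the k candidates with the highest logits; import paths for LlamaToken and LlamaTokenData are assumptions.

use llama_cpp_2::token::LlamaToken;
use llama_cpp_2::token::data::LlamaTokenData;
use llama_cpp_2::token::data_array::LlamaTokenDataArray;

let candidates = vec![
    LlamaTokenData::new(LlamaToken::new(0), 0.1, 0.0),
    LlamaTokenData::new(LlamaToken::new(1), 0.2, 0.0),
    LlamaTokenData::new(LlamaToken::new(2), 0.7, 0.0),
];
let mut candidates = LlamaTokenDataArray::from_iter(candidates, false);
// Keep only the two highest-logit candidates.
candidates.sample_top_k(None, 2, 1);
assert_eq!(candidates.data.len(), 2);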
pub fn sample_tail_free(
    &mut self,
    ctx: Option<&mut LlamaContext<'_>>,
    z: f32,
    min_keep: usize,
)
Tail Free Sampling described in Tail-Free-Sampling.
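§Example
A minimal sketch, not from the crate's docs; z values closer to 1.0 trim the tail less aggressively, so no assertion is made about how many candidates survive. Import paths for LlamaToken and LlamaTokenData are assumptions.

use llama_cpp_2::token::LlamaToken;
use llama_cpp_2::token::data::LlamaTokenData;
use llama_cpp_2::token::data_array::LlamaTokenDataArray;

let candidates = vec![
    LlamaTokenData::new(LlamaToken::new(0), 0.1, 0.0),
    LlamaTokenData::new(LlamaToken::new(1), 0.2, 0.0),
    LlamaTokenData::new(LlamaToken::new(2), 0.7, 0.0),
];
let mut candidates = LlamaTokenDataArray::from_iter(candidates, false);
// Drop the low-probability "tail" of the distribution.
candidates.sample_tail_free(None, 0.9, 1);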
pub fn sample_typical(
    &mut self,
    ctx: Option<&mut LlamaContext<'_>>,
    p: f32,
    min_keep: usize,
)
Locally Typical Sampling implementation described in the paper.
§Example
let candidates = vec![
    LlamaTokenData::new(LlamaToken::new(0), 0.1, 0.0),
    LlamaTokenData::new(LlamaToken::new(1), 0.2, 0.0),
    LlamaTokenData::new(LlamaToken::new(2), 0.7, 0.0),
];
let mut candidates = LlamaTokenDataArray::from_iter(candidates, false);
candidates.sample_typical(None, 0.5, 1);
pub fn sample_top_p(
    &mut self,
    ctx: Option<&mut LlamaContext<'_>>,
    p: f32,
    min_keep: usize,
)
Nucleus sampling described in the academic paper The Curious Case of Neural Text Degeneration.
§Example
let candidates = vec![
    LlamaTokenData::new(LlamaToken::new(0), 0.1, 0.0),
    LlamaTokenData::new(LlamaToken::new(1), 0.2, 0.0),
    LlamaTokenData::new(LlamaToken::new(2), 0.7, 0.0),
];
let mut candidates = LlamaTokenDataArray::from_iter(candidates, false);
candidates.sample_top_p(None, 0.5, 1);
assert_eq!(candidates.data.len(), 2);
assert_eq!(candidates.data[0].id(), LlamaToken::new(2));
assert_eq!(candidates.data[1].id(), LlamaToken::new(1));
pub fn sample_min_p(
    &mut self,
    ctx: Option<&mut LlamaContext<'_>>,
    p: f32,
    min_keep: usize,
)
Minimum P sampling as described in #3841
§Example
let candidates = vec![
    LlamaTokenData::new(LlamaToken::new(4), 0.0001, 0.0),
    LlamaTokenData::new(LlamaToken::new(0), 0.1, 0.0),
    LlamaTokenData::new(LlamaToken::new(1), 0.2, 0.0),
    LlamaTokenData::new(LlamaToken::new(2), 0.7, 0.0),
];
let mut candidates = LlamaTokenDataArray::from_iter(candidates, false);
candidates.sample_min_p(None, 0.05, 1);
pub fn sample_token_mirostat_v2(
    &mut self,
    ctx: &mut LlamaContext<'_>,
    tau: f32,
    eta: f32,
    mu: &mut f32,
) -> LlamaToken
Mirostat 2.0 algorithm described in the paper. Uses tokens instead of words.
§Parameters
- tau - the target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
- eta - the learning rate used to update mu based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause mu to be updated more quickly, while a smaller learning rate will result in slower updates.
- mu - the maximum cross-entropy. This value is initialized to be twice the target cross-entropy (2 * tau) and is updated in the algorithm based on the error between the target and observed surprisal.
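§Example
A minimal usage sketch, not from the crate's docs: mu must start at 2.0 * tau and be carried across calls so the algorithm can adapt it. The mirostat_step helper and the module path for LlamaContext are assumptions.

use llama_cpp_2::context::LlamaContext;
use llama_cpp_2::token::LlamaToken;
use llama_cpp_2::token::data_array::LlamaTokenDataArray;

// Hypothetical helper wrapping one Mirostat 2.0 sampling step.
fn mirostat_step(
    ctx: &mut LlamaContext<'_>,
    candidates: &mut LlamaTokenDataArray,
    tau: f32,     // target cross-entropy (surprise)
    eta: f32,     // learning rate
    mu: &mut f32, // initialized by the caller to 2.0 * tau, updated in place
) -> LlamaToken {
    candidates.sample_token_mirostat_v2(ctx, tau, eta, mu)
}

// Typical setup before the first call:
// let tau = 5.0;
// let mut mu = 2.0 * tau;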
Trait Implementations§
impl Clone for LlamaTokenDataArray
fn clone(&self) -> LlamaTokenDataArray
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl Debug for LlamaTokenDataArray
impl PartialEq for LlamaTokenDataArray
impl StructuralPartialEq for LlamaTokenDataArray
Auto Trait Implementations§
impl Freeze for LlamaTokenDataArray
impl RefUnwindSafe for LlamaTokenDataArray
impl Send for LlamaTokenDataArray
impl Sync for LlamaTokenDataArray
impl Unpin for LlamaTokenDataArray
impl UnwindSafe for LlamaTokenDataArray
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
default unsafe fn clone_to_uninit(&self, dst: *mut T)
This is a nightly-only experimental API (clone_to_uninit).