Struct llama_cpp_2::context::LlamaContext
pub struct LlamaContext<'a> {
    pub model: &'a LlamaModel,
    /* private fields */
}
Safe wrapper around llama_context.
Fields§
§model: &'a LlamaModel
A reference to the context's model.
Implementations§
impl LlamaContext<'_>
pub fn copy_cache(&mut self, src: i32, dest: i32, size: i32)
Copy the cache from one sequence to another.
Parameters
- src - The sequence id to copy the cache from.
- dest - The sequence id to copy the cache to.
- size - The size of the cache to copy.
pub fn clear_kv_cache_seq(&mut self, src: i32, p0: Option<u16>, p1: Option<u16>)
Clear the kv cache for the given sequence.
Parameters
- src - The sequence id to clear the cache for.
- p0 - The start position of the cache to clear. If None, the entire cache is cleared up to [p1].
- p1 - The end position of the cache to clear. If None, the entire cache is cleared from [p0].
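The two cache methods above combine naturally for speculative work: fork a sequence's KV cache, decode on the copy, and discard it if the speculation fails. A minimal sketch, assuming an already-initialized LlamaContext; the sequence ids 0 and 1 and the size of 32 cells are illustrative values, not values mandated by the crate:

```rust
// Sketch: fork a sequence's KV cache, then discard the copy.
fn fork_and_discard(ctx: &mut LlamaContext<'_>) {
    // Duplicate the first 32 cache cells of sequence 0 into sequence 1.
    ctx.copy_cache(0, 1, 32);

    // ... decode speculatively against sequence 1 here ...

    // Drop sequence 1 entirely: with both bounds None, the whole
    // cache for that sequence id is cleared.
    ctx.clear_kv_cache_seq(1, None, None);
}
```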
impl LlamaContext<'_>
pub fn sample(&mut self, sampler: Sampler<'_>) -> LlamaToken
pub fn grammar_accept_token(
    &mut self,
    grammar: &mut LlamaGrammar,
    token: LlamaToken
)
Accept a token into the grammar.
pub fn sample_grammar(
    &mut self,
    llama_token_data_array: &mut LlamaTokenDataArray,
    llama_grammar: &LlamaGrammar
)
Perform grammar sampling.
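Grammar sampling and token acceptance are meant to be used as a pair: filter the candidates through the grammar, pick a token, then feed the choice back so the grammar state advances. A sketch under those assumptions, using only the signatures documented on this page (the greedy pick at the end is one possible choice, not the only one):

```rust
// Sketch: one step of grammar-constrained sampling.
fn sample_with_grammar(
    ctx: &mut LlamaContext<'_>,
    grammar: &mut LlamaGrammar,
    mut candidates: LlamaTokenDataArray,
) -> LlamaToken {
    // Mask out candidates the grammar cannot accept.
    ctx.sample_grammar(&mut candidates, grammar);
    // Choose a token from the filtered candidates (greedy here).
    let token = ctx.sample_token_greedy(candidates);
    // Advance the grammar state with the chosen token.
    ctx.grammar_accept_token(grammar, token);
    token
}
```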
pub fn sample_temp(
    &self,
    token_data: &mut LlamaTokenDataArray,
    temperature: f32
)
Modify [token_data] in place using temperature sampling.
Panics
- [temperature] is not between 0.0 and 1.0.
pub fn sample_token_greedy(&self, token_data: LlamaTokenDataArray) -> LlamaToken
pub fn sample_token_softmax(&self, token_data: &mut LlamaTokenDataArray)
Sorts candidate tokens by their logits in descending order and calculates probabilities from the logits.
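These sampling methods compose into a simple pipeline: temperature-scale the logits, softmax them into probabilities, then pick a token. A sketch, assuming `candidates` was built elsewhere (for example from candidates_ith); the temperature of 0.8 is an illustrative value:

```rust
// Sketch: temperature -> softmax -> greedy pick.
fn pick_token(ctx: &LlamaContext<'_>, mut candidates: LlamaTokenDataArray) -> LlamaToken {
    // Scale logits by temperature (panics outside 0.0..=1.0).
    ctx.sample_temp(&mut candidates, 0.8);
    // Sort by logit and compute probabilities.
    ctx.sample_token_softmax(&mut candidates);
    // Take the highest-probability token.
    ctx.sample_token_greedy(candidates)
}
```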
impl<'model> LlamaContext<'model>
pub fn decode(&mut self, batch: &mut LlamaBatch) -> Result<(), DecodeError>
pub fn candidates_ith(
    &self,
    i: i32
) -> impl Iterator<Item = LlamaTokenData> + '_
pub fn get_logits_ith(&self, i: i32) -> &[f32]
Get the logits for the ith token in the context.
Panics
- i is greater than n_ctx
- n_vocab does not fit into a usize
- logit i is not initialized.
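A typical flow is to decode a batch and then read back logits (or candidates) for a position in that batch. A sketch assuming `ctx` and `batch` are already prepared; the index 0 is an illustrative choice, and note that get_logits_ith panics rather than returning an error if that position's logits were never computed:

```rust
// Sketch: decode a batch, then inspect the results for one position.
fn decode_and_inspect(
    ctx: &mut LlamaContext<'_>,
    batch: &mut LlamaBatch,
) -> Result<(), DecodeError> {
    ctx.decode(batch)?;
    // Raw logits for position 0 of this batch (panics if uninitialized).
    let _logits: &[f32] = ctx.get_logits_ith(0);
    // Or walk the per-token candidates for the same position instead.
    let _n_candidates = ctx.candidates_ith(0).count();
    Ok(())
}
```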
pub fn reset_timings(&mut self)
Reset the timings for the context.
pub fn timings(&mut self) -> LlamaTimings
Returns the timings for the context.
Trait Implementations§
impl Debug for LlamaContext<'_>
Auto Trait Implementations§
impl<'a> RefUnwindSafe for LlamaContext<'a>
impl<'a> !Send for LlamaContext<'a>
impl<'a> !Sync for LlamaContext<'a>
impl<'a> Unpin for LlamaContext<'a>
impl<'a> UnwindSafe for LlamaContext<'a>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.