pub struct LlamaBatch<'tokens> {
    pub initialized_logits: Vec<i32>,
    pub llama_batch: llama_batch,
    /* private fields */
}
A safe wrapper around llama_batch.
PartialEq is intentionally not implemented because the underlying llama_batch
from the C API contains raw pointers whose address comparison would be meaningless.
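As a rough illustration of why address comparison says little about equality, the sketch below uses two plain slices as hypothetical stand-ins for the C-side arrays. It does not touch llama_batch itself, and compare_buffers is a made-up helper name. Two buffers can hold identical contents while living at different addresses, so the two notions of equality disagree.

```rust
/// Returns (contents_equal, addresses_equal) for two buffers.
/// The buffers are illustrative stand-ins, not fields of `llama_batch`.
fn compare_buffers(a: &[i32], b: &[i32]) -> (bool, bool) {
    // Slice equality compares the elements.
    let contents_equal = a == b;
    // `std::ptr::eq` compares addresses only, never the pointed-to data.
    let addresses_equal = std::ptr::eq(a.as_ptr(), b.as_ptr());
    (contents_equal, addresses_equal)
}
```

A cloned Vec compares equal by content but not by address, which is the situation a derived PartialEq on raw pointers would get wrong.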
Fields§
initialized_logits: Vec<i32>
The logits that are initialized. Used by LlamaContext to ensure that only initialized logits are accessed.
llama_batch: llama_batch
The underlying llama_batch from the C API.
Implementations§
impl<'tokens> LlamaBatch<'tokens>
pub fn clear(&mut self)
Clear the batch. This does not free the memory associated with the batch, but it does reset the number of tokens to 0.
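The "reset the count, keep the memory" behaviour can be sketched with a toy type backed by a Vec. ToyBatch and all of its methods are hypothetical names used only for illustration. The real LlamaBatch manages C-allocated buffers, so this only models the semantics described above.

```rust
/// Hypothetical toy batch backed by a `Vec`; not the crate's `LlamaBatch`.
struct ToyBatch {
    tokens: Vec<i32>,
}

impl ToyBatch {
    /// Allocates room for at least `n_tokens` tokens up front.
    fn with_capacity(n_tokens: usize) -> Self {
        Self { tokens: Vec::with_capacity(n_tokens) }
    }

    fn push(&mut self, token: i32) {
        self.tokens.push(token);
    }

    /// Resets the token count to 0 but keeps the allocation for reuse.
    fn clear(&mut self) {
        self.tokens.clear();
    }

    fn n_tokens(&self) -> usize {
        self.tokens.len()
    }

    fn capacity(&self) -> usize {
        self.tokens.capacity()
    }
}
```

Under this model, clearing between decode steps lets one allocation be reused for every step instead of building a new batch each time.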
pub fn add(
    &mut self,
    sampled_token: &SampledToken,
    pos: llama_pos,
    seq_ids: &[i32],
    logits: bool,
) -> Result<(), BatchAddError>
Add a token to the batch for the sequences in seq_ids at position pos. If logits is true, the
token's logits will be initialized and can be read after the next decode.
§Errors
Returns an error if there is insufficient space in the buffer or if integer conversions fail.
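The bookkeeping and the two error conditions can be sketched with a simplified model. ToyBatch, ToyAddError and their fields are hypothetical. The sketch takes a bare i32 token instead of a SampledToken, omits seq_ids, and assumes that the tracked logits are stored as slot indices, so it is not the crate's implementation.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Hypothetical error type for this sketch; the crate's `BatchAddError` may differ.
#[derive(Debug, PartialEq)]
enum ToyAddError {
    InsufficientSpace(usize),
    IntConversion(TryFromIntError),
}

/// Hypothetical toy batch; it models only the bookkeeping described above.
struct ToyBatch {
    capacity: usize,
    tokens: Vec<i32>,
    positions: Vec<i32>,
    /// Assumption: slot indices of the tokens whose logits were requested.
    initialized_logits: Vec<i32>,
}

impl ToyBatch {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            tokens: Vec::new(),
            positions: Vec::new(),
            initialized_logits: Vec::new(),
        }
    }

    fn add(&mut self, token: i32, pos: i32, logits: bool) -> Result<(), ToyAddError> {
        // Reject the token when the buffer is already full.
        if self.tokens.len() >= self.capacity {
            return Err(ToyAddError::InsufficientSpace(self.capacity));
        }
        // The slot index must fit in an i32, mirroring the integer-conversion error.
        let index = i32::try_from(self.tokens.len()).map_err(ToyAddError::IntConversion)?;
        self.tokens.push(token);
        self.positions.push(pos);
        if logits {
            // Remember which slot will have readable logits after decoding.
            self.initialized_logits.push(index);
        }
        Ok(())
    }
}
```

In this model only slots added with logits set to true are recorded, which is the kind of record a context could consult before handing out logits.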
pub fn add_sequence(
    &mut self,
    tokens: &[LlamaToken],
    seq_id: i32,
    logits_all: bool,
) -> Result<(), BatchAddError>
Add a sequence of tokens to the batch for the given sequence id. If logits_all is true, the
logits for every token will be initialized and can be read after the next decode.
Either way, the last token in the sequence will have its logits flag set to true.
§Errors
Returns an error if there is insufficient space in the buffer or if integer conversions fail.
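The rule for which tokens get logits can be written as a small pure function. logits_flags is a hypothetical helper that is not part of the crate. It only restates the rule described above: every token when logits_all is true, otherwise just the last one.

```rust
/// Computes the per-token logits flags for a sequence of `n_tokens` tokens.
/// Hypothetical helper for illustration, not part of the crate.
fn logits_flags(n_tokens: usize, logits_all: bool) -> Vec<bool> {
    (0..n_tokens)
        // The last token (index n_tokens - 1) is always flagged.
        .map(|i| logits_all || i + 1 == n_tokens)
        .collect()
}
```

For a three-token prompt with logits_all set to false, only the final slot is flagged, which is typically all that is needed to sample the next token.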
pub fn new(n_tokens: usize, n_seq_max: i32) -> Result<Self, BatchAddError>
Create a new LlamaBatch that can contain up to n_tokens tokens.
§Arguments
n_tokens: the maximum number of tokens that can be added to the batch
n_seq_max: the maximum number of sequences that can be added to the batch (generally 1 unless you know what you are doing)
§Errors
Returns an error if n_tokens exceeds i32::MAX.
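The documented error condition matches what a checked conversion from usize to i32 would produce. The sketch below assumes that is roughly how the limit is enforced. checked_capacity is a hypothetical helper, and the real constructor returns a BatchAddError, not the standard library's TryFromIntError.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Hypothetical helper: validates a requested capacity by converting it to
/// `i32`, which fails exactly when `n_tokens` exceeds `i32::MAX`.
fn checked_capacity(n_tokens: usize) -> Result<i32, TryFromIntError> {
    i32::try_from(n_tokens)
}
```

Any realistic batch size such as 512 passes the check. The error only matters for values that would overflow the C API's 32-bit token count.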
pub fn get_one(tokens: &'tokens [LlamaToken]) -> Result<Self, BatchAddError>
Wraps llama_batch_get_one from the C API: returns a batch for a single sequence of tokens.
NOTE: this is a helper function to facilitate the transition to the new batch API.
§Errors
Returns an error if the provided token buffer is empty or if integer conversions fail.
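The 'tokens lifetime in the signature suggests that the returned batch borrows the caller's slice instead of copying it. The sketch below models that borrowing and the two documented error conditions. ToyBatchView and ToyViewError are hypothetical names, and the real function works on LlamaToken values and returns a BatchAddError.

```rust
use std::convert::TryFrom;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum ToyViewError {
    EmptyBuffer,
    TooManyTokens,
}

/// Hypothetical borrowed view; the lifetime ties it to the caller's slice,
/// so the slice must outlive the view.
#[derive(Debug, PartialEq)]
struct ToyBatchView<'tokens> {
    tokens: &'tokens [i32],
    n_tokens: i32,
}

impl<'tokens> ToyBatchView<'tokens> {
    fn get_one(tokens: &'tokens [i32]) -> Result<Self, ToyViewError> {
        // An empty buffer is rejected up front.
        if tokens.is_empty() {
            return Err(ToyViewError::EmptyBuffer);
        }
        // The token count must fit in the 32-bit integer used by the C API.
        let n_tokens = i32::try_from(tokens.len()).map_err(|_| ToyViewError::TooManyTokens)?;
        Ok(Self { tokens, n_tokens })
    }
}
```

Because the view only borrows, the borrow checker rejects code that drops or mutates the token buffer while the view is still in use.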
Trait Implementations§
impl<'tokens> Debug for LlamaBatch<'tokens>
Auto Trait Implementations§
impl<'tokens> Freeze for LlamaBatch<'tokens>
impl<'tokens> RefUnwindSafe for LlamaBatch<'tokens>
impl<'tokens> !Send for LlamaBatch<'tokens>
impl<'tokens> !Sync for LlamaBatch<'tokens>
impl<'tokens> Unpin for LlamaBatch<'tokens>
impl<'tokens> UnsafeUnpin for LlamaBatch<'tokens>
impl<'tokens> UnwindSafe for LlamaBatch<'tokens>
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.