Struct MtmdContext 

pub struct MtmdContext { /* private fields */ }

The main multimodal context.

Wraps a raw mtmd_context * pointer. The context is tied to a specific mmproj model file and a loaded LlamaModel. It can be shared across threads for tokenize calls, which are read-only. encode_chunk and the eval helpers (eval_chunks, eval_chunk_single) mutate internal state and must not be called concurrently.
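The threading rule above can be illustrated with a small, self-contained sketch. It does not use this crate: FakeContext, its methods, and run_workers are hypothetical stand-ins. It shows one possible pattern, assuming the caller chooses to serialise the mutating calls with a Mutex while leaving read-only calls unguarded.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for a context with read-only and mutating operations.
/// It is NOT the real MtmdContext; it only models the locking pattern.
struct FakeContext {
    /// Counts "encode" calls; the Mutex serialises them.
    encode_calls: Mutex<u32>,
}

impl FakeContext {
    fn new() -> Self {
        Self { encode_calls: Mutex::new(0) }
    }

    /// Read-only operation: may run from many threads at once.
    fn tokenize(&self, text: &str) -> usize {
        text.split_whitespace().count()
    }

    /// Mutating operation: the lock ensures only one thread runs it at a time.
    fn encode(&self) -> u32 {
        let mut calls = self.encode_calls.lock().unwrap();
        *calls += 1;
        *calls
    }
}

/// Spawns `n_threads` workers that each tokenize and then encode once.
/// Returns the total number of encode calls that were made.
fn run_workers(n_threads: usize) -> u32 {
    let ctx = Arc::new(FakeContext::new());
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            let ctx = Arc::clone(&ctx);
            thread::spawn(move || {
                let _n_words = ctx.tokenize("describe this image");
                ctx.encode();
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *ctx.encode_calls.lock().unwrap();
    total
}
```

With the real type, the same idea would mean wrapping the calls to encode_chunk and the eval helpers in a lock owned by the application, since the methods themselves take &self.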

Implementations§

impl MtmdContext

pub fn default_marker() -> &'static str

Returns the default media marker string used in prompts (currently "<__media__>").
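As a rough illustration of how the marker might be placed in a prompt, the sketch below builds a prompt string with one marker per media item. The MARKER constant simply copies the value quoted above, and build_prompt is a hypothetical helper, not part of this crate. Real code would likely call default_marker() instead of hard-coding the string.

```rust
/// Assumed marker value, copied from the description above.
const MARKER: &str = "<__media__>";

/// Hypothetical helper: puts `n_media` markers, one per line,
/// in front of the user's question.
fn build_prompt(question: &str, n_media: usize) -> String {
    let mut prompt = String::new();
    for _ in 0..n_media {
        prompt.push_str(MARKER);
        prompt.push('\n');
    }
    prompt.push_str(question);
    prompt
}
```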

pub fn init_from_file(mmproj_path: impl AsRef<Path>, text_model: &LlamaModel, params: MtmdContextParams) -> Result<Self>

Initialise a multimodal context from an mmproj GGUF file.

§Parameters
  • mmproj_path – path to the mmproj .gguf file
  • text_model – the already-loaded text model
  • params – context parameters (use MtmdContextParams::default())
§Errors

Returns MtmdError::ContextCreateFailed if the underlying C call returns a null pointer.
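The null-pointer check described above follows a common FFI pattern. The sketch below shows that pattern in isolation, assuming a hypothetical InitError enum and wrap_ptr helper. It is not the crate's actual implementation.

```rust
use std::ptr::NonNull;

/// Hypothetical error type standing in for the crate's MtmdError.
#[derive(Debug, PartialEq)]
enum InitError {
    ContextCreateFailed,
}

/// Converts a raw pointer returned by a C constructor into a Result:
/// a null pointer becomes an error, anything else becomes a NonNull.
fn wrap_ptr<T>(raw: *mut T) -> Result<NonNull<T>, InitError> {
    NonNull::new(raw).ok_or(InitError::ContextCreateFailed)
}
```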

pub fn void_logs()

Silence all clip/mtmd log output by installing a no-op callback.

Call this right after init_from_file to suppress the verbose clip_model_loader: tensor[N]… lines that clip.cpp emits to its own private logger (separate from llama_log_set).

pub fn supports_vision(&self) -> bool

Returns true if the model supports vision (image) input.

pub fn supports_audio(&self) -> bool

Returns true if the model supports audio input.

pub fn audio_bitrate(&self) -> i32

Deprecated: use audio_sample_rate() instead.

Returns the audio sample rate in Hz (e.g. 16 000 for Whisper), or -1 if audio is not supported.
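Because the rate is reported as an i32 with -1 as the "unsupported" sentinel, callers may prefer to convert it into an Option before use. The helper below is a hypothetical convenience, not part of this crate. It treats any non-positive value as "no audio support", which is an assumption beyond the documented -1.

```rust
/// Hypothetical helper: maps the raw i32 sample rate to an Option.
/// Non-positive values (including the documented -1) become None.
fn sample_rate_hz(raw: i32) -> Option<u32> {
    if raw > 0 {
        Some(raw as u32)
    } else {
        None
    }
}
```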

pub fn audio_sample_rate(&self) -> i32

Returns the audio sample rate in Hz.

pub fn decode_use_non_causal(&self) -> bool

Whether llama_decode must use a non-causal attention mask when decoding image embeddings for this model.

pub fn decode_use_mrope(&self) -> bool

Whether the model uses M-RoPE for llama_decode.

pub fn tokenize(&self, text: &MtmdInputText<'_>, bitmaps: &[&MtmdBitmap], output: &mut MtmdInputChunks) -> Result<()>

Tokenize a text prompt that contains one or more media markers.

The number of bitmaps must equal the number of media markers in the prompt text; otherwise MtmdError::TokenizeError(1) is returned.

This call is thread-safe: it only reads from the context.

§Parameters
  • text – text + tokenisation options
  • bitmaps – slice of MtmdBitmap references, one per media marker
  • output – an MtmdInputChunks that will be populated with the result
§Errors

Returns MtmdError::TokenizeError if tokenization fails.
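The marker/bitmap count rule can be checked on the caller's side before tokenize is invoked. The sketch below is a self-contained illustration, where count_markers and check_bitmap_count are hypothetical helpers and the marker string is passed in by the caller (for example, the value returned by default_marker()).

```rust
/// Counts non-overlapping occurrences of `marker` in `prompt`.
/// An empty marker is treated as "no markers".
fn count_markers(prompt: &str, marker: &str) -> usize {
    if marker.is_empty() {
        return 0;
    }
    prompt.matches(marker).count()
}

/// Hypothetical pre-flight check: the number of bitmaps must match
/// the number of markers found in the prompt.
fn check_bitmap_count(prompt: &str, marker: &str, n_bitmaps: usize) -> Result<(), String> {
    let n_markers = count_markers(prompt, marker);
    if n_markers == n_bitmaps {
        Ok(())
    } else {
        Err(format!(
            "prompt has {} marker(s) but {} bitmap(s) were supplied",
            n_markers, n_bitmaps
        ))
    }
}
```

Running such a check first gives a more descriptive message than the numeric error code, at the cost of scanning the prompt twice.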

pub fn encode_chunk(&self, chunk: &MtmdInputChunk<'_>) -> Result<()>

Encode a single input chunk (image or audio) and store the resulting embeddings inside the context.

After a successful call, the embeddings can be retrieved with MtmdContext::output_embd.

This call is NOT thread-safe.

§Errors

Returns MtmdError::EncodeError if encoding fails.

pub fn output_embd(&self, n_elements: usize) -> &[f32]

Returns a slice over the embeddings produced by the last encode_chunk call.

The caller supplies the slice length as n_elements. For the chunk that was just encoded, the expected length (in f32 elements) is:

n_embd_inp(model) * chunk.n_tokens()

§Safety

The returned slice is valid until the next call that mutates the context (e.g. another encode_chunk).
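The length formula above can be worked through in a short sketch. Both functions are hypothetical helpers operating on plain numbers and slices. They assume the embeddings are laid out as one contiguous row of n_embd values per token, which this page does not state explicitly.

```rust
/// Expected number of f32 elements: n_embd * n_tokens.
/// Returns None if the multiplication would overflow usize.
fn expected_embd_len(n_embd: usize, n_tokens: usize) -> Option<usize> {
    n_embd.checked_mul(n_tokens)
}

/// Splits a flat embedding buffer into per-token rows of `n_embd` values.
/// Returns None if `n_embd` is zero or the buffer length is not a multiple of it.
fn split_rows(embd: &[f32], n_embd: usize) -> Option<Vec<&[f32]>> {
    if n_embd == 0 || embd.len() % n_embd != 0 {
        return None;
    }
    Some(embd.chunks_exact(n_embd).collect())
}
```

For example, a model with an embedding width of 4096 and a chunk of 256 tokens would give 4096 * 256 = 1,048,576 elements.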

pub fn eval_chunks(&self, lctx: *mut llama_context, chunks: &MtmdInputChunks, n_past: i32, seq_id: i32, n_batch: i32, logits_last: bool, new_n_past: &mut i32) -> Result<()>

High-level helper: evaluate (decode) all chunks in sequence.

  • Text chunks are decoded via llama_decode.
  • Image/audio chunks are first encoded with mtmd_encode_chunk and then decoded via llama_decode.

On success new_n_past is updated with the new past position.

This call is NOT thread-safe.

§Parameters
  • lctx – raw pointer to the llama context (from LlamaContext::as_ptr)
  • chunks – the tokenized chunks to evaluate
  • n_past – current KV-cache position
  • seq_id – sequence ID
  • n_batch – maximum batch size (must be ≥ 1)
  • logits_last – if true, compute logits only for the final token
  • new_n_past – updated KV-cache position after the call
§Errors

Returns MtmdError::EvalError if evaluation fails.
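To make the n_past / new_n_past bookkeeping concrete, the following sketch simulates it without touching a real model. Chunk and simulate_eval are hypothetical. The sketch assumes that every chunk is decoded in batches of at most n_batch tokens and that each token advances the position by one, which may not hold for models that use M-RoPE.

```rust
/// Hypothetical simplified chunk: only the number of tokens it contains.
enum Chunk {
    Text(i32),
    Media(i32),
}

/// Simulates the position bookkeeping of an eval loop.
/// Returns (new_n_past, number_of_decode_batches), or None on invalid input
/// (n_batch < 1, a negative token count, or position overflow).
fn simulate_eval(chunks: &[Chunk], n_past: i32, n_batch: i32) -> Option<(i32, i32)> {
    if n_batch < 1 {
        return None;
    }
    let mut pos = n_past;
    let mut batches = 0i32;
    for chunk in chunks {
        let n_tokens = match *chunk {
            Chunk::Text(n) | Chunk::Media(n) => n,
        };
        if n_tokens < 0 {
            return None;
        }
        // Ceiling division: full batches plus one partial batch if needed.
        batches += n_tokens / n_batch + i32::from(n_tokens % n_batch != 0);
        pos = pos.checked_add(n_tokens)?;
    }
    Some((pos, batches))
}
```

In this model, a 5-token text chunk, a 256-token media chunk, and a 3-token text chunk evaluated from position 0 with n_batch = 128 end at position 264 after four decode batches.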

pub fn eval_chunk_single(&self, lctx: *mut llama_context, chunk: &MtmdInputChunk<'_>, n_past: i32, seq_id: i32, n_batch: i32, logits_last: bool, new_n_past: &mut i32) -> Result<()>

High-level helper: evaluate a single chunk.

Behaves like eval_chunks, with the same parameters, but evaluates a single chunk.

§Errors

Returns MtmdError::EvalError if evaluation fails.

pub fn as_ptr(&self) -> *mut mtmd_context

Returns a raw pointer to the underlying mtmd_context.

§Safety

The returned pointer is valid for the lifetime of this MtmdContext. The caller must not free it.

Trait Implementations§

impl Debug for MtmdContext

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Drop for MtmdContext

fn drop(&mut self)

Executes the destructor for this type.

impl Send for MtmdContext

impl Sync for MtmdContext
