pub struct ChatModel { /* private fields */ }
High-level model for token-level text generation.
Load a trained checkpoint and generate token IDs in a single call.
ChatModel automatically discovers the model config next to the
checkpoint file. Users bring their own tokenizer to encode/decode text.
§Example
use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let token_ids = model.generate(&[1, 2, 3], GenerationConfig::default())?;
    println!("generated tokens: {:?}", token_ids);
    Ok(())
}
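Since ChatModel only deals in token IDs, a typical round trip pairs it with a tokenizer of your choice. A minimal sketch, assuming a hypothetical Tokenizer trait that is not part of this crate:

use multiscreen_rs::prelude::*;

// Stand-in for whatever tokenizer you pair with the model; hypothetical,
// not provided by multiscreen_rs.
trait Tokenizer {
    fn encode(&self, text: &str) -> Vec<u32>;
    fn decode(&self, ids: &[u32]) -> String;
}

fn chat(model: &ChatModel, tok: &impl Tokenizer, prompt: &str) -> multiscreen_rs::Result<String> {
    // Text -> token IDs with your own tokenizer.
    let prompt_ids = tok.encode(prompt);
    // The result is the prompt IDs followed by the newly generated IDs.
    let output_ids = model.generate(&prompt_ids, GenerationConfig::default())?;
    // Token IDs -> text.
    Ok(tok.decode(&output_ids))
}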
§Implementations

impl ChatModel
pub fn load(path: impl AsRef<Path>) -> Result<Self>
Load a ChatModel from a checkpoint path.
path should point to a .mpk weights file (e.g.
"checkpoints/latest.mpk" or "runs/chat/checkpoints/latest.mpk").
The model architecture is read from a config.json located in the checkpoint’s
parent directory; if no config is found there, load falls back to the
Params10M defaults.
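Concretely, a checkpoint directory is expected to look roughly like this (paths taken from the examples above; config.json is optional):

runs/chat/checkpoints/
├── latest.mpk    <- weights file passed to ChatModel::load
└── config.json   <- model architecture; Params10M defaults if absent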
pub fn generate(&self, prompt: &[u32], config: GenerationConfig) -> Result<Vec<u32>>
Generate token IDs from a prompt token sequence.
Returns the full token sequence (prompt followed by the newly generated tokens) in one call.
For streaming / token-by-token output, use Self::generate_stream.
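Because the result begins with the prompt, the newly generated tokens are simply the tail of the returned Vec. A minimal sketch:

use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let prompt: &[u32] = &[1, 2, 3];
    let full = model.generate(prompt, GenerationConfig::default())?;
    // The output echoes the prompt; everything after it is newly generated.
    let new_tokens = &full[prompt.len()..];
    println!("{} new tokens: {:?}", new_tokens.len(), new_tokens);
    Ok(())
}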
pub fn generate_stream(
    &self,
    prompt: &[u32],
    config: GenerationConfig,
    on_token: impl FnMut(u32, usize) -> bool,
) -> Result<Vec<u32>>
Generate token IDs one at a time, invoking a callback for each newly produced token.
This enables streaming / word-by-word output similar to ChatGPT.
The callback receives (token_id, index) where index is the
zero-based position of the new token (0 = first generated token).
Return false from the callback to stop generation early.
Returns the full output sequence (prompt + generated tokens).
§Example
use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let prompt: &[u32] = &[1, 2, 3];
    let full_output = model.generate_stream(
        prompt,
        GenerationConfig::default(),
        |token_id, _index| {
            // Stream each token as it is produced.
            // Decode with YOUR tokenizer and print word-by-word.
            print!("{} ", token_id); // Replace with actual decoding
            true // return false to stop early
        },
    )?;
    println!("\nFull sequence: {:?}", full_output);
    Ok(())
}
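The callback’s return value can also implement early stopping, e.g. halting at an end-of-sequence token. A sketch, where EOS_ID is a hypothetical id that depends on your tokenizer:

use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    // Hypothetical end-of-sequence id; use whatever your tokenizer reserves.
    const EOS_ID: u32 = 0;
    let output = model.generate_stream(
        &[1, 2, 3],
        GenerationConfig::default(),
        // Keep generating until EOS appears or 32 new tokens were emitted.
        |token_id, index| token_id != EOS_ID && index < 31,
    )?;
    println!("stopped with {} tokens in total", output.len());
    Ok(())
}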
pub fn predict_logits(&self, context: &[u32]) -> Result<Tensor<DefaultBackend, 3>>
Run a forward pass on the padded context and return logits.
Returns a tensor of shape [1, seq_len, vocab_size].
Use this for custom sampling strategies (top-k, temperature, etc.).
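For example, a greedy next-token step could look like the sketch below; the tensor calls (slice, argmax, into_scalar) assume a burn-style tensor API and should be checked against the Tensor type this crate re-exports:

use multiscreen_rs::prelude::*;

fn main() -> multiscreen_rs::Result<()> {
    let model = ChatModel::load("checkpoints/latest.mpk")?;
    let context: &[u32] = &[1, 2, 3];

    // Shape [1, seq_len, vocab_size]; seq_len may exceed context.len()
    // because the context is padded before the forward pass.
    let logits = model.predict_logits(context)?;

    // Greedy step (assumed burn-style tensor API): read the logits at the
    // last real context position and take the highest-scoring vocab id.
    let pos = context.len() - 1;
    let next = logits.slice([0..1, pos..pos + 1]).argmax(2).into_scalar();
    println!("greedy next token: {}", next);
    Ok(())
}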
pub fn model(&self) -> &MultiscreenModel<DefaultBackend>
Access the underlying neural model.
pub fn config(&self) -> &MultiscreenModelConfig
Access the model configuration.
pub fn device(&self) -> &<DefaultBackend as BackendTypes>::Device
Access the device the model is running on.