pub struct Qwen2Model { /* private fields */ }
Complete Qwen2 model for inference.
Assembles embedding, decoder layers, and LM head into a complete model.
Implementations§
impl Qwen2Model
pub fn new(config: &Qwen2Config) -> Self
Create a new Qwen2 model from configuration.
Weights are initialized randomly. To use pre-trained weights, load them with from_safetensors() or load_from_apr().
pub fn new_uninitialized(config: &Qwen2Config) -> Self
Create an uninitialized Qwen2 model with minimal memory allocation.
The model is not ready for inference until weights are loaded.
pub fn forward_profiled(&mut self, input_ids: &[u32], position_ids: &[usize]) -> Tensor
Runs a forward pass with detailed profiling output and prints a timing breakdown for each component.
pub fn generate(&mut self, prompt_ids: &[u32], max_new_tokens: usize, temperature: f32, _top_p: f32) -> Vec<u32>
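The page gives no description for generate, so the following is only a minimal sketch of how an autoregressive loop with these parameters is commonly structured; it is not this crate's implementation. Everything in it is hypothetical: the closure next_token_logits stands in for the model's forward pass, uniform stands in for a random number source, and the sketch assumes the returned vector contains the prompt followed by the new tokens and that a temperature of 0 means greedy decoding. The leading underscore on _top_p suggests that parameter is currently unused, so nucleus sampling is left out.

```rust
/// Softmax over `logits / temperature`. Assumes `temperature > 0`.
fn softmax_with_temperature(logits: &[f32], temperature: f32) -> Vec<f32> {
    // Subtract the maximum logit first for numerical stability.
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits
        .iter()
        .map(|&l| ((l - max) / temperature).exp())
        .collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Index of the largest value (the greedy choice).
fn argmax(values: &[f32]) -> u32 {
    let mut best = 0usize;
    for (i, &v) in values.iter().enumerate() {
        if v > values[best] {
            best = i;
        }
    }
    best as u32
}

/// Pick an index from `probs` given a uniform draw `u` in [0, 1).
fn sample_from(probs: &[f32], u: f32) -> u32 {
    let mut cumulative = 0.0f32;
    for (i, &p) in probs.iter().enumerate() {
        cumulative += p;
        if u < cumulative {
            return i as u32;
        }
    }
    // Fall back to the last index if rounding kept `cumulative` below `u`.
    probs.len().saturating_sub(1) as u32
}

/// Hypothetical generation loop: `next_token_logits` stands in for a model
/// forward pass and `uniform` for a random number source.
fn generate_sketch<F, R>(
    mut next_token_logits: F,
    prompt_ids: &[u32],
    max_new_tokens: usize,
    temperature: f32,
    mut uniform: R,
) -> Vec<u32>
where
    F: FnMut(&[u32]) -> Vec<f32>,
    R: FnMut() -> f32,
{
    let mut ids = prompt_ids.to_vec();
    for _ in 0..max_new_tokens {
        let logits = next_token_logits(&ids);
        let next = if temperature <= 0.0 {
            // Assumption: non-positive temperature selects greedy decoding.
            argmax(&logits)
        } else {
            let probs = softmax_with_temperature(&logits, temperature);
            sample_from(&probs, uniform())
        };
        ids.push(next);
    }
    ids
}
```

Calling generate_sketch with a closure that returns fixed logits, a prompt of two ids, and max_new_tokens set to 3 yields a vector of five ids under the assumptions above. Higher temperatures flatten the distribution, while values near zero approach the greedy choice.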
pub fn generate_profiled(&mut self, prompt_ids: &[u32], max_new_tokens: usize, temperature: f32) -> Vec<u32>
Generate with profiling output (prints timing breakdown).
pub fn config(&self) -> &Qwen2Config
Get model configuration.
pub fn enable_cache(&mut self)
Enable KV cache for efficient generation.
pub fn disable_cache(&mut self)
Disable KV cache.
pub fn clear_cache(&mut self)
Clear KV cache.
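To illustrate why a KV cache makes generation cheaper, here is a toy sketch of the enable, disable, and clear operations. ToyKvCache and positions_to_project are hypothetical names invented for this example and say nothing about how Qwen2Model stores its cache. In particular, the sketch assumes that disabling the cache also drops its contents, which the documentation above does not state.

```rust
/// Toy key/value cache for one attention layer. All names are hypothetical.
#[derive(Default)]
struct ToyKvCache {
    enabled: bool,
    keys: Vec<Vec<f32>>,   // one key vector per cached position
    values: Vec<Vec<f32>>, // one value vector per cached position
}

impl ToyKvCache {
    fn enable(&mut self) {
        self.enabled = true;
    }

    /// Assumption: disabling also drops any stored entries.
    fn disable(&mut self) {
        self.enabled = false;
        self.clear();
    }

    /// Forget all cached positions, e.g. before starting a new prompt.
    fn clear(&mut self) {
        self.keys.clear();
        self.values.clear();
    }

    fn len(&self) -> usize {
        self.keys.len()
    }

    /// Store the key/value of a newly processed position when enabled.
    fn append(&mut self, key: Vec<f32>, value: Vec<f32>) {
        if self.enabled {
            self.keys.push(key);
            self.values.push(value);
        }
    }
}

/// Number of positions whose keys and values must be computed at a decoding
/// step: only the uncached ones when the cache is on, otherwise all of them.
fn positions_to_project(cache: &ToyKvCache, sequence_len: usize) -> usize {
    if cache.enabled {
        sequence_len.saturating_sub(cache.len())
    } else {
        sequence_len
    }
}
```

With four positions cached and a sequence of length five, positions_to_project returns 1 instead of 5, which is the saving a cache is meant to provide. Clearing between unrelated prompts keeps stale entries from being reused.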
pub fn num_layers(&self) -> usize
Get number of layers.
pub fn weight_names(&self) -> Vec<String>
Get list of weight names following HuggingFace convention.
Returns names like:
model.embed_tokens.weight
model.layers.0.self_attn.q_proj.weight
model.norm.weight
lm_head.weight
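As a rough illustration of the naming scheme, the sketch below builds such a list for a given number of layers. The function hf_style_weight_names is hypothetical. The per-layer suffixes follow the names used by HuggingFace Qwen2 checkpoints, and only the q_proj entry is confirmed by the examples above. Bias tensors are omitted, and the exact set and order returned by weight_names() may differ.

```rust
/// Hypothetical helper: builds HuggingFace-style weight names for a decoder
/// with `num_layers` layers. Only weight tensors are listed; biases are omitted.
fn hf_style_weight_names(num_layers: usize) -> Vec<String> {
    // Assumption: per-layer tensors as named in HuggingFace Qwen2 checkpoints.
    let per_layer = [
        "self_attn.q_proj.weight",
        "self_attn.k_proj.weight",
        "self_attn.v_proj.weight",
        "self_attn.o_proj.weight",
        "mlp.gate_proj.weight",
        "mlp.up_proj.weight",
        "mlp.down_proj.weight",
        "input_layernorm.weight",
        "post_attention_layernorm.weight",
    ];

    let mut names = vec!["model.embed_tokens.weight".to_string()];
    for layer in 0..num_layers {
        for suffix in per_layer.iter() {
            names.push(format!("model.layers.{}.{}", layer, suffix));
        }
    }
    names.push("model.norm.weight".to_string());
    names.push("lm_head.weight".to_string());
    names
}
```

For a two-layer model this produces 21 names, starting with model.embed_tokens.weight and ending with lm_head.weight.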
pub fn weight_info(&self) -> HashMap<String, Vec<usize>>
Get weight shapes as a map from name to shape.
pub fn weights(&self) -> HashMap<String, Vec<f32>>
Extract accessible weights as a map from name to f32 data.
Returns a map suitable for serialization to SafeTensors format.
Note: Currently returns weights from components with public accessors.
Full weight export will be enabled when nn modules expose weight accessors.
pub fn num_parameters(&self) -> usize
Get total number of parameters in the model.
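One plausible way to relate this count to weight_info() is to sum the element counts of every reported shape. The sketch below shows that calculation on a plain map. count_parameters is a hypothetical helper, and it is an assumption that num_parameters() agrees with this sum, for example when weights are tied or not all tensors are reported.

```rust
use std::collections::HashMap;

/// Hypothetical helper: total element count over a name-to-shape map, in the
/// form returned by `weight_info()`.
fn count_parameters(shapes: &HashMap<String, Vec<usize>>) -> usize {
    shapes
        .values()
        // The element count of a tensor is the product of its dimensions;
        // an empty shape (a scalar) contributes 1.
        .map(|shape| shape.iter().product::<usize>())
        .sum()
}
```

For a map holding a [10, 4] embedding and a [4] norm vector, the helper returns 44.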
pub fn embed_tokens_mut(&mut self) -> &mut Embedding
Get mutable reference to embedding layer.
pub fn layer_mut(&mut self, idx: usize) -> Option<&mut Qwen2DecoderLayer>
Get mutable reference to decoder layer at index.
pub fn lm_head_mut(&mut self) -> &mut Linear
Get mutable reference to language model head.
pub fn lm_head(&self) -> &Linear
Get reference to language model head (for testing/inspection).
pub fn from_safetensors(config: &Qwen2Config, path: &Path) -> Result<Self, String>
Load model from SafeTensors file.
Creates a new model with the given config and loads weights from file.
pub fn load_from_apr(&mut self, path: &Path) -> Result<usize, String>
Load weights from APR v2 format file.
Per Native Library Mandate (Spec §2.4): Uses mmap via bundle::MappedFile
for zero-copy tensor access. This is the REQUIRED approach for APR files.
Note: APR canonical names don’t have the “model.” prefix (it’s stripped during import per format/converter.rs). We look for names without prefix.
§Returns
Number of weights loaded
§Errors
Returns error if file cannot be read or weights don’t match.
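The name mapping described in the note above can be sketched with a small helper. apr_canonical_name is a hypothetical function written for illustration. The actual stripping happens during import in format/converter.rs and may handle cases this sketch does not.

```rust
/// Hypothetical helper: maps a HuggingFace-style tensor name to the form
/// without the leading "model." prefix. Names lacking the prefix are
/// returned unchanged.
fn apr_canonical_name(hf_name: &str) -> &str {
    hf_name.strip_prefix("model.").unwrap_or(hf_name)
}
```

Under this sketch, model.layers.0.self_attn.q_proj.weight becomes layers.0.self_attn.q_proj.weight, while lm_head.weight is left as it is.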
Trait Implementations§
Auto Trait Implementations§
impl Freeze for Qwen2Model
impl !RefUnwindSafe for Qwen2Model
impl Send for Qwen2Model
impl Sync for Qwen2Model
impl Unpin for Qwen2Model
impl !UnwindSafe for Qwen2Model
Blanket Implementations§
impl<T> BorrowMut<T> for T where T: ?Sized
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.