pub struct CoreMLModel { /* private fields */ }
CoreML model wrapper that provides Candle tensor integration
Implementations§
impl CoreMLModel
pub fn load<P: AsRef<Path>>(path: P) -> Result<Self, CandleError>
Load a CoreML model from a .mlmodelc directory with default configuration
pub fn load_with_function<P: AsRef<Path>>(
    path: P,
    config: &Config,
    function_name: &str,
) -> Result<Self, CandleError>
Load a CoreML model with a specific function name
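For example, selecting a named function inside a multi-function model package might look like the following sketch; the `"generate"` function name and the pre-built `Config` are illustrative placeholders, not part of this crate's API:

```rust
use candle_coreml::{Config, CoreMLModel};

// Load one named function from a compiled multi-function package.
// "generate" is a hypothetical function name for illustration.
fn load_generation_function(
    config: &Config,
) -> Result<CoreMLModel, Box<dyn std::error::Error>> {
    let model = CoreMLModel::load_with_function("model.mlmodelc", config, "generate")?;
    Ok(model)
}
```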
pub fn load_from_file<P: AsRef<Path>>(
    path: P,
    config: &Config,
) -> Result<Self, CandleError>
Load a CoreML model from a .mlmodelc directory following standard Candle patterns
Note: Unlike other Candle models, CoreML models are pre-compiled and don’t use VarBuilder. This method provides a Candle-compatible interface while loading from CoreML files.
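A minimal loading sketch in the usual Candle style (the `.mlmodelc` path is a placeholder, and the `Config` is assumed to be constructed elsewhere):

```rust
use candle_coreml::{Config, CoreMLModel};

// CoreML packages are pre-compiled, so no VarBuilder is involved;
// the model is loaded directly from the .mlmodelc directory.
fn load_model(config: &Config) -> Result<CoreMLModel, Box<dyn std::error::Error>> {
    let model = CoreMLModel::load_from_file("model.mlmodelc", config)?;
    Ok(model)
}
```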
pub fn load_from_file_with_function<P: AsRef<Path>>(
    path: P,
    config: &Config,
    function_name: Option<&str>,
) -> Result<Self, CandleError>
Load a CoreML model with optional function name specification
pub fn forward_single(&self, input: &Tensor) -> Result<Tensor, CandleError>
Convenience method for single-input models (backward compatibility)

pub fn forward(&self, inputs: &[&Tensor]) -> Result<Tensor, CandleError>
Run forward pass through the model with multiple inputs
Accepts tensors from CPU or Metal devices; rejects CUDA tensors. Returns the output tensor on the same device as the input tensors.
§Arguments
inputs - Slice of tensors corresponding to the input_names in config order
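As a sketch, a two-input forward call might look like this; the input shapes, dtypes, and the assumption that the model declares exactly these two inputs are all illustrative:

```rust
use candle_core::{DType, Device, Tensor};
use candle_coreml::CoreMLModel;

// Run a multi-input forward pass. The inputs must be passed in the
// same order as config.input_names; shapes here are hypothetical.
fn run(model: &CoreMLModel) -> Result<Tensor, Box<dyn std::error::Error>> {
    let device = Device::Cpu;
    let input_ids = Tensor::ones((1, 64), DType::I64, &device)?;
    let attention_mask = Tensor::ones((1, 64), DType::F32, &device)?;
    let output = model.forward(&[&input_ids, &attention_mask])?;
    Ok(output)
}
```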
pub fn forward_all(
    &self,
    inputs: &[&Tensor],
) -> Result<HashMap<String, Tensor>, CandleError>
Forward pass returning all outputs as a HashMap
This is useful for models that have multiple outputs, such as the Qwen LM head which produces 16 different logits chunks that need to be concatenated.
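Building on the Qwen LM-head case above, a sketch of gathering the chunked outputs and concatenating them; the `logits_{i}` output-name scheme and the chunk count of 16 are assumptions for illustration:

```rust
use candle_core::{D, Tensor};
use candle_coreml::CoreMLModel;

// Fetch all outputs, then stitch the logits chunks back together.
fn concat_logits(
    model: &CoreMLModel,
    input: &Tensor,
) -> Result<Tensor, Box<dyn std::error::Error>> {
    let outputs = model.forward_all(&[input])?;
    // Collect chunks in a deterministic order; the "logits_{i}"
    // naming scheme is a hypothetical example.
    let chunks: Vec<&Tensor> = (0..16)
        .map(|i| outputs.get(&format!("logits_{i}")).ok_or("missing output chunk"))
        .collect::<Result<_, _>>()?;
    // Concatenate along the vocabulary (last) dimension.
    let logits = Tensor::cat(&chunks, D::Minus1)?;
    Ok(logits)
}
```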
Sourcepub fn make_state(&self) -> Result<CoreMLState, CandleError>
pub fn make_state(&self) -> Result<CoreMLState, CandleError>
Create a fresh state object for this model.
This enables efficient autoregressive generation by maintaining persistent KV-cache across multiple prediction calls.
§Returns
A new CoreMLState instance that can be used with predict_with_state(). For stateless models, this returns an empty state object that can still be used with stateful prediction methods (resulting in stateless behavior).
§Example
use candle_core::{Device, Tensor};
use candle_coreml::{CoreMLModel, Config};
let model = CoreMLModel::load("model.mlmodelc")?;
// Create state for efficient token generation
let mut state = model.make_state()?;
// Use state with predict_with_state() for streaming inference

pub fn predict_with_state(
    &self,
    inputs: &[&Tensor],
    state: &mut CoreMLState,
) -> Result<Tensor, CandleError>
Run forward pass through the model with persistent state.
This method enables efficient autoregressive generation by maintaining
KV-cache state across multiple prediction calls. Unlike the stateless
forward() method, this preserves computation state between calls.
§Arguments
inputs - Slice of tensors corresponding to input_names in config order
state - Mutable reference to the model state (will be updated)
§Returns
Output tensor on the same device as the input tensors.
§Device Compatibility
Accepts tensors from CPU or Metal devices, rejects CUDA tensors.
§Example
use candle_core::{Device, Tensor};
use candle_coreml::{CoreMLModel, Config};
let model = CoreMLModel::load("model.mlmodelc")?;
let device = Device::Cpu;
let mut state = model.make_state()?;
// Generate tokens with persistent KV-cache
for i in 0..10 {
    let input = Tensor::ones((1, 1), candle_core::DType::I64, &device)?;
    let output = model.predict_with_state(&[&input], &mut state)?;
    println!("Token {}: {:?}", i, output);
}

Trait Implementations§
Auto Trait Implementations§
impl Freeze for CoreMLModel
impl RefUnwindSafe for CoreMLModel
impl Send for CoreMLModel
impl Sync for CoreMLModel
impl Unpin for CoreMLModel
impl UnwindSafe for CoreMLModel
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.