candle-coreml 0.2.4

CoreML inference engine for Candle tensors - provides Apple CoreML/ANE integration with real tokenization, safety fixes, and model calibration awareness
model_info:
  name: anemll-Qwen-Qwen3-0.6B-ctx512
  version: 0.3.4
  description: |
    Demonstrates running Qwen-Qwen3-0.6B on Apple Neural Engine
    Context length: 512
    Batch size: 64
    Chunks: 1
  license: MIT
  author: Anemll
  framework: Core ML
  language: Python
  architecture: qwen3
  parameters:
    context_length: 512
    batch_size: 64
    lut_embeddings: none
    lut_ffn: 8
    lut_lmhead: 8
    num_chunks: 1
    model_prefix: qwen
    embeddings: qwen_embeddings.mlmodelc
    lm_head: qwen_lm_head_lut8.mlmodelc
    ffn: qwen_FFN_PF_lut8.mlmodelc
    split_lm_head: 16
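
The component file names in the `parameters` block appear to follow from `model_prefix` and the `lut_*` settings: a component with a LUT bit width gets a `_lut<bits>` suffix, and one set to `none` gets no suffix. The Python sketch below checks that consistency. The naming pattern is an assumption inferred from this one file, not a documented convention of candle-coreml or Anemll, and `lut_suffix`, `expected_filenames`, and `mismatches` are hypothetical helper names introduced for illustration.

```python
# Parameters copied from the `parameters` block above. YAML reads the bare
# word `none` as the string "none" (not null), so it is kept as a string here.
PARAMS = {
    "context_length": 512,
    "batch_size": 64,
    "lut_embeddings": "none",
    "lut_ffn": 8,
    "lut_lmhead": 8,
    "num_chunks": 1,
    "model_prefix": "qwen",
    "embeddings": "qwen_embeddings.mlmodelc",
    "lm_head": "qwen_lm_head_lut8.mlmodelc",
    "ffn": "qwen_FFN_PF_lut8.mlmodelc",
    "split_lm_head": 16,
}


def lut_suffix(bits):
    """Return "_lut<bits>" for a quantized component, "" for "none"."""
    return "" if bits in (None, "none") else f"_lut{bits}"


def expected_filenames(params):
    """Hypothetical helper: rebuild the component file names from the prefix
    and LUT settings, using the pattern inferred from this one file."""
    prefix = params["model_prefix"]
    return {
        "embeddings": f"{prefix}_embeddings{lut_suffix(params['lut_embeddings'])}.mlmodelc",
        "lm_head": f"{prefix}_lm_head{lut_suffix(params['lut_lmhead'])}.mlmodelc",
        "ffn": f"{prefix}_FFN_PF{lut_suffix(params['lut_ffn'])}.mlmodelc",
    }


def mismatches(params):
    """List the keys whose declared file name differs from the inferred one."""
    expected = expected_filenames(params)
    return [key for key, name in expected.items() if params[key] != name]
```

For the values in this file, `mismatches(PARAMS)` returns an empty list. A non-empty result would only mean the file departs from the inferred pattern, which may be perfectly valid for other models.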