car-inference 0.1.1

# car-inference

Local and remote model inference for the [Common Agent Runtime](https://github.com/Parslee-ai/car).

## What it does

Provides on-device inference using Candle (Metal on macOS, CUDA on Linux) with Qwen3 models
downloaded from HuggingFace on first use. Also supports remote APIs (OpenAI, Anthropic, Google)
via the same typed `ModelSchema` interface. The `AdaptiveRouter` selects the best model using
a filter-score-explore strategy, learning from outcomes over time via `OutcomeTracker`.
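The filter-score-explore idea can be sketched in a few lines. Everything below is illustrative, not the crate's actual API: `Candidate` and `select_model` are hypothetical names, and a real router would randomize its exploration and feed request outcomes back through `OutcomeTracker` to update the scores.

```rust
/// Hypothetical candidate model as the router might see it.
struct Candidate {
    name: &'static str,
    supports_task: bool, // filter: hard capability constraint
    score: f64,          // score: learned quality estimate
}

/// Filter out models that cannot handle the task, then either exploit
/// the highest-scoring survivor or explore another eligible model.
fn select_model<'a>(candidates: &'a [Candidate], explore: bool) -> Option<&'a str> {
    // Filter step: drop anything that fails the hard constraints.
    let eligible: Vec<&Candidate> =
        candidates.iter().filter(|c| c.supports_task).collect();
    if eligible.is_empty() {
        return None;
    }
    let pick = if explore {
        // Explore step: a real implementation would pick at random;
        // here we just take the first eligible model.
        &eligible[0]
    } else {
        // Score step: exploit the best current estimate.
        eligible
            .iter()
            .max_by(|a, b| a.score.partial_cmp(&b.score).unwrap())
            .unwrap()
    };
    Some(pick.name)
}
```

The exploration branch is what lets the router keep sampling lower-ranked models so their score estimates stay fresh.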

## Usage

```rust
use car_inference::{InferenceEngine, InferenceConfig, GenerateRequest, GenerateParams};

// Run inside an async context (e.g. #[tokio::main]) whose return type
// supports `?`, such as anyhow::Result<()>.
let engine = InferenceEngine::new(InferenceConfig::default());
let result = engine.generate(GenerateRequest {
    prompt: "Explain quicksort".into(),
    params: GenerateParams::default(),
    ..Default::default()
}).await?;
```

## Crate features

- `metal` -- Apple Silicon GPU acceleration
- `cuda` -- NVIDIA GPU acceleration
- `ast` -- AST-aware code generation via `car-ast`
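Features are enabled through Cargo in the usual way; for example, to build with Metal acceleration on Apple Silicon:

```toml
[dependencies]
car-inference = { version = "0.1.1", features = ["metal"] }
```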

Part of [CAR](https://github.com/Parslee-ai/car) -- see the main repo for full documentation.