candle-mi 0.1.12

Mechanistic interpretability for language models in Rust, built on candle
// SPDX-License-Identifier: MIT OR Apache-2.0

//! Steering-vector **construction** methods.
//!
//! This module, `crate::steering`, covers how a steering vector is built from
//! model activations in the first place. Its sibling,
//! [`crate::interp::steering`], handles steering-vector **calibration**
//! (dose-response curves, absorption-boundary detection) for a vector that
//! has already been built.
//!
//! ## Submodules
//!
//! - [`contrastive`] — Maar et al. (2026) "What's the plan?"
//!   mean-of-differences contrastive activation steering:
//!   `d = mean(positive_residuals) − mean(negative_residuals)`,
//!   optionally L2-normalised, applied additively via
//!   [`Intervention::Add`](crate::Intervention::Add).
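//!
//! ## Example
//!
//! A minimal, dependency-free sketch of the formula above, assuming residuals
//! are plain `Vec<f32>` rows rather than candle tensors. The names
//! `mean_vector`, `contrastive_direction`, and `add_steering` are hypothetical
//! and are not part of this crate's API; they only illustrate the arithmetic.
//!
//! ```rust
//! /// Element-wise mean of equal-length rows.
//! /// Returns `None` if `rows` is empty or the rows differ in length.
//! fn mean_vector(rows: &[Vec<f32>]) -> Option<Vec<f32>> {
//!     let dim = rows.first()?.len();
//!     if rows.iter().any(|r| r.len() != dim) {
//!         return None;
//!     }
//!     let mut acc = vec![0.0_f32; dim];
//!     for row in rows {
//!         for (a, x) in acc.iter_mut().zip(row) {
//!             *a += *x;
//!         }
//!     }
//!     let n = rows.len() as f32;
//!     Some(acc.into_iter().map(|a| a / n).collect())
//! }
//!
//! /// `d = mean(positive) - mean(negative)`, optionally L2-normalised.
//! /// A zero vector is left unchanged when normalisation is requested.
//! fn contrastive_direction(
//!     positive: &[Vec<f32>],
//!     negative: &[Vec<f32>],
//!     normalise: bool,
//! ) -> Option<Vec<f32>> {
//!     let pos = mean_vector(positive)?;
//!     let neg = mean_vector(negative)?;
//!     if pos.len() != neg.len() {
//!         return None;
//!     }
//!     let mut d: Vec<f32> = pos.iter().zip(&neg).map(|(p, n)| p - n).collect();
//!     if normalise {
//!         let norm = d.iter().map(|x| x * x).sum::<f32>().sqrt();
//!         if norm > 0.0 {
//!             for x in &mut d {
//!                 *x /= norm;
//!             }
//!         }
//!     }
//!     Some(d)
//! }
//!
//! /// Additive application: `residual += scale * d`, element by element.
//! fn add_steering(residual: &mut [f32], d: &[f32], scale: f32) {
//!     for (r, x) in residual.iter_mut().zip(d) {
//!         *r += scale * *x;
//!     }
//! }
//! ```
//!
//! With positive residuals `[1, 2]` and `[3, 4]` (mean `[2, 3]`) and negative
//! residuals `[-1, 3]` twice (mean `[-1, 3]`), the sketch yields
//! `d = [3, 0]`, or `[1, 0]` after L2 normalisation.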

pub mod contrastive;