pub struct DpoLearnModel<F> { /* private fields */ }
Generic DPO LearnModel.
Compares Episodes grouped by group_id and generates data for DPO training.
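§Design rationale
DPO training compares the results of running the same conditions multiple times.
- group_id: identifies a group of runs executed under the same conditions (for example, five runs from Eval -n 5)
- A successful Episode and a failed Episode are paired and compared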
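The grouping and pairing idea above can be sketched as follows. This is a minimal illustration, not the crate's implementation: the `Episode` struct here is a hypothetical stand-in with only the fields the sketch needs, and `pair_by_group` is an invented helper name. It assumes every success in a group is paired with every failure in the same group, which may differ from what `build_pairs` actually does.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the crate's `Episode` (assumed fields only).
#[derive(Debug, Clone, PartialEq)]
pub struct Episode {
    pub id: u32,
    pub group_id: Option<u64>,
    pub success: bool,
}

/// Groups episodes by `group_id`, then pairs each successful episode with
/// each failed episode of the same group. Returns (chosen, rejected) pairs.
pub fn pair_by_group(episodes: &[Episode]) -> Vec<(&Episode, &Episode)> {
    let mut groups: BTreeMap<u64, Vec<&Episode>> = BTreeMap::new();
    for ep in episodes {
        // Episodes without a group_id have nothing to be compared against.
        if let Some(gid) = ep.group_id {
            groups.entry(gid).or_default().push(ep);
        }
    }

    let mut pairs = Vec::new();
    for members in groups.values() {
        // Split the group into successes and failures.
        let (wins, losses): (Vec<&Episode>, Vec<&Episode>) =
            members.iter().copied().partition(|ep| ep.success);
        for &chosen in &wins {
            for &rejected in &losses {
                pairs.push((chosen, rejected));
            }
        }
    }
    pairs
}
```

A group that contains only successes or only failures yields no pairs under this scheme, which is one reason to run each condition several times.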
§Usage
// Collect Episodes tagged with a group_id via Eval
let episodes: Vec<Episode> = ...;
// Generate DPO pairs
let dpo_learn = DpoLearnModel::new();
let pairs = dpo_learn.build_pairs(&episodes);
// Convert to TrainingData
let training_data: Vec<TrainingData> = pairs
    .iter()
    .filter_map(|pair| dpo_learn.convert_pair(pair).ok())
    .collect();
Implementations§
impl<F> DpoLearnModel<F>
pub fn with_system_prompt(self, prompt: impl Into<String>) -> Self
Sets the system prompt.
pub fn with_config(self, config: DpoConfig) -> Self
Applies the given configuration.
pub fn with_min_quality_gap(self, gap: f64) -> Self
Sets the minimum quality gap.
pub fn with_max_pairs(self, max: usize) -> Self
Sets the maximum number of pairs.
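The setters above take `self` and return `Self`, so they can be chained. The sketch below mimics that consuming-builder shape on a hypothetical `PairSettings` type; the field names, the `new` constructor, and the `select` helper are assumptions for illustration. In particular, the filtering semantics (keep gaps at or above the minimum, then truncate to the maximum) are a guess at how the two limits might interact, not documented behavior.

```rust
/// Hypothetical stand-in for the builder part of `DpoLearnModel`.
#[derive(Debug, Clone, PartialEq)]
pub struct PairSettings {
    pub system_prompt: Option<String>,
    pub min_quality_gap: f64,
    pub max_pairs: Option<usize>,
}

impl PairSettings {
    /// Assumed defaults: no prompt, no minimum gap, no pair limit.
    pub fn new() -> Self {
        PairSettings {
            system_prompt: None,
            min_quality_gap: 0.0,
            max_pairs: None,
        }
    }

    pub fn with_system_prompt(mut self, prompt: impl Into<String>) -> Self {
        self.system_prompt = Some(prompt.into());
        self
    }

    pub fn with_min_quality_gap(mut self, gap: f64) -> Self {
        self.min_quality_gap = gap;
        self
    }

    pub fn with_max_pairs(mut self, max: usize) -> Self {
        self.max_pairs = Some(max);
        self
    }

    /// Assumed semantics: keep candidate gaps that reach `min_quality_gap`,
    /// then cap the result at `max_pairs` entries, preserving input order.
    pub fn select(&self, gaps: &[f64]) -> Vec<f64> {
        let mut kept: Vec<f64> = gaps
            .iter()
            .copied()
            .filter(|gap| *gap >= self.min_quality_gap)
            .collect();
        if let Some(max) = self.max_pairs {
            kept.truncate(max);
        }
        kept
    }
}
```

With this shape, a call such as `PairSettings::new().with_min_quality_gap(0.2).with_max_pairs(2)` reads as a single configuration expression, which is the usual motivation for consuming builders.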
pub fn build_pairs(&self, episodes: &[Episode]) -> Vec<DpoPair>
Generates DPO pairs from Episodes grouped by group_id.
pub fn convert_pair(&self, pair: &DpoPair) -> Result<TrainingData, LearnError>
Converts a DPO pair into TrainingData.
pub fn convert_pairs(&self, pairs: &[DpoPair]) -> Vec<TrainingData>
Converts multiple pairs in one call.
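Since convert_pair returns a Result while convert_pairs returns a plain Vec, the batch form presumably drops pairs that fail to convert. A minimal sketch of that relationship, assuming the types below as hypothetical stand-ins (their fields and the `EmptyResponse` error variant are invented for illustration):

```rust
/// Hypothetical stand-ins for the crate's types; fields are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct DpoPair {
    pub prompt: String,
    pub chosen: String,
    pub rejected: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct TrainingData {
    pub prompt: String,
    pub chosen: String,
    pub rejected: String,
}

#[derive(Debug, PartialEq)]
pub enum LearnError {
    EmptyResponse,
}

/// Converts one pair, rejecting pairs with an empty response (assumed rule).
pub fn convert_pair(pair: &DpoPair) -> Result<TrainingData, LearnError> {
    if pair.chosen.is_empty() || pair.rejected.is_empty() {
        return Err(LearnError::EmptyResponse);
    }
    Ok(TrainingData {
        prompt: pair.prompt.clone(),
        chosen: pair.chosen.clone(),
        rejected: pair.rejected.clone(),
    })
}

/// Converts many pairs, silently skipping the ones that fail.
pub fn convert_pairs(pairs: &[DpoPair]) -> Vec<TrainingData> {
    pairs
        .iter()
        .filter_map(|pair| convert_pair(pair).ok())
        .collect()
}
```

If the caller needs to know which pairs were rejected and why, calling the single-pair form and inspecting each Result is the safer choice.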
Trait Implementations§
impl<F> LearnModel for DpoLearnModel<F>
Implementation of the LearnModel trait (for building Episodes from Records).
DPO normally compares existing Episodes, so build_episodes returns an empty Vec. To generate DPO pairs, use the build_pairs method instead.
fn build_episodes(&self, _records: &[Record]) -> Vec<Episode>
Builds Episodes from a stream of Records.
fn evaluate(&self, _context: &EpisodeContext) -> Outcome
Determines Success or Failure from Records.
fn convert(&self, _episode: &Episode) -> Result<TrainingData, LearnError>
Converts an Episode into TrainingData.
fn convert_batch(&self, episodes: &[Episode]) -> Vec<TrainingData>
Converts multiple Episodes in one call (default implementation).
fn build_episodes_from_actions(&self, actions: &[ActionEvent]) -> Vec<Episode>
Convenience method: builds Episodes directly from a slice of ActionEvent.
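The trait shape described above, with a default batch method and an implementor whose build_episodes returns nothing, can be sketched like this. All types are hypothetical stand-ins, the trait is a reduced version with only three methods, and the default body of `convert_batch` (convert each Episode and drop failures) is an assumption rather than the crate's confirmed behavior.

```rust
/// Hypothetical stand-ins for the crate's types.
pub struct Record;

#[derive(Debug, Clone, PartialEq)]
pub struct Episode {
    pub id: u32,
    pub usable: bool,
}

#[derive(Debug, Clone, PartialEq)]
pub struct TrainingData {
    pub episode_id: u32,
}

#[derive(Debug, PartialEq)]
pub enum LearnError {
    Unusable(u32),
}

/// Reduced sketch of a LearnModel-like trait.
pub trait LearnModel {
    fn build_episodes(&self, records: &[Record]) -> Vec<Episode>;
    fn convert(&self, episode: &Episode) -> Result<TrainingData, LearnError>;

    /// Default implementation (assumed): convert each Episode, skip failures.
    fn convert_batch(&self, episodes: &[Episode]) -> Vec<TrainingData> {
        episodes
            .iter()
            .filter_map(|ep| self.convert(ep).ok())
            .collect()
    }
}

/// A model that only works on existing Episodes, like the DPO case above.
pub struct PairOnlyModel;

impl LearnModel for PairOnlyModel {
    // Episodes come from elsewhere, so nothing is built from Records.
    fn build_episodes(&self, _records: &[Record]) -> Vec<Episode> {
        Vec::new()
    }

    fn convert(&self, episode: &Episode) -> Result<TrainingData, LearnError> {
        if episode.usable {
            Ok(TrainingData { episode_id: episode.id })
        } else {
            Err(LearnError::Unusable(episode.id))
        }
    }
}
```

Because `convert_batch` has a default body, an implementor only has to supply `convert`; overriding the batch method is optional.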
Auto Trait Implementations§
impl<F> Freeze for DpoLearnModel<F>
where
    F: Freeze,
impl<F> RefUnwindSafe for DpoLearnModel<F>
where
    F: RefUnwindSafe,
impl<F> Send for DpoLearnModel<F>
impl<F> Sync for DpoLearnModel<F>
impl<F> Unpin for DpoLearnModel<F>
where
    F: Unpin,
impl<F> UnwindSafe for DpoLearnModel<F>
where
    F: UnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.