pub struct TargetEncoder<F> { /* private fields */ }
An unfitted target encoder.
Takes a matrix of categorical integer features and a continuous (or binary) target vector at fit time. Each category is encoded as the smoothed mean of the target for that category.
§Parameters
smooth: the smoothing strategy (Smooth). The default is Smooth::Auto (empirical Bayes), matching scikit-learn’s constructor default smooth="auto" (_target_encoder.py:199). Smooth::Fixed selects the fixed m-estimate; higher values regularise more toward the global mean, and Fixed(0) applies no smoothing.
cv: the number of cross-fitting folds used by fit_transform (default 5, matching scikit-learn’s cv=5, _target_encoder.py:200).
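The fixed m-estimate can be illustrated with a small standalone sketch. It assumes the usual m-estimate form, (category sum + m * global mean) / (category count + m), which is how scikit-learn documents a numeric smooth; the helper name m_estimate_encodings is hypothetical and not part of ferrolearn's API.

```rust
use std::collections::BTreeMap;

/// Fixed m-estimate encodings for ONE categorical column.
/// Hypothetical helper for illustration; not ferrolearn's implementation.
/// Assumed formula: (sum_c + m * global_mean) / (count_c + m).
fn m_estimate_encodings(cats: &[usize], y: &[f64], m: f64) -> BTreeMap<usize, f64> {
    assert_eq!(cats.len(), y.len(), "one target value per row");
    assert!(!y.is_empty(), "need at least one row");
    assert!(m >= 0.0, "smoothing factor must be non-negative");

    let global_mean = y.iter().sum::<f64>() / y.len() as f64;

    // Accumulate (target sum, row count) per category.
    let mut stats: BTreeMap<usize, (f64, f64)> = BTreeMap::new();
    for (&c, &t) in cats.iter().zip(y) {
        let entry = stats.entry(c).or_insert((0.0, 0.0));
        entry.0 += t;
        entry.1 += 1.0;
    }

    // Every category present has count >= 1, so the denominator is never zero.
    stats
        .into_iter()
        .map(|(c, (sum, count))| (c, (sum + m * global_mean) / (count + m)))
        .collect()
}
```

With m = 0 each category maps to its plain target mean; as m grows, every encoding is pulled toward the global mean.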
§Examples
use ferrolearn_preprocess::target_encoder::TargetEncoder;
use ferrolearn_core::traits::{Fit, Transform};
use ndarray::array;
let enc = TargetEncoder::<f64>::new(1.0);
let x = array![[0usize, 1], [0, 0], [1, 1], [1, 0]];
let y = array![1.0, 2.0, 3.0, 4.0];
let fitted = enc.fit(&x, &y).unwrap();
let out = fitted.transform(&x).unwrap();
assert_eq!(out.shape(), &[4, 2]);
Implementations§
impl<F: Float + Send + Sync + 'static> TargetEncoder<F>
pub fn new(smooth: F) -> Self
Create a new TargetEncoder with a FIXED smoothing factor.
This is shorthand for with_smooth with Smooth::Fixed(smooth) and cv = 5.
pub fn with_smooth(smooth: Smooth<F>) -> Self
Create a new TargetEncoder with the given smoothing strategy and
cv = 5 (matching scikit-learn’s default).
pub fn with_cv(self, cv: usize) -> Self
Set the number of cross-fitting folds used by
fit_transform.
impl<F: Float + Send + Sync + 'static> TargetEncoder<F>
pub fn fit_transform(
    &self,
    x: &Array2<usize>,
    y: &Array1<F>,
) -> Result<Array2<F>, FerroError>
Cross-fitting fit_transform: encode each row using encodings learned on
the OTHER folds, preventing target leakage.
Mirrors scikit-learn’s TargetEncoder.fit_transform
(sklearn/preprocessing/_target_encoder.py:232-303). For the
continuous/binary single-output case it uses a deterministic KFold with
cv folds and no shuffling. ferrolearn exposes no shuffle or random_state
option, so this corresponds to sklearn’s reproducible shuffle=False path
(:262). For each (train, test) fold, it fits the per-feature encodings on
the train rows, using that fold’s y_train_mean, and encodes the test rows
with those train-fold encodings (:277-302). A category that does not
appear in the train fold encodes to y_train_mean (the
count == 0 -> y_mean rule, mirroring the unknown-category fallback in
_transform_X_ordinal, :494-497).
Note that fit(x, y) followed by transform(x) does not, in general, equal
fit_transform(x, y) (:235-238): transform uses the full-data encodings_,
whereas fit_transform is cross-fit.
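The cross-fitting procedure described above can be sketched for a single feature column using only the standard library. This is a minimal illustration, not ferrolearn's code: kfold_ranges and cross_fit_encode are hypothetical names, the fold sizing (the first n % k folds get one extra row) follows scikit-learn's unshuffled KFold, and the per-fold encoding assumes the fixed m-estimate rather than Smooth::Auto.

```rust
use std::ops::Range;

/// Contiguous, unshuffled K-fold test ranges over `n` rows.
/// The first `n % k` folds receive one extra row (assumed, as in sklearn's KFold).
fn kfold_ranges(n: usize, k: usize) -> Vec<Range<usize>> {
    let (base, extra) = (n / k, n % k);
    let mut start = 0;
    (0..k)
        .map(|i| {
            let len = base + usize::from(i < extra);
            let range = start..start + len;
            start += len;
            range
        })
        .collect()
}

/// Cross-fit encoding of one categorical column with a fixed m-estimate.
/// Hypothetical helper; assumes 2 <= cv <= number of rows.
fn cross_fit_encode(cats: &[usize], y: &[f64], m: f64, cv: usize) -> Vec<f64> {
    let n = y.len();
    let mut out = vec![0.0; n];
    for test in kfold_ranges(n, cv) {
        // Train rows are all rows outside the current test range.
        let train: Vec<usize> = (0..n).filter(|i| !test.contains(i)).collect();
        let y_train_mean = train.iter().map(|&i| y[i]).sum::<f64>() / train.len() as f64;
        for i in test {
            // Statistics for row i's category, taken from the train rows only.
            let (mut sum, mut count) = (0.0, 0.0);
            for &j in &train {
                if cats[j] == cats[i] {
                    sum += y[j];
                    count += 1.0;
                }
            }
            out[i] = if count == 0.0 {
                y_train_mean // category unseen in this train fold
            } else {
                (sum + m * y_train_mean) / (count + m)
            };
        }
    }
    out
}
```

For cats = [0, 0, 1, 1], y = [1, 2, 3, 4], m = 0 and cv = 2, each test fold contains a category absent from its train fold, so the rows encode to the train-fold means [3.5, 3.5, 1.5, 1.5], whereas encodings fitted on the full data would give [1.5, 1.5, 3.5, 3.5].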
§Errors
FerroError::InsufficientSamples if the input has zero rows.
FerroError::ShapeMismatch if x rows and y length differ.
FerroError::InvalidParameter if a Smooth::Fixed factor is negative, or if cv < 2 or cv exceeds the sample count (sklearn requires cv >= 2, _target_encoder.py:190, and KFold rejects more splits than samples, _split.py:408-414).
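The error conditions listed above amount to a short sequence of precondition checks, sketched below. SketchError is a hypothetical stand-in for FerroError (whose real variants may carry fields), validate is not a ferrolearn function, and the order of the checks is an assumption.

```rust
/// Hypothetical stand-in for the three FerroError variants named above.
#[derive(Debug, PartialEq)]
enum SketchError {
    InsufficientSamples,
    ShapeMismatch,
    InvalidParameter,
}

/// Precondition checks for a cross-fitting fit_transform.
/// `fixed_smooth` is Some(m) for a fixed factor and None for automatic smoothing.
fn validate(
    n_rows: usize,
    y_len: usize,
    fixed_smooth: Option<f64>,
    cv: usize,
) -> Result<(), SketchError> {
    if n_rows == 0 {
        return Err(SketchError::InsufficientSamples);
    }
    if n_rows != y_len {
        return Err(SketchError::ShapeMismatch);
    }
    // A fixed smoothing factor must be non-negative.
    if matches!(fixed_smooth, Some(m) if m < 0.0) {
        return Err(SketchError::InvalidParameter);
    }
    // At least two folds, and no more folds than rows.
    if cv < 2 || cv > n_rows {
        return Err(SketchError::InvalidParameter);
    }
    Ok(())
}
```

Under these rules a 4-row input with the default cv = 5 is rejected, so small inputs need a smaller fold count set through with_cv.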
Trait Implementations§
impl<F: Clone> Clone for TargetEncoder<F>
fn clone(&self) -> TargetEncoder<F>
Returns a duplicate of the value.
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.
impl<F: Debug> Debug for TargetEncoder<F>
impl<F: Float + Send + Sync + 'static> Default for TargetEncoder<F>
fn default() -> Self
The default uses Smooth::Auto (empirical Bayes) and cv = 5,
matching scikit-learn’s TargetEncoder() (smooth="auto", cv=5,
_target_encoder.py:199-200).
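The empirical Bayes default can be sketched as follows. This assumes that Smooth::Auto follows scikit-learn's documented formula, where each category's shrinkage weight is lambda = n_c * var(y) / (n_c * var(y) + var_c) with population variances; the text above only states that the default is empirical Bayes, so the details and the helper name empirical_bayes_encodings are assumptions.

```rust
use std::collections::BTreeMap;

/// Empirical Bayes encodings for ONE categorical column.
/// Hypothetical helper; assumed formula:
///   lambda_c = n_c * var(y) / (n_c * var(y) + var_c)
///   enc_c    = lambda_c * mean_c + (1 - lambda_c) * mean(y)
fn empirical_bayes_encodings(cats: &[usize], y: &[f64]) -> BTreeMap<usize, f64> {
    assert_eq!(cats.len(), y.len(), "one target value per row");
    assert!(!y.is_empty(), "need at least one row");

    let n = y.len() as f64;
    let y_mean = y.iter().sum::<f64>() / n;
    let y_var = y.iter().map(|t| (t - y_mean).powi(2)).sum::<f64>() / n;

    // Group target values by category.
    let mut groups: BTreeMap<usize, Vec<f64>> = BTreeMap::new();
    for (&c, &t) in cats.iter().zip(y) {
        groups.entry(c).or_default().push(t);
    }

    groups
        .into_iter()
        .map(|(c, ts)| {
            let n_c = ts.len() as f64;
            let mean_c = ts.iter().sum::<f64>() / n_c;
            let var_c = ts.iter().map(|t| (t - mean_c).powi(2)).sum::<f64>() / n_c;
            let enc = if y_var == 0.0 {
                y_mean // constant target: nothing to shrink
            } else {
                let lambda = y_var * n_c / (y_var * n_c + var_c);
                lambda * mean_c + (1.0 - lambda) * y_mean
            };
            (c, enc)
        })
        .collect()
}
```

Categories with many rows or low within-category variance get a lambda close to 1 and stay near their own mean; small or noisy categories are shrunk toward the global mean.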
impl<F: Float + Send + Sync + 'static> Fit<ArrayBase<OwnedRepr<usize>, Dim<[usize; 2]>>, ArrayBase<OwnedRepr<F>, Dim<[usize; 1]>>> for TargetEncoder<F>
fn fit(
    &self,
    x: &Array2<usize>,
    y: &Array1<F>,
) -> Result<FittedTargetEncoder<F>, FerroError>
Fit the encoder by computing smoothed target means per category.
§Errors
FerroError::InsufficientSamples if the input has zero rows.
FerroError::ShapeMismatch if x rows and y length differ.
FerroError::InvalidParameter if smooth is negative.
type Fitted = FittedTargetEncoder<F>
The fitted type returned by fit.
type Error = FerroError
The error type returned by fit.
Auto Trait Implementations§
impl<F> Freeze for TargetEncoder<F> where
F: Freeze,
impl<F> RefUnwindSafe for TargetEncoder<F> where
F: RefUnwindSafe,
impl<F> Send for TargetEncoder<F> where
F: Send,
impl<F> Sync for TargetEncoder<F> where
F: Sync,
impl<F> Unpin for TargetEncoder<F> where
F: Unpin,
impl<F> UnsafeUnpin for TargetEncoder<F> where
F: UnsafeUnpin,
impl<F> UnwindSafe for TargetEncoder<F> where
F: UnwindSafe,
Blanket Implementations§
impl<T> BorrowMut<T> for T where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T where
T: Clone,
impl<T> DistributionExt for T where
T: ?Sized,
impl<T, U> Imply<T> for U
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP where
SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its
superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.
impl<SS, SP> SupersetOf<SS> for SP where
SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its
superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
unsafe fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.