pub struct TfidfFeatureExtractor { /* private fields */ }
TF-IDF feature extractor for commit messages
This extractor converts commit messages into TF-IDF feature vectors for ML classification. Implements Phase 2 of nlp-models-techniques-spec.md (Tier 2: TF-IDF + ML).
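As a rough illustration of what a TF-IDF feature vector is, the following standalone sketch computes one from scratch using only the standard library. It is not this crate's implementation: the `tokenize` and `tfidf` helpers are hypothetical, and the tokenization rule, the length-normalized term frequency, and the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` are assumptions chosen for the example. The weighting that `TfidfFeatureExtractor` actually applies may differ.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Lowercase a message and split it into alphanumeric tokens (assumed rule).
fn tokenize(message: &str) -> Vec<String> {
    message
        .to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(str::to_string)
        .collect()
}

/// Compute a dense TF-IDF matrix: one row per message, one column per term.
/// Returns the alphabetically sorted vocabulary alongside the matrix.
fn tfidf(messages: &[&str]) -> (Vec<String>, Vec<Vec<f64>>) {
    let docs: Vec<Vec<String>> = messages.iter().map(|m| tokenize(m)).collect();

    // Document frequency: the number of messages that contain each term.
    let mut df: BTreeMap<String, usize> = BTreeMap::new();
    for doc in &docs {
        let unique: BTreeSet<&String> = doc.iter().collect();
        for term in unique {
            *df.entry(term.clone()).or_insert(0) += 1;
        }
    }

    let vocab: Vec<String> = df.keys().cloned().collect();
    let n = docs.len() as f64;

    let matrix = docs
        .iter()
        .map(|doc| {
            vocab
                .iter()
                .map(|term| {
                    let count = doc.iter().filter(|t| *t == term).count() as f64;
                    // Term frequency, normalized by message length.
                    let tf = if doc.is_empty() { 0.0 } else { count / doc.len() as f64 };
                    // Smoothed inverse document frequency (assumed variant).
                    let idf = ((1.0 + n) / (1.0 + df[term] as f64)).ln() + 1.0;
                    tf * idf
                })
                .collect()
        })
        .collect();

    (vocab, matrix)
}

fn main() {
    let messages = [
        "fix: null pointer dereference",
        "fix: race condition in mutex",
        "feat: add new feature",
    ];
    let (vocab, matrix) = tfidf(&messages);
    println!("{} documents x {} terms", matrix.len(), vocab.len());
}
```

In this sketch a term that appears in fewer messages (such as `null`) receives a higher weight than one shared across messages (such as `fix`), which is the property that makes TF-IDF useful as classifier input.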
§Examples
use organizational_intelligence_plugin::nlp::TfidfFeatureExtractor;
let messages: Vec<String> = vec![
"fix: null pointer dereference".to_string(),
"fix: race condition in mutex".to_string(),
"feat: add new feature".to_string(),
];
let mut extractor = TfidfFeatureExtractor::new(1500);
let features = extractor.fit_transform(&messages).unwrap();
assert_eq!(features.n_rows(), 3); // 3 documents
Implementations§
impl TfidfFeatureExtractor
pub fn fit_transform(&mut self, messages: &[String]) -> Result<Matrix<f64>>
Fit the vectorizer on training messages and transform them into TF-IDF features in a single call
§Arguments
messages - Training commit messages
§Returns
Ok(Matrix<f64>) - TF-IDF feature matrix (n_messages × vocabulary_size)
Err - If vectorization fails
§Examples
use organizational_intelligence_plugin::nlp::TfidfFeatureExtractor;
let messages: Vec<String> = vec![
"fix: memory leak".to_string(),
"fix: race condition".to_string(),
];
let mut extractor = TfidfFeatureExtractor::new(1000);
let features = extractor.fit_transform(&messages).unwrap();
assert_eq!(features.n_rows(), 2);
pub fn fit(&mut self, messages: &[String]) -> Result<()>
Fit the vectorizer on training messages
§Arguments
messages - Training commit messages
§Examples
use organizational_intelligence_plugin::nlp::TfidfFeatureExtractor;
let messages = vec![
"fix: memory leak".to_string(),
"fix: race condition".to_string(),
];
let mut extractor = TfidfFeatureExtractor::new(1000);
extractor.fit(&messages).unwrap();
pub fn transform(&self, messages: &[String]) -> Result<Matrix<f64>>
Transform messages into TF-IDF features using the vocabulary learned during fitting
§Arguments
messages - Commit messages to transform
§Returns
Ok(Matrix<f64>) - TF-IDF feature matrix
Err - If transformation fails
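To illustrate why fitting and transforming are separate steps, here is a minimal standalone sketch of a vectorizer whose columns are fixed at fit time. `SketchVectorizer` and `tokenize` are hypothetical names, not part of this crate, and the sketch uses raw term counts rather than TF-IDF weights to stay short. It assumes that terms unseen during fitting are simply dropped at transform time, which is the usual behavior for this kind of vectorizer but is not confirmed for `TfidfFeatureExtractor`.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a fitted vectorizer; not the crate's type.
struct SketchVectorizer {
    vocabulary: Vec<String>,
}

/// Lowercase a message and split it into alphanumeric tokens (assumed rule).
fn tokenize(message: &str) -> Vec<String> {
    message
        .to_lowercase()
        .split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(str::to_string)
        .collect()
}

impl SketchVectorizer {
    /// Learn an alphabetically sorted vocabulary from the training messages.
    fn fit(messages: &[&str]) -> Self {
        let terms: BTreeSet<String> = messages.iter().flat_map(|m| tokenize(m)).collect();
        SketchVectorizer {
            vocabulary: terms.into_iter().collect(),
        }
    }

    /// Count occurrences of each known term; terms outside the vocabulary are dropped.
    fn transform(&self, messages: &[&str]) -> Vec<Vec<f64>> {
        messages
            .iter()
            .map(|m| {
                let tokens = tokenize(m);
                self.vocabulary
                    .iter()
                    .map(|term| tokens.iter().filter(|t| *t == term).count() as f64)
                    .collect()
            })
            .collect()
    }
}

fn main() {
    let vectorizer = SketchVectorizer::fit(&["fix: memory leak", "fix: race condition"]);
    // "null" and "pointer" were never seen during fitting, so only "fix" is counted.
    let features = vectorizer.transform(&["fix: null pointer"]);
    println!("{:?}", features);
}
```

Because the column layout is fixed by the training data, feature matrices for training and test messages line up column for column, which a downstream classifier requires.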
§Examples
use organizational_intelligence_plugin::nlp::TfidfFeatureExtractor;
let train_messages = vec![
"fix: memory leak".to_string(),
"fix: race condition".to_string(),
];
let test_messages = vec!["fix: null pointer".to_string()];
let mut extractor = TfidfFeatureExtractor::new(1000);
extractor.fit(&train_messages).unwrap();
let features = extractor.transform(&test_messages).unwrap();
assert_eq!(features.n_rows(), 1);
pub fn vocabulary_size(&self) -> usize
Get the vocabulary size (number of features)
§Returns
usize - Number of features in vocabulary
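The vocabulary size is bounded by the limit passed to the constructor (1000 in the example for this method). One common way to enforce such a cap, sketched below with only the standard library, is to keep the most frequent terms in the corpus. This is an assumption for illustration: `capped_vocabulary` is a hypothetical helper, and the selection rule the crate actually uses is not documented here.

```rust
use std::collections::HashMap;

/// Build a vocabulary of at most `max_features` terms, keeping the terms
/// with the highest corpus-wide counts (assumed selection rule).
fn capped_vocabulary(messages: &[&str], max_features: usize) -> Vec<String> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for message in messages {
        let lower = message.to_lowercase();
        for token in lower
            .split(|c: char| !c.is_alphanumeric())
            .filter(|t| !t.is_empty())
        {
            *counts.entry(token.to_string()).or_insert(0) += 1;
        }
    }

    let mut ranked: Vec<(String, usize)> = counts.into_iter().collect();
    // Most frequent first; ties are broken alphabetically so the result is deterministic.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked.truncate(max_features);

    // Return the surviving terms in alphabetical order.
    let mut vocabulary: Vec<String> = ranked.into_iter().map(|(term, _)| term).collect();
    vocabulary.sort();
    vocabulary
}

fn main() {
    let messages = ["fix: bug", "fix: leak", "feat: bug"];
    println!("{:?}", capped_vocabulary(&messages, 2));
    println!("{:?}", capped_vocabulary(&messages, 1000));
}
```

When the corpus contains fewer distinct terms than the cap, the vocabulary is simply the full set of terms, which is why the size can be smaller than the configured maximum.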
§Examples
use organizational_intelligence_plugin::nlp::TfidfFeatureExtractor;
let messages = vec![
"fix: bug".to_string(),
"feat: feature".to_string(),
];
let mut extractor = TfidfFeatureExtractor::new(1000);
extractor.fit(&messages).unwrap();
assert!(extractor.vocabulary_size() > 0);
assert!(extractor.vocabulary_size() <= 1000);
pub fn max_features(&self) -> usize
Auto Trait Implementations§
impl Freeze for TfidfFeatureExtractor
impl !RefUnwindSafe for TfidfFeatureExtractor
impl !Send for TfidfFeatureExtractor
impl !Sync for TfidfFeatureExtractor
impl Unpin for TfidfFeatureExtractor
impl !UnwindSafe for TfidfFeatureExtractor
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<T> PolicyExt for T
where
    T: ?Sized,
impl<SS, SP> SupersetOf<SS> for SP
where
    SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.