pub struct HdpTopicModel {
pub phi: Vec<Vec<f64>>,
pub theta: Vec<Vec<f64>>,
/* private fields */
}
Task-API Hierarchical Dirichlet Process topic model.
Provides the interface HdpTopicModel::fit(corpus, vocab_size, config),
.transform(doc), .topics(), and .num_topics_inferred().
Internally delegates to Hdp for the Gibbs sampling loop, then
post-processes to expose phi (topic × word) and theta (document × topic)
arrays.
§Example
use scirs2_text::topic::hdp::{HdpTopicConfig, HdpTopicModel};
let corpus = vec![
vec![0usize, 1, 2],
vec![3usize, 4, 5],
];
let cfg = HdpTopicConfig { n_iter: 10, t_max: 5, burn_in: 2, seed: 0, ..Default::default() };
let model = HdpTopicModel::fit(&corpus, 6, cfg).expect("fit must succeed");
assert!(model.num_topics_inferred() >= 1);
Fields§
phi: Vec<Vec<f64>>
φ[k][w] = word probability in topic k. Shape: [active_k × vocab_size].
theta: Vec<Vec<f64>>
θ[d][k] = topic proportion for document d. Shape: [n_docs × t_max].
Implementations§
impl HdpTopicModel
pub fn fit(
corpus: &[Vec<usize>],
vocab_size: usize,
config: HdpTopicConfig,
) -> Result<Self, TopicError>
Fit the HDP topic model to corpus.
§Parameters
corpus: slice of documents, each a Vec<usize> of word indices (all must be < vocab_size).
vocab_size: vocabulary size.
config: hyperparameters and iteration counts.
§Errors
Returns TopicError::EmptyCorpus when corpus is empty, and
TopicError::WordOutOfVocab when any word index is ≥ vocab_size.
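The two error conditions above can be sketched as a standalone pre-check. This is a minimal illustration, not the crate's code: SketchError and validate_corpus are hypothetical names, and the real TopicError variants may carry different payloads (or none).

```rust
/// Hypothetical stand-in for the crate's `TopicError`.
/// The payload on `WordOutOfVocab` is an assumption made for illustration.
#[derive(Debug, PartialEq)]
enum SketchError {
    EmptyCorpus,
    WordOutOfVocab { doc: usize, word: usize },
}

/// Checks the two documented preconditions of `fit`:
/// a non-empty corpus and every word index strictly below `vocab_size`.
fn validate_corpus(corpus: &[Vec<usize>], vocab_size: usize) -> Result<(), SketchError> {
    if corpus.is_empty() {
        return Err(SketchError::EmptyCorpus);
    }
    for (doc, words) in corpus.iter().enumerate() {
        // Report the first offending index in document order.
        if let Some(&word) = words.iter().find(|&&w| w >= vocab_size) {
            return Err(SketchError::WordOutOfVocab { doc, word });
        }
    }
    Ok(())
}
```

Running a check like this before calling fit lets a caller report which document holds the offending index, assuming the library's own error does not already say so.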
pub fn transform(&self, doc: &[usize]) -> Vec<f64>
Infer the topic distribution for an unseen document.
Returns a vector of length t_max that sums to 1.0, with each entry
representing the proportion of the document’s content assigned to that
topic.
Word indices ≥ vocab_size are silently skipped.
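To make the contract of transform concrete (a normalized vector, with out-of-vocabulary indices skipped), here is a simplified fold-in sketch. It is not the crate's inference routine: infer_topic_mix is a hypothetical helper, the per-word responsibility weighting is a swapped-in simplification, and the uniform fallback for documents with no in-vocabulary words is an assumption.

```rust
/// Simplified fold-in over a fixed topic-word matrix `phi`
/// (each row is assumed to have at least `vocab_size` entries).
/// NOT the library's algorithm; it only illustrates the output contract.
fn infer_topic_mix(phi: &[Vec<f64>], doc: &[usize], vocab_size: usize) -> Vec<f64> {
    let k = phi.len();
    let mut weights = vec![0.0_f64; k];
    // Out-of-vocabulary indices are skipped, as the documentation describes.
    for &w in doc.iter().filter(|&&w| w < vocab_size) {
        // Share of this word attributed to each topic, normalized per word.
        let total: f64 = phi.iter().map(|topic| topic[w]).sum();
        if total > 0.0 {
            for (t, topic) in phi.iter().enumerate() {
                weights[t] += topic[w] / total;
            }
        }
    }
    let sum: f64 = weights.iter().sum();
    if sum > 0.0 {
        // Normalize so the proportions sum to 1.0.
        weights.iter().map(|x| x / sum).collect()
    } else {
        // Assumption: fall back to a uniform mix when no word was usable.
        vec![1.0 / k as f64; k]
    }
}
```

With two topics that favor words 0 and 1 respectively, a document made mostly of word 0 gets most of its mass on the first topic, and an index outside the vocabulary has no effect on the result.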
pub fn topics(&self) -> &[Vec<f64>]
Return all topic-word distributions, both active and inactive.
The outer slice has length t_max; each inner vector has length
vocab_size. Inactive topics hold the uniform distribution implied by the
prior η.
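A common use of the rows returned by topics() is to list the most probable words of a topic. The helper below is a sketch under that assumption; top_words is a hypothetical name and is not part of the crate.

```rust
/// Returns the indices of the `n` highest-probability words in one
/// topic-word distribution, most probable first.
/// Ties keep their original index order because `sort_by` is stable.
fn top_words(topic: &[f64], n: usize) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..topic.len()).collect();
    // Sort descending by probability; `total_cmp` gives a total order on f64.
    idx.sort_by(|&a, &b| topic[b].total_cmp(&topic[a]));
    idx.truncate(n);
    idx
}
```

Applied to a uniform row, such as the one described above for an inactive topic, the helper simply returns the first n indices, which is one quick way to spot topics that carry no information.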
pub fn num_topics_inferred(&self) -> usize
Number of topics with at least one word token assigned after Gibbs sampling (approximates the model’s belief about how many topics the corpus requires).
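The counting rule described above can be sketched from a flat list of per-token topic assignments. How the crate stores assignments internally is not documented here, so the input layout and the name count_active_topics are assumptions.

```rust
/// Counts topics that own at least one token.
/// `assignments[i]` is the topic index assigned to token i (assumed layout);
/// indices at or above `t_max` are ignored.
fn count_active_topics(assignments: &[usize], t_max: usize) -> usize {
    let mut counts = vec![0usize; t_max];
    for &z in assignments {
        if z < t_max {
            counts[z] += 1;
        }
    }
    // A topic is "active" when its token count is non-zero.
    counts.iter().filter(|&&c| c > 0).count()
}
```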
Trait Implementations§
Auto Trait Implementations§
impl Freeze for HdpTopicModel
impl RefUnwindSafe for HdpTopicModel
impl Send for HdpTopicModel
impl Sync for HdpTopicModel
impl Unpin for HdpTopicModel
impl UnsafeUnpin for HdpTopicModel
impl UnwindSafe for HdpTopicModel
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
impl<T> Pointable for T
impl<SS, SP> SupersetOf<SS> for SP
where
SS: SubsetOf<SP>,
fn to_subset(&self) -> Option<SS>
The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
fn is_in_subset(&self) -> bool
Checks if self is actually part of its subset T (and can be converted to it).
fn to_subset_unchecked(&self) -> SS
Use with care! Same as self.to_subset but without any property checks. Always succeeds.
fn from_subset(element: &SS) -> SP
The inclusion map: converts self to the equivalent element of its superset.