Struct jpreprocess::JPreprocess
pub struct JPreprocess { /* private fields */ }
Implementations
impl JPreprocess
pub fn from_config(config: JPreprocessConfig) -> JPreprocessResult<Self>
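Creates a JPreprocess by loading the dictionaries described in a JPreprocessConfig.
The system dictionary can be loaded from files or from a built-in (bundled) dictionary; the bundled option requires a feature to be enabled.
If you need to build from dictionary data you already hold, please use with_dictionaries instead.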
Example 1: Load from file
use jpreprocess::*;
let config = JPreprocessConfig {
dictionary: SystemDictionaryConfig::File(path),
user_dictionary: None,
};
let jpreprocess = JPreprocess::from_config(config)?;
Example 2: Load bundled dictionary (this requires a feature to be enabled)
use jpreprocess::{*, kind::*};
let config = JPreprocessConfig {
dictionary: SystemDictionaryConfig::Bundled(JPreprocessDictionaryKind::NaistJdic),
user_dictionary: None,
};
let jpreprocess = JPreprocess::from_config(config)?;
pub fn with_dictionaries(dictionary: Dictionary, user_dictionary: Option<UserDictionary>) -> Self
Creates a JPreprocess from the provided dictionary data.
pub fn new(dictionary: Dictionary, user_dictionary: Option<UserDictionary>) -> Self
Deprecated since 0.5.0: please use with_dictionaries instead.
Alias of with_dictionaries.
Note: the new function as it existed before v0.2.0 has moved to from_config.
pub fn text_to_njd(&self, text: &str) -> JPreprocessResult<NJD>
Tokenize input text and return NJD.
Useful for customizing text processing.
use jpreprocess::*;
use jpreprocess_jpcommon::*;
let jpreprocess = JPreprocess::from_config(config)?;
let mut njd = jpreprocess.text_to_njd("日本語文を解析し、音声合成エンジンに渡せる形式に変換します.")?;
njd.preprocess();
// jpcommon utterance
let utterance = Utterance::from(njd.nodes.as_slice());
// Vec<([phoneme string], [context labels])>
let phoneme_vec = utterance_to_phoneme_vec(&utterance);
assert_eq!(&phoneme_vec[2].0, "i");
assert!(phoneme_vec[2].1.starts_with("/A:-3+1+7"));
// fullcontext label
let fullcontext = overwrapping_phonemes(phoneme_vec);
assert!(fullcontext[2].starts_with("sil^n-i+h=o"));
pub fn run_frontend(&self, text: &str) -> JPreprocessResult<Vec<String>>
Tokenizes a text, preprocesses it, and returns the NJD converted to strings.
The returned strings do not match those of openjtalk: JPreprocess drops the orig string and some of the CForm information, which are unnecessary for preprocessing.
If you need this information, please raise a feature request as an issue.
pub fn make_label(&self, njd_features: Vec<String>) -> Vec<String>
Generates jpcommon features from NJD features (as returned by run_frontend).
pub fn extract_fullcontext(&self, text: &str) -> JPreprocessResult<Vec<String>>
Generates jpcommon features from a text.
The result is not guaranteed to be the same as calling run_frontend followed by make_label.
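The full-context labels asserted in the text_to_njd example start with a phoneme quintuple such as sil^n-i+h=o, followed by context fields such as /A:-3+1+7. The sketch below shows one way the leading quintuple could be split apart using only the standard library. It is not part of jpreprocess: PhonemeContext and parse_phoneme_context are hypothetical names introduced for illustration, and the p1^p2-p3+p4=p5 layout is assumed from the HTS-style label format those assertions suggest.

```rust
/// Hypothetical helper type (not part of jpreprocess): the five phonemes
/// at the start of a full-context label, assumed to be p1^p2-p3+p4=p5.
#[derive(Debug, PartialEq)]
struct PhonemeContext<'a> {
    prev2: &'a str,
    prev1: &'a str,
    current: &'a str,
    next1: &'a str,
    next2: &'a str,
}

/// Splits the leading `p1^p2-p3+p4=p5` part of a label into its phonemes.
/// Returns None if any of the expected delimiters is missing.
fn parse_phoneme_context(label: &str) -> Option<PhonemeContext<'_>> {
    // Everything before the first '/' is the phoneme part;
    // the remainder holds the context fields (/A:, ...).
    let head = label.split('/').next()?;
    let (prev2, rest) = head.split_once('^')?;
    let (prev1, rest) = rest.split_once('-')?;
    let (current, rest) = rest.split_once('+')?;
    let (next1, next2) = rest.split_once('=')?;
    Some(PhonemeContext { prev2, prev1, current, next1, next2 })
}

fn main() {
    // Assembled from the two prefixes asserted in the text_to_njd example;
    // a real label continues with further context fields.
    let label = "sil^n-i+h=o/A:-3+1+7";
    let ctx = parse_phoneme_context(label)
        .expect("label should start with p1^p2-p3+p4=p5");
    println!("{:?}", ctx);
}
```

In practice the input would be one element of the Vec<String> returned by extract_fullcontext or make_label. Returning Option keeps the helper usable on strings that turn out not to follow the assumed layout.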