Crate mecab_ko_dict_validator

MeCab-Ko Dictionary Validator

This crate validates MeCab dictionary files. It checks CSV format, POS tags, costs, encoding, duplicate entries, and Unicode normalization.

§Features

  • CSV format validation
  • POS tag validation with Korean tag support
  • Cost range checking
  • Duplicate entry detection
  • Unicode normalization validation
  • UTF-8 encoding validation
  • Customizable validation rules via configuration files
  • JSON and text report formats
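
To make a few of the checks listed above concrete, here is a minimal, standard-library-only sketch of field-count, cost-range, and duplicate checks on dictionary CSV lines. It is not this crate's implementation: the `Issue` enum, the `check_lines` function, the five-field minimum, the 16-bit cost bounds, and the (surface, POS tag) duplicate key are all assumptions made for illustration. It relies only on the standard MeCab layout in which the first four columns are surface, left ID, right ID, and cost, followed by the POS tag.

```rust
use std::collections::HashMap;

/// A problem found on one CSV line (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Issue {
    TooFewFields { line: usize, found: usize },
    InvalidCost { line: usize, raw: String },
    CostOutOfRange { line: usize, cost: i64 },
    Duplicate { line: usize, first_seen: usize },
}

// Assumed minimum: surface, left_id, right_id, cost, POS tag.
const MIN_FIELDS: usize = 5;
// Assumed bounds: the range of a signed 16-bit integer.
const MIN_COST: i64 = -32768;
const MAX_COST: i64 = 32767;

/// Checks every non-empty line and returns the issues in line order.
fn check_lines(input: &str) -> Vec<Issue> {
    let mut issues = Vec::new();
    // Maps (surface, POS tag) to the line number where it first appeared.
    let mut seen: HashMap<(String, String), usize> = HashMap::new();

    for (idx, line) in input.lines().enumerate() {
        let line_no = idx + 1;
        if line.trim().is_empty() {
            continue;
        }

        // Naive split: quoted fields containing commas are not handled.
        let fields: Vec<&str> = line.split(',').collect();
        if fields.len() < MIN_FIELDS {
            issues.push(Issue::TooFewFields { line: line_no, found: fields.len() });
            continue;
        }

        // The cost is the fourth column and must be an in-range integer.
        match fields[3].trim().parse::<i64>() {
            Ok(cost) if !(MIN_COST..=MAX_COST).contains(&cost) => {
                issues.push(Issue::CostOutOfRange { line: line_no, cost });
            }
            Ok(_) => {}
            Err(_) => issues.push(Issue::InvalidCost {
                line: line_no,
                raw: fields[3].to_string(),
            }),
        }

        // Treat a repeated (surface, POS tag) pair as a duplicate entry.
        let key = (fields[0].to_string(), fields[4].to_string());
        if let Some(&first_seen) = seen.get(&key) {
            issues.push(Issue::Duplicate { line: line_no, first_seen });
        } else {
            seen.insert(key, line_no);
        }
    }
    issues
}
```

Calling `check_lines` on the contents of a dictionary file returns an empty vector when none of these three checks fail. A real validator would also need a proper CSV parser, since the naive `split(',')` here breaks on quoted surfaces that contain commas.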

§Example

use mecab_ko_dict_validator::DictValidator;

let validator = DictValidator::with_defaults();
let report = validator.validate_file("dictionary.csv")
    .expect("Failed to validate dictionary");

if report.is_valid() {
    println!("Dictionary is valid!");
} else {
    println!("{}", report.to_text());
}

§Re-exports

pub use analyzer::AnalysisReport;
pub use analyzer::ConsistencyIssues;
pub use analyzer::CostDistribution;
pub use analyzer::DictAnalyzer;
pub use analyzer::HistogramBin;
pub use analyzer::OutlierInfo;
pub use analyzer::PosDistribution;
pub use analyzer::PosTagStat;
pub use analyzer::Recommendation;
pub use analyzer::RecommendationSeverity;
pub use config::load_config;
pub use config::save_config;
pub use config::ConfigError;
pub use report::IssueCategory;
pub use report::Location;
pub use report::Severity;
pub use report::ValidationIssue;
pub use report::ValidationReport;
pub use report::ValidationStatistics;
pub use rules::CostRules;
pub use rules::CsvRules;
pub use rules::DuplicateRules;
pub use rules::EncodingRules;
pub use rules::NormalizationForm;
pub use rules::NormalizationRules;
pub use rules::PosRules;
pub use rules::ValidationConfig;
pub use validator::DictEntry;
pub use validator::DictValidator;
pub use validator::ValidationError;

§Modules

analyzer
Dictionary quality analysis and statistics.
config
Configuration file handling for validation rules.
report
Validation report generation and formatting.
rules
Validation rules for MeCab dictionary entries.
validator
Main dictionary validation logic.