budouy
Rust port of BudouX with optional HTML processing, WebAssembly support, and a small CLI.
Features
std: default feature for std-enabled builds.
alloc: no_std-compatible build using alloc and hashbrown.
vendored-models: bundles default Japanese, Simplified Chinese, Traditional Chinese, and Thai models.
html: enables HTML processing utilities based on kuchikikiki (requires std).
cli: enables the budouy CLI (requires std, implies vendored-models).
wasm: enables WebAssembly bindings via wasm-bindgen (implies alloc and vendored-models).
Note: std and alloc are mutually exclusive.
Usage
Library
Custom model:
use std::collections::HashMap;
use ;
use FeatureKey;
let mut model: Model = HashMap::new();
model.insert;
let parser = new;
let chunks = parser.parse;
assert_eq!;
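To illustrate what a hand-built model encodes, here is a minimal sketch of BudouX-style scoring. It does not use budouy's API: `ToyModel`, `example_model`, and `toy_parse` are hypothetical names, and only the UW3 and UW4 unigram features are considered, whereas the BudouX reference parser also uses further unigram, bigram, and trigram features. The rule shown is the one used by BudouX: start from minus half the sum of all weights, add the weights of the features found around a position, and break before that position when the score is positive.

```rust
use std::collections::HashMap;

/// Hypothetical toy model: feature name -> (character -> weight).
type ToyModel = HashMap<&'static str, HashMap<char, i32>>;

/// Builds a one-entry model: seeing 'a' right after a boundary (UW4) scores 10000.
fn example_model() -> ToyModel {
    let mut model: ToyModel = HashMap::new();
    model.insert("UW4", HashMap::from([('a', 10_000)]));
    model
}

/// Splits `text` with a simplified BudouX-style scorer (UW3 and UW4 only).
fn toy_parse(model: &ToyModel, text: &str) -> Vec<String> {
    let chars: Vec<char> = text.chars().collect();
    if chars.is_empty() {
        return Vec::new();
    }
    // Sum of every weight in the model; the base score is minus half of it.
    let total: i32 = model.values().flat_map(|m| m.values()).sum();
    // Looks up a weight, defaulting to 0 when the feature or character is absent.
    let weight = |feature: &str, c: char| -> i32 {
        model.get(feature).and_then(|m| m.get(&c)).copied().unwrap_or(0)
    };
    let mut chunks = vec![chars[0].to_string()];
    for i in 1..chars.len() {
        // Work with twice the score so the half stays in integer arithmetic.
        // UW3 is the character before the candidate boundary, UW4 the one after.
        let doubled = -total + 2 * (weight("UW3", chars[i - 1]) + weight("UW4", chars[i]));
        if doubled > 0 {
            chunks.push(chars[i].to_string()); // boundary: start a new chunk
        } else {
            chunks.last_mut().unwrap().push(chars[i]); // no boundary: extend
        }
    }
    chunks
}
```

With `example_model()`, `toy_parse(&example_model(), "abcdeabcd")` breaks only before the second `a`, because that is the only position where the UW4 weight outweighs the negative base score.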
Default model (requires vendored-models):
use load_default_japanese_parser;
let parser = load_default_japanese_parser();
let chunks = parser.parse("今日は良い天気です");
println!("{:?}", chunks);
HTML processing (requires html + vendored-models):
use HTMLProcessingParser;
use load_default_japanese_parser;
let parser = load_default_japanese_parser();
let html_parser = new;
let input = "今日は<strong>良い</strong>天気です";
let output = html_parser.translate_html_string(input);
println!("{}", output);
WebAssembly
Build for web (requires wasm-pack):
Use from JavaScript:
import init from './pkg/budouy.js';
await init();
const parser = ;
const chunks = parser.;
console.log(chunks); // ["今日は", "良い", "天気です"]
// Other languages
const zhHans = ;
const zhHant = ;
const thai = ;
CLI
Build and run the CLI (requires cli):
Use a custom model JSON:
Read from stdin:
|
no_std
This crate supports no_std with alloc. Disable default features and enable alloc:
budouy = { version = "0.1", default-features = false, features = ["alloc"] }
std and alloc are mutually exclusive. The html and cli features require std.
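A common way for a crate to enforce mutually exclusive Cargo features is a `compile_error!` guard. The sketch below shows that general pattern only; it is an assumption, not taken from budouy's source, and `backend` is a hypothetical helper added for illustration.

```rust
// Assumption: generic pattern for mutually exclusive features, not budouy's code.
// The guard fires only when both features are enabled in the same build.
#[cfg(all(feature = "std", feature = "alloc"))]
compile_error!("features `std` and `alloc` are mutually exclusive; enable only one");

/// Hypothetical helper: reports which backend feature was selected at compile time.
pub fn backend() -> &'static str {
    if cfg!(feature = "std") {
        "std"
    } else if cfg!(feature = "alloc") {
        "alloc"
    } else {
        "none"
    }
}
```

Compiled on its own with neither feature enabled, `backend()` returns `"none"`; in a crate that declares both features, enabling the two together stops the build with the message above.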
Models
Vendored models in src/models/*.json are derived from the original BudouX
project (Google) and are licensed under Apache-2.0. See LICENSE for details.
This project is not affiliated with Google.
License
Apache-2.0. See LICENSE.