# Sensitive-rs
A high-performance Rust crate for multi-pattern string matching, validation, filtering, and replacement.
## Features
- Find all sensitive words: `find_all`
- Check whether text contains sensitive words: `validate`
- Remove sensitive words: `filter`
- Replace sensitive words with a character: `replace`
- Multi-algorithm engine: Aho-Corasick, Wu-Manber, Regex
- Noise removal via configurable regex
- Variant detection (pinyin, visually similar characters)
- Parallel search with `rayon`
- LRU cache for hot queries
- Batch processing: `find_all_batch`
- Layered matching: `find_all_layered`
- Streaming processing: `find_all_streaming`
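
The noise-removal feature listed above can be pictured with a small standard-library sketch. It is not the crate's implementation: sensitive-rs uses a configurable regex, whereas this sketch simply drops every non-alphanumeric character. `strip_noise` and `find_after_noise_removal` are hypothetical names introduced only for illustration.

```rust
/// Hypothetical helper (not part of sensitive-rs): drop every character that
/// is not alphanumeric, so "r-u-s-t" and "r u s t" both become "rust".
/// The crate itself performs this step with a configurable regex.
fn strip_noise(text: &str) -> String {
    text.chars().filter(|c| c.is_alphanumeric()).collect()
}

/// Hypothetical helper: return every entry of `words` that occurs in `text`
/// once the noise characters have been stripped from the text.
fn find_after_noise_removal<'a>(text: &str, words: &[&'a str]) -> Vec<&'a str> {
    let cleaned = strip_noise(text);
    words
        .iter()
        .copied()
        .filter(|w| cleaned.contains(*w))
        .collect()
}
```

With these helpers, `strip_noise("r-u-s-t")` yields `rust`, so an obfuscated spelling matches the same dictionary entry as the plain one. Stripping spaces also joins neighbouring words, which can produce matches across word boundaries; a narrower noise pattern gives fewer false positives at the cost of missing some obfuscations.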
## Installation
Add to your `Cargo.toml`:
```toml
[dependencies]
sensitive-rs = "0.8.0"
```
## Quick Start
```rust
use sensitive_rs::Filter;

fn main() {
    let mut filter = Filter::new();
    // "敏感词" is Chinese for "sensitive word".
    filter.add_words(&["rust", "filter", "敏感词"]);

    let text = "hello rust, this is a filter demo 包含敏感词";

    // Collect every sensitive word that occurs in the text.
    let found = filter.find_all(text);
    println!("Found: {:?}", found);

    // Mask the sensitive words with '*'.
    let cleaned = filter.replace(text, '*');
    println!("Cleaned: {}", cleaned);
}
```
## Advanced Usage
Batch processing:
```rust
let texts = vec!["text1", "text2"];
let results = filter.find_all_batch(&texts);
```
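
Conceptually, a batch search runs the same lookup over many independent texts, which makes it straightforward to parallelize. The following is a minimal sketch of that idea using only the standard library. It assumes a naive substring search and one scoped thread per text, whereas the crate lists `rayon` for its parallel search. `find_words` and `find_batch` are hypothetical names, not part of sensitive-rs.

```rust
use std::thread;

/// Hypothetical stand-in for a per-text search: every entry of `words`
/// that occurs in `text`, found by naive substring search.
fn find_words<'a>(text: &str, words: &[&'a str]) -> Vec<&'a str> {
    words.iter().copied().filter(|w| text.contains(*w)).collect()
}

/// Search each text on its own scoped thread and return the per-text
/// results in the same order as the input slice.
fn find_batch<'a>(texts: &[&str], words: &[&'a str]) -> Vec<Vec<&'a str>> {
    thread::scope(|s| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = texts
            .iter()
            .map(|&text| s.spawn(move || find_words(text, words)))
            .collect();
        // Joining in spawn order keeps results aligned with the inputs.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

One thread per text keeps the sketch short but does not scale to large batches; a work-stealing pool such as the one `rayon` provides bounds the number of threads.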
Layered matching:
```rust
let layered = filter.find_all_layered("some long text");
```
Streaming large files:
```rust
use std::fs::File;
use std::io::BufReader;
let reader = BufReader::new(File::open("large.txt")?);
let stream_results = filter.find_all_streaming(reader)?;
```
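
To illustrate why a streaming interface suits large files, here is a minimal standard-library sketch that scans any `BufRead` source one line at a time, so memory use depends on the longest line rather than the file size. It is an assumption about how such a scan could work, not the crate's implementation; `find_streaming` and its `(line_number, word)` result shape are hypothetical.

```rust
use std::io::{self, BufRead};

/// Hypothetical helper (not the crate's API): scan `reader` line by line and
/// return `(line_number, word)` for every listed word found on that line.
/// Line numbers are 1-based, and only one line is held in memory at a time.
fn find_streaming<R: BufRead>(
    reader: R,
    words: &[&str],
) -> io::Result<Vec<(usize, String)>> {
    let mut hits = Vec::new();
    for (index, line) in reader.lines().enumerate() {
        // Propagate I/O errors and invalid UTF-8 to the caller.
        let line = line?;
        for word in words {
            if line.contains(*word) {
                hits.push((index + 1, (*word).to_string()));
            }
        }
    }
    Ok(hits)
}
```

Because the sketch matches within single lines, a word split across a line break is not detected. Any `BufRead` value works as input, including a `BufReader` wrapped around a `File` or an in-memory `std::io::Cursor`.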
## CLI Usage
Install with the `cli` feature:
```toml
[dependencies]
sensitive-rs = { version = "0.8.0", features = ["cli"] }
```
Or install directly:
```sh
cargo install sensitive-rs --features cli
```
Both `sensitive` and `sensitive-rs` commands are available after installation.
### Commands
```sh
# Find sensitive words
sensitive check "含有赌博和色情内容"
# Validate (exit 1 if sensitive words found)
sensitive validate "clean text"
# Replace sensitive words
sensitive replace '*' "含有赌博内容"
# Remove sensitive words
sensitive filter "含有赌博内容"
# Read from file
sensitive check --file input.txt
```
Input can also be piped from stdin.
### Options
- `--dict <path>` — custom dictionary file
- `--dict-all` — use extended dictionary (27k words)
- `--algorithm <algo>` — force algorithm: `aho-corasick`, `wumanber`, `regex`
- `--variant` — enable pinyin and shape variant detection
- `--noise-pattern <regex>` — custom noise removal regex
- `--json` — JSON output format
- `--color` — force colored output
## Documentation
See the [API documentation on docs.rs](https://docs.rs/sensitive-rs) for details.
## License
Licensed under either of
* Apache License, Version 2.0, [LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0
* MIT license [LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT
at your option.
## Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as
defined in the Apache-2.0 or MIT license, shall be dual licensed as above, without any additional terms or conditions.