Crate word_filter

A Word Filter for filtering text.

A Word Filter is a system for identifying and censoring specific words or phrases in strings. Common uses include censoring vulgar or profane language and preventing spam or vandalism in user-provided content.

The Word Filter implementation provided here allows for advanced filtering functionality, including:

  • Finding and censoring filtered words.
  • Ignoring words that are considered “exceptions”.
  • Allowing specification of “aliases”, i.e. strings that can replace other strings (for example, an alias could be created to replace the letter “a” with the character “@”).
  • Ignoring specified separators (such as spaces or other characters) between letters of filtered words.
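To make the four features above concrete, here is a minimal, standard-library-only sketch of how such matching could work. It is not this crate's implementation: the helper names `match_at` and `censor` are hypothetical, and the sketch assumes single-character separators and aliases, exceptions that are matched as exact prefixes, and `*` as the replacement character.

```rust
/// Hypothetical helper: tries to match `word` at the start of `text`,
/// skipping separators between letters and accepting aliases for letters.
/// Returns the number of bytes consumed on a match.
fn match_at(
    text: &str,
    word: &str,
    separators: &[char],
    aliases: &[(char, char)],
) -> Option<usize> {
    let mut chars = text.char_indices().peekable();
    let mut end = 0;
    for (i, expected) in word.chars().enumerate() {
        // Separators are skipped only between letters, never before the first.
        if i > 0 {
            while let Some(&(_, c)) = chars.peek() {
                if separators.contains(&c) {
                    chars.next();
                } else {
                    break;
                }
            }
        }
        let (idx, actual) = chars.next()?;
        // An alias pair (source, alias) lets `alias` stand in for `source`.
        let is_alias = aliases
            .iter()
            .any(|&(source, alias)| source == expected && alias == actual);
        if actual != expected && !is_alias {
            return None;
        }
        end = idx + actual.len_utf8();
    }
    Some(end)
}

/// Hypothetical helper: replaces every character of each match with `*`,
/// copying exceptions through unchanged.
fn censor(
    text: &str,
    words: &[&str],
    exceptions: &[&str],
    separators: &[char],
    aliases: &[(char, char)],
) -> String {
    let mut out = String::with_capacity(text.len());
    let mut pos = 0;
    while pos < text.len() {
        let rest = &text[pos..];
        // Exceptions take priority over filtered words.
        if let Some(exc) = exceptions
            .iter()
            .find(|e| !e.is_empty() && rest.starts_with(**e))
        {
            out.push_str(exc);
            pos += exc.len();
            continue;
        }
        let hit = words
            .iter()
            .filter(|w| !w.is_empty())
            .find_map(|w| match_at(rest, w, separators, aliases));
        match hit {
            Some(len) => {
                // Censor the whole matched span, separators included.
                out.extend(rest[..len].chars().map(|_| '*'));
                pos += len;
            }
            None => {
                // No match here: copy one character and move on.
                let c = rest.chars().next().unwrap();
                out.push(c);
                pos += c.len_utf8();
            }
        }
    }
    out
}
```

Under these assumptions, calling `censor` with the word "foo", the exception "foobar", the separators ' ' and '_', and the alias pair ('f', 'F') censors "foo", "f o_o", and "Foo" while leaving "foobar" untouched. A real filter would typically precompile the words into an automaton rather than rescanning at every position as this sketch does.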

Usage

A WordFilter can be created using a WordFilterBuilder as follows:

use word_filter::WordFilterBuilder;

let filter = WordFilterBuilder::new()
    .words(&["foo"])
    .exceptions(&["foobar"])
    .separators(&[" ", "_"])
    .aliases(&[("f", "F")])
    .build();

// The word filter will both identify and censor the word "foo".
assert_eq!(filter.censor("Should censor foo"), "Should censor ***");

// The word filter does not identify or censor the exception "foobar".
assert_eq!(filter.censor("Should not censor foobar"), "Should not censor foobar");

// The word filter will ignore separators while matching.
assert_eq!(filter.censor("Should censor f o_o"), "Should censor *****");

// The word filter checks for aliases while matching.
assert_eq!(filter.censor("Should censor Foo"), "Should censor ***");

Modules

censor

Macros for creating censors to be used in a WordFilter.

Macros

_replace_graphemes_with

Creates a censor replacing every grapheme with the given string.

_replace_words_with

Creates a censor replacing the full matched words with the given string.
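The difference between the two censor styles described above can be sketched with plain closures. This is only an illustration, not the macros' expansion: the function names `replace_chars_with` and `replace_word_with` are hypothetical, and the first one iterates over `char`s because the standard library has no grapheme segmentation, so multi-codepoint graphemes would be counted more than once.

```rust
/// Hypothetical stand-in for a per-grapheme censor: repeats `replacement`
/// once for every `char` of the matched text (an approximation of graphemes).
fn replace_chars_with(replacement: &str) -> impl Fn(&str) -> String + '_ {
    move |matched: &str| matched.chars().map(|_| replacement).collect()
}

/// Hypothetical stand-in for a whole-word censor: ignores the length of the
/// matched text and returns `replacement` exactly once.
fn replace_word_with(replacement: &str) -> impl Fn(&str) -> String + '_ {
    move |_matched: &str| replacement.to_string()
}
```

With these definitions, a per-character censor preserves the length of the match (useful when the reader should see how long the hidden word was), while a whole-word censor produces a fixed-length marker regardless of what was matched.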

Structs

WordFilter

A word filter for identifying filtered words within strings.

WordFilterBuilder

A non-consuming builder for a WordFilter.

Enums

RepeatedCharacterMatchMode

The strategy a WordFilter should use to match repeated characters.