focaccia 1.0.1

no_std implementation of Unicode case folding comparisons
focaccia

Unicode case folding methods for case-insensitive string comparisons. Focaccia is used to implement case folding operations on the Symbol and String classes in the Ruby Core implementation in Artichoke Ruby.

Focaccia supports full, ASCII, and Turkic Unicode case folding equality comparisons. ASCII folding supports determining case-insensitive ordering.

One of the most common things that software developers do is "normalize" text for the purposes of comparison. And one of the most basic ways that developers are taught to normalize text for comparison is to compare it in a "case insensitive" fashion. In other cases, developers want to compare strings in a case sensitive manner. Unicode defines upper, lower, and title case properties for characters, plus special cases that impact specific language's use of text. (W3C, Case Folding)

focaccia is a flat Italian bread. The focaccia crate compares UTF-8 strings by flattening them to folded downcase. Artichoke goes well with focaccia.

Usage

Add this to your Cargo.toml:

[dependencies]
focaccia = "1.0"

Then make case-insensitive string comparisons:

use core::cmp::Ordering;
use focaccia::CaseFold;

let fold = CaseFold::Full;
assert_eq!(fold.casecmp("MASSE", "Maße"), Ordering::Equal);
assert_eq!(fold.casecmp("São Paulo", "Sao Paulo"), Ordering::Greater);

assert!(fold.case_eq("MASSE", "Maße"));
assert!(!fold.case_eq("São Paulo", "Sao Paulo"));
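
To illustrate why "MASSE" and "Maße" compare equal under full folding, here is a toy sketch that uses only the standard library. It is not Focaccia's implementation: `toy_fold` and `toy_casecmp` are hypothetical names, and the fold handles a single full-folding special case (U+00DF folds to "ss") and otherwise falls back to `char::to_lowercase`, which only approximates Unicode case folding.

```rust
use core::cmp::Ordering;

// Hypothetical helper: fold one character to zero or more folded characters.
// Only U+00DF (LATIN SMALL LETTER SHARP S) gets its full folding "ss";
// every other character falls back to `char::to_lowercase`.
fn toy_fold(ch: char) -> impl Iterator<Item = char> {
    let special: &'static str = if ch == '\u{00DF}' { "ss" } else { "" };
    let fallback = if special.is_empty() {
        Some(ch.to_lowercase())
    } else {
        None
    };
    special.chars().chain(fallback.into_iter().flatten())
}

// Hypothetical helper: compare two strings by their folded character streams.
fn toy_casecmp(left: &str, right: &str) -> Ordering {
    left.chars()
        .flat_map(toy_fold)
        .cmp(right.chars().flat_map(toy_fold))
}
```

Because the comparison runs over folded characters rather than the original ones, a one-character "ß" lines up with the two-character "SS", while "ã" and "a" stay distinct because folding does not strip diacritics.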

For text known to be ASCII, Focaccia can perform a faster comparison:

use core::cmp::Ordering;
use focaccia::CaseFold;

let fold = CaseFold::Ascii;
assert_eq!(fold.casecmp("Crate: focaccia", "Crate: FOCACCIA"), Ordering::Equal);
assert_eq!(fold.casecmp("Fabled", "failed"), Ordering::Less);

assert!(fold.case_eq("Crate: focaccia", "Crate: FOCACCIA"));
assert!(!fold.case_eq("Fabled", "failed"));

ASCII case-insensitive comparisons can also be made on byte slices:

use core::cmp::Ordering;
use focaccia::{ascii_casecmp, ascii_case_eq};

assert_eq!(ascii_casecmp(b"Artichoke Ruby", b"artichoke ruby"), Ordering::Equal);
assert!(ascii_case_eq(b"Artichoke Ruby", b"artichoke ruby"));
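
For intuition, an ASCII-only comparison of this kind can be sketched with the standard library alone. The function names below are hypothetical and this is not necessarily how Focaccia implements it; the sketch assumes each byte is folded to ASCII lowercase before comparing, and that non-ASCII bytes are compared unchanged.

```rust
use core::cmp::Ordering;

// Hypothetical sketch: order two byte slices after lowercasing ASCII letters.
// Bytes outside the ASCII uppercase range are left as they are.
fn ascii_casecmp_sketch(left: &[u8], right: &[u8]) -> Ordering {
    let left = left.iter().map(u8::to_ascii_lowercase);
    let right = right.iter().map(u8::to_ascii_lowercase);
    left.cmp(right)
}

// Hypothetical sketch: equality only, delegating to the standard library.
fn ascii_case_eq_sketch(left: &[u8], right: &[u8]) -> bool {
    left.eq_ignore_ascii_case(right)
}
```

No lookup tables or allocation are needed here, which is why an ASCII-only path can be cheaper than full Unicode folding.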

Turkic case folding is similar to full case folding with additional mappings for dotted and dotless I:

use core::cmp::Ordering;
use focaccia::CaseFold;

let fold = CaseFold::Turkic;
assert!(matches!(fold.casecmp("İstanbul", "istanbul"), Ordering::Equal));
assert!(!matches!(fold.casecmp("İstanbul", "Istanbul"), Ordering::Equal));

assert!(fold.case_eq("İstanbul", "istanbul"));
assert!(!fold.case_eq("İstanbul", "Istanbul"));
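
The difference between the two modes comes down to two code points in CaseFolding.txt: U+0049 (I) and U+0130 (İ). The sketch below writes out just those mappings to show why the assertions above hold. `fold_i_sketch` is a hypothetical helper for illustration, not part of Focaccia's API.

```rust
// Hypothetical helper: returns the folded form of `ch` for the two code
// points whose mappings differ between default full folding and Turkic
// folding, or `None` for any other character.
fn fold_i_sketch(ch: char, turkic: bool) -> Option<&'static str> {
    match (ch, turkic) {
        // U+0049 LATIN CAPITAL LETTER I: status C maps to U+0069.
        ('I', false) => Some("i"),
        // U+0049 under status T maps to U+0131 (dotless i).
        ('I', true) => Some("\u{0131}"),
        // U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE: status F maps to
        // U+0069 followed by U+0307 (combining dot above).
        ('\u{0130}', false) => Some("i\u{0307}"),
        // U+0130 under status T maps to plain U+0069.
        ('\u{0130}', true) => Some("i"),
        _ => None,
    }
}
```

Under Turkic folding, İ folds to i, so "İstanbul" matches "istanbul", while I folds to dotless ı, so "Istanbul" does not.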

Implementation

Focaccia generates conversion tables from Unicode data files. Focaccia implements case folding as defined in Unicode 13 (see CaseFolding.txt).

no_std

Focaccia is no_std compatible, with an optional dependency on std that is enabled by default. Unlike unicase, Focaccia does not link to alloc in its no_std configuration.

Crate features

All features are enabled by default.

  • std - Enable linking to the Rust Standard Library. Enabling this feature adds Error implementations to error types in this crate.
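
To build without std, a dependency declaration along these lines should work. It assumes Cargo's standard `default-features` switch and reuses the version requirement from the Usage section:

```toml
[dependencies]
# Turn off the default `std` feature for a no_std build.
focaccia = { version = "1.0", default-features = false }
```

With the std feature disabled, the Error implementations on this crate's error types are not available.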

License

focaccia is licensed under the MIT License (c) Ryan Lopopolo.