evfmt is both a command-line formatter and a Rust library for
normalizing text/emoji variation selectors.
Most callers will want format_text together with Policy.
§Stability
This library API is experimental. evfmt follows
Cargo’s SemVer compatibility conventions.
§Examples
Use format_text for whole-input canonicalization under one Policy.
In the example below, #\u{FE0E} is NUMBER SIGN followed by VS15, and
\u{00A9} is a bare COPYRIGHT SIGN. Under the default policy,
#\u{FE0E} loses the redundant variation selector, while bare \u{00A9} is
canonicalized to \u{00A9}\u{FE0F} because it is text-default.
use evfmt::{FormatResult, Policy, format_text};
let policy = Policy::default();
assert_eq!(
format_text("#\u{FE0E}", &policy),
FormatResult::Changed("#".to_owned())
);
assert_eq!(
format_text("\u{00A9}", &policy),
FormatResult::Changed("\u{00A9}\u{FE0F}".to_owned())
);
assert_eq!(format_text("\u{2728}", &policy), FormatResult::Unchanged);

For interactive repair or editor integrations, scan the input and then work
item-by-item. In the next example, A\u{FE0F} contains an unsanctioned
presentation selector after A, and the caller chooses to apply the
formatter’s fixed repair.
For the built-in evfmt decisions, callers can build repaired output from
the original scanned items without rescanning after each replacement choice.
Walk the original items in order, keeping item.raw for unchanged items and
substituting the selected replacement for findings.
use evfmt::{Policy, ScanKind, scan};
use evfmt::findings::{Violation, analyze_scan_item};
let policy = Policy::default();
let input = "A\u{FE0F}";
let mut items = scan(input);
assert!(matches!(items.next().unwrap().kind, ScanKind::Passthrough));
let item = items.next().unwrap();
assert!(matches!(
item.kind,
ScanKind::UnsanctionedPresentationSelectors(_)
));
let finding = analyze_scan_item(&item, &policy).unwrap();
assert_eq!(finding.violation(), Violation::UnsanctionedSelectorsOnly);
assert!(finding.decision_slots().is_empty());
let repaired = finding.replacement(&[]).unwrap();
assert_eq!(repaired, "");

The findings API is the usual entry point for interactive fixing.
It analyzes scanned items under the supplied Policy and returns the
presentation slots that must be chosen for each finding. Fixed repairs have
no slots and use the empty decision vector.
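The single-pass rebuild described above (keep the raw slice for unchanged items, substitute the chosen replacement for findings) can be sketched without depending on evfmt at all. The `SketchItem` type and the `rebuild` and `demo` functions below are hypothetical stand-ins invented for illustration; they are not part of the evfmt API, and a real caller would obtain the raw slices from `scan` and the replacements from the findings API instead.

```rust
/// Hypothetical stand-in for a scanned item (NOT an evfmt type): the raw
/// slice of the input plus the replacement chosen for it, if any.
struct SketchItem<'a> {
    raw: &'a str,
    replacement: Option<String>,
}

/// Rebuild the output in one pass over the original items, in order.
/// Items without a replacement contribute their raw text unchanged.
fn rebuild(items: &[SketchItem<'_>]) -> String {
    let mut out = String::new();
    for item in items {
        match &item.replacement {
            Some(fixed) => out.push_str(fixed),
            None => out.push_str(item.raw),
        }
    }
    out
}

/// Mirrors the `A\u{FE0F}` example: the passthrough `A` is kept, and the
/// stray selector is replaced by the empty string.
fn demo() -> String {
    let items = [
        SketchItem { raw: "A", replacement: None },
        SketchItem { raw: "\u{FE0F}", replacement: Some(String::new()) },
    ];
    rebuild(&items)
}
```

Because every replacement is applied against the original item boundaries, no choice invalidates the positions of later items, which is why no rescan is needed in this sketch.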
Custom policies can be built from the sets defined in the variation_set module. In this example,
rights-marks contains \u{00A9}, so bare COPYRIGHT SIGN is allowed to
remain bare.
use evfmt::{Policy, format_text, variation_set};
let ascii_and_rights_marks = variation_set::ASCII | variation_set::RIGHTS_MARKS;
let policy = Policy::default()
.with_prefer_bare(ascii_and_rights_marks)
.with_bare_as_text(ascii_and_rights_marks);
let formatted = format_text("\u{00A9}", &policy);
assert_eq!(formatted, evfmt::FormatResult::Unchanged);

Here “variation-sequence character” means a character listed in Unicode’s
emoji-variation-sequences.txt.
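To make the terminology concrete, here is a minimal std-only sketch that classifies how a variation-sequence character appears in a string: bare, followed by VS15 (U+FE0E, text presentation), or followed by VS16 (U+FE0F, emoji presentation). The `SAMPLE_BASES` table, the `Presentation` enum, and `classify` are assumptions made for illustration only. The table holds three characters from emoji-variation-sequences.txt, not the full list, and none of this reflects how evfmt's scanner is actually implemented.

```rust
const VS15: char = '\u{FE0E}'; // text presentation selector
const VS16: char = '\u{FE0F}'; // emoji presentation selector

/// Tiny illustrative subset of the bases listed in
/// emoji-variation-sequences.txt: NUMBER SIGN, COPYRIGHT SIGN,
/// REGISTERED SIGN. The real file lists many more.
const SAMPLE_BASES: [char; 3] = ['#', '\u{00A9}', '\u{00AE}'];

#[derive(Debug, PartialEq)]
enum Presentation {
    Bare,
    Text,
    Emoji,
}

/// Report each sample base found in `input` and the selector (if any)
/// that immediately follows it. Other characters are skipped.
fn classify(input: &str) -> Vec<(char, Presentation)> {
    let mut out = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        if !SAMPLE_BASES.contains(&c) {
            continue;
        }
        let presentation = match chars.peek().copied() {
            Some(VS15) => {
                chars.next(); // consume the selector
                Presentation::Text
            }
            Some(VS16) => {
                chars.next(); // consume the selector
                Presentation::Emoji
            }
            _ => Presentation::Bare,
        };
        out.push((c, presentation));
    }
    out
}
```

For the input `"#\u{FE0E} \u{00A9}"`, this reports NUMBER SIGN with text presentation and COPYRIGHT SIGN as bare, matching the two inputs used in the `format_text` example.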
Public module boundaries:
- the crate root is the high-level API: whole-input formatting and convenience analysis helpers
- policy defines formatter policy configuration
- formatter owns whole-text formatting
- findings analyzes scanned items under policy and reports violations plus available replacements
- scanner owns structural tokenization into singletons, keycaps, ZWJ chains, standalone variation selector runs, and passthrough slices
- variation_set defines the typed variation-set model used by the library policy API
Re-exports§
pub use findings::DecisionSlot;
pub use findings::Finding;
pub use findings::PrimaryViolation;
pub use findings::PrimaryViolationKind;
pub use findings::ReplacementDecision;
pub use findings::Violation;
pub use findings::analyze_scan_item;
pub use formatter::FormatResult;
pub use formatter::format_text;
pub use policy::Policy;
pub use scanner::ScanItem;
pub use scanner::ScanKind;
pub use scanner::Scanner;
pub use scanner::scan;
pub use variation_set::VariationSet;
Modules§
- findings
- Policy-aware findings for scanned emoji variation structures.
- formatter
- Core formatting engine.
- policy
- Formatter policy configuration.
- scanner
- Sequence-aware scanner for text/emoji variation sequences.
- variation_set
- Finite variation-position sets for formatter policy.