pub struct ComposingNormalizer { /* private fields */ }

A normalizer for performing composing normalization.

Implementations

impl ComposingNormalizer

pub fn try_new_nfc_unstable<D>(data_provider: &D) -> Result<Self, NormalizerError>
where
    D: DataProvider<CanonicalDecompositionDataV1Marker>
        + DataProvider<CanonicalDecompositionTablesV1Marker>
        + DataProvider<CanonicalCompositionsV1Marker>
        + ?Sized,

NFC constructor.

📚 Help choosing a constructor

⚠️ The bounds on this function may change over time, including in SemVer minor releases.

pub fn try_new_nfc_with_any_provider(provider: &(impl AnyProvider + ?Sized)) -> Result<Self, NormalizerError>

Creates a new instance using an AnyProvider.

For details on the behavior of this function, see: Self::try_new_nfc_unstable

📚 Help choosing a constructor

pub fn try_new_nfc_with_buffer_provider(provider: &(impl BufferProvider + ?Sized)) -> Result<Self, NormalizerError>

Enabled with the "serde" feature.

Creates a new instance using a BufferProvider.

For details on the behavior of this function, see: Self::try_new_nfc_unstable

📚 Help choosing a constructor

pub fn try_new_nfkc_unstable<D>(data_provider: &D) -> Result<Self, NormalizerError>
where
    D: DataProvider<CanonicalDecompositionDataV1Marker>
        + DataProvider<CompatibilityDecompositionSupplementV1Marker>
        + DataProvider<CanonicalDecompositionTablesV1Marker>
        + DataProvider<CompatibilityDecompositionTablesV1Marker>
        + DataProvider<CanonicalCompositionsV1Marker>
        + ?Sized,

NFKC constructor.

📚 Help choosing a constructor

⚠️ The bounds on this function may change over time, including in SemVer minor releases.

pub fn try_new_nfkc_with_any_provider(provider: &(impl AnyProvider + ?Sized)) -> Result<Self, NormalizerError>

Creates a new instance using an AnyProvider.

For details on the behavior of this function, see: Self::try_new_nfkc_unstable

📚 Help choosing a constructor

pub fn try_new_nfkc_with_buffer_provider(provider: &(impl BufferProvider + ?Sized)) -> Result<Self, NormalizerError>

Enabled with the "serde" feature.

Creates a new instance using a BufferProvider.

For details on the behavior of this function, see: Self::try_new_nfkc_unstable

📚 Help choosing a constructor

pub fn try_new_uts46_without_ignored_and_disallowed_unstable<D>(data_provider: &D) -> Result<Self, NormalizerError>
where
    D: DataProvider<CanonicalDecompositionDataV1Marker>
        + DataProvider<Uts46DecompositionSupplementV1Marker>
        + DataProvider<CanonicalDecompositionTablesV1Marker>
        + DataProvider<CompatibilityDecompositionTablesV1Marker>
        + DataProvider<CanonicalCompositionsV1Marker>
        + ?Sized,

🚧 [Experimental] UTS 46 constructor

This is a special building block normalization for IDNA that implements parts of the Map step and the following Normalize step. The caller is responsible for performing the “disallowed”, “ignored”, and “deviation” parts of the Map step before passing data to this normalizer such that disallowed and ignored characters aren’t passed to this normalizer.

This is ICU4C’s UTS 46 normalization with two exceptions. Characters that UTS 46 disallows (and that ICU4C maps to U+FFFD) and characters that UTS 46 maps to the empty string are both normalized as in NFC here. Treating the disallowed characters this way reduces data size. The empty-string case is excluded because this normalizer implementation cannot handle a character normalizing to the empty string, which doesn’t happen in NFC or NFKC as of Unicode 14.

Warning: In this normalization, U+0345 COMBINING GREEK YPOGEGRAMMENI exhibits a behavior that no character in Unicode exhibits in NFD, NFKD, NFC, or NFKC: case folding turns U+0345 from a reordered character into a non-reordered character before reordering happens. Therefore, the output of this normalization may differ for inputs that are canonically equivalent to each other if they differ in how U+0345 is ordered relative to other reorderable characters.

NOTE: This method remains experimental until suitability of this feature as part of IDNA processing has been demonstrated.

🚧 This code is experimental; it may change at any time, in breaking or non-breaking ways, including in SemVer minor releases. It can be enabled with the "experimental" Cargo feature of the icu meta-crate. Use with caution. #2614

pub fn normalize_iter<I: Iterator<Item = char>>(&self, iter: I) -> Composition<'_, I>

Wraps a delegate iterator into a composing iterator adapter by using the data already held by this normalizer.
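
The adapter shape described above (an iterator that wraps a delegate and borrows data held by the normalizer) can be sketched with a toy example. `ToyComposer`, `ToyComposition`, `compose_iter`, and the two-entry table are hypothetical stand-ins and not part of this crate; the real adapter applies the full Unicode composition algorithm, not a simple pairwise lookup.

```rust
use std::iter::Peekable;

/// Hypothetical stand-in for a normalizer: owns a tiny composition table.
/// The real type loads Unicode data from a data provider instead.
pub struct ToyComposer {
    pairs: Vec<((char, char), char)>,
}

/// Hypothetical adapter that borrows the table from `ToyComposer`,
/// mirroring the borrowed-data shape of `Composition<'_, I>`.
pub struct ToyComposition<'a, I: Iterator<Item = char>> {
    table: &'a [((char, char), char)],
    inner: Peekable<I>,
}

impl ToyComposer {
    pub fn new() -> Self {
        ToyComposer {
            pairs: vec![
                (('A', '\u{030A}'), '\u{00C5}'), // A + COMBINING RING ABOVE
                (('e', '\u{0301}'), '\u{00E9}'), // e + COMBINING ACUTE ACCENT
            ],
        }
    }

    /// Wraps `iter` in an adapter that uses the data already held by `self`.
    pub fn compose_iter<I: Iterator<Item = char>>(&self, iter: I) -> ToyComposition<'_, I> {
        ToyComposition {
            table: &self.pairs,
            inner: iter.peekable(),
        }
    }
}

impl<'a, I: Iterator<Item = char>> Iterator for ToyComposition<'a, I> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        let first = self.inner.next()?;
        // Look one character ahead; if the pair is in the table, emit the
        // composed character and consume the second half of the pair.
        if let Some(&second) = self.inner.peek() {
            if let Some(&(_, composed)) =
                self.table.iter().find(|(pair, _)| *pair == (first, second))
            {
                self.inner.next();
                return Some(composed);
            }
        }
        Some(first)
    }
}
```

Because the adapter only borrows from the composer, several adapters can be created from one composer without copying its data.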

pub fn normalize(&self, text: &str) -> String

Normalize a string slice into a String.

pub fn is_normalized(&self, text: &str) -> bool

Check whether a string slice is normalized.

pub fn normalize_utf16(&self, text: &[u16]) -> Vec<u16>

Normalize a slice of potentially-invalid UTF-16 into a Vec.

Unpaired surrogates are mapped to the REPLACEMENT CHARACTER before normalizing.
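
The surrogate handling described above can be illustrated with the standard library alone. This is a minimal sketch of the replacement step only, not of normalization, and `replace_unpaired_surrogates` is a hypothetical helper name that does not exist in this crate.

```rust
/// Hypothetical helper: returns a copy of `units` in which every unpaired
/// surrogate is replaced by U+FFFD REPLACEMENT CHARACTER. Valid surrogate
/// pairs and all other code units are kept unchanged.
pub fn replace_unpaired_surrogates(units: &[u16]) -> Vec<u16> {
    let mut out = Vec::with_capacity(units.len());
    let mut buf = [0u16; 2];
    for decoded in char::decode_utf16(units.iter().copied()) {
        // `decode_utf16` yields `Err` for each unpaired surrogate.
        let c = decoded.unwrap_or(char::REPLACEMENT_CHARACTER);
        out.extend_from_slice(c.encode_utf16(&mut buf));
    }
    out
}
```

For the input `[0x0041, 0xD800, 0x0042]`, the lone high surrogate `0xD800` becomes `0xFFFD` and the other two units pass through.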

pub fn is_normalized_utf16(&self, text: &[u16]) -> bool

Checks whether a slice of potentially-invalid UTF-16 is normalized.

Unpaired surrogates are treated as the REPLACEMENT CHARACTER.

pub fn normalize_utf8(&self, text: &[u8]) -> String

Normalize a slice of potentially-invalid UTF-8 into a String.

Ill-formed byte sequences are mapped to the REPLACEMENT CHARACTER according to the WHATWG Encoding Standard.
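
To see what replacing ill-formed bytes looks like in practice, the sketch below uses the standard library's lossy conversion. It shows the replacement step only, not normalization. `replace_ill_formed_utf8` is a hypothetical helper name, and it is an assumption that `String::from_utf8_lossy` agrees with this normalizer's handling in every edge case.

```rust
/// Hypothetical helper: decodes `bytes` as UTF-8, substituting
/// U+FFFD REPLACEMENT CHARACTER for ill-formed sequences.
/// Assumption: the standard library's lossy decoding is close enough to
/// the behavior described above for illustration purposes.
pub fn replace_ill_formed_utf8(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

The byte `0xFF` can never appear in well-formed UTF-8, so `b"a\xFFb"` decodes to `a`, U+FFFD, `b`.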

pub fn is_normalized_utf8(&self, text: &[u8]) -> bool

Check if a slice of potentially-invalid UTF-8 is normalized.

Ill-formed byte sequences are mapped to the REPLACEMENT CHARACTER according to the WHATWG Encoding Standard before checking.

pub fn normalize_to<W: Write + ?Sized>(&self, text: &str, sink: &mut W) -> Result

Normalize a string slice into a Write sink.
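
A sink here is any type implementing `core::fmt::Write`, so the output need not be stored in a `String`. The sketch below shows a custom sink; `ByteCounter` and `emit_pieces` are hypothetical names, and `emit_pieces` merely stands in for a producer that writes its output piecewise, as a normalizer might.

```rust
use std::fmt::{self, Write};

/// Hypothetical sink that counts UTF-8 bytes instead of storing text.
pub struct ByteCounter {
    pub bytes: usize,
}

impl Write for ByteCounter {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.bytes += s.len();
        Ok(())
    }
}

/// Hypothetical producer: writes each piece to the sink in order and
/// stops at the first error reported by the sink.
pub fn emit_pieces<W: Write + ?Sized>(pieces: &[&str], sink: &mut W) -> fmt::Result {
    for piece in pieces {
        sink.write_str(piece)?;
    }
    Ok(())
}
```

Since `String` also implements `fmt::Write`, the same producer can append to an existing buffer, which avoids allocating a fresh `String` per call.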

pub fn normalize_utf8_to<W: Write + ?Sized>(&self, text: &[u8], sink: &mut W) -> Result

Normalize a slice of potentially-invalid UTF-8 into a Write sink.

Ill-formed byte sequences are mapped to the REPLACEMENT CHARACTER according to the WHATWG Encoding Standard.

pub fn normalize_utf16_to<W: Write16 + ?Sized>(&self, text: &[u16], sink: &mut W) -> Result

Normalize a slice of potentially-invalid UTF-16 into a Write16 sink.

Unpaired surrogates are mapped to the REPLACEMENT CHARACTER before normalizing.

Trait Implementations

impl Debug for ComposingNormalizer

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T where T: ?Sized

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T where T: ?Sized

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T where U: From<T>

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T where U: Into<T>

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T where U: TryFrom<T>

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> ErasedDestructor for T where T: 'static

impl<T> MaybeSendSync for T