Struct TransformOptions

#[non_exhaustive]
pub struct TransformOptions {
    pub unassigned_codepoint_handling: UnassignedCodepointHandling,
    pub ignore: bool,
    pub case_fold: bool,
    pub grapheme_boundary_markers: bool,
    pub compat: bool,
    pub composition: Option<CompositionOptions>,
    pub lump: bool,
    pub nlf_conversion: Option<NlfConversionMode>,
    pub strip_control_codes: bool,
    pub stable: bool,
}

Options for the map, decompose_buffer, and decompose_char functions.

Used to flexibly support multiple transformations through a single interface.

Some options are specific to composition/decomposition, and are stored in CompositionOptions.

§Limitation

Certain options are supported only in the advanced interface, because they can produce invalid UTF-8.

This currently includes the grapheme_boundary_markers option, and unassigned_codepoint_handling set to UnassignedCodepointHandling::Allow.
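
The rule above can be written as a small predicate. This is only a sketch: the types below are local stand-ins that model just the two fields and two variants named in the text, and needs_advanced_interface is a hypothetical helper, not a function of this crate.

```rust
// Local stand-ins for the crate's types (assumption: only the parts
// mentioned in the limitation are modelled here).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
pub enum UnassignedCodepointHandling {
    #[default]
    Forbid,
    Allow,
}

#[derive(Clone, Debug, Default)]
pub struct TransformOptions {
    pub unassigned_codepoint_handling: UnassignedCodepointHandling,
    pub grapheme_boundary_markers: bool,
}

/// Hypothetical helper (not part of the crate): returns true when the
/// options may produce invalid UTF-8 and so need the advanced interface.
pub fn needs_advanced_interface(opts: &TransformOptions) -> bool {
    opts.grapheme_boundary_markers
        || opts.unassigned_codepoint_handling == UnassignedCodepointHandling::Allow
}
```

A check like this could run before calling the simple interface, so that unsupported combinations are rejected up front.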

Fields (Non-exhaustive)§

This struct is marked as non-exhaustive
Non-exhaustive structs could have additional fields added in future. Therefore, non-exhaustive structs cannot be constructed in external crates using the traditional Struct { .. } syntax; cannot be matched against without a wildcard ..; and struct update syntax will not work.
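
In practice this means starting from the Default implementation and assigning the public fields one by one. The sketch below uses a local stand-in with a subset of the documented fields so that it compiles on its own. That the bool fields default to false is an assumption of the stand-in, and comparison_options is a hypothetical helper.

```rust
// Local stand-in with a subset of the documented fields. In the real crate
// the struct is #[non_exhaustive], which is what forces this pattern.
#[derive(Clone, Debug, Default)]
pub struct TransformOptions {
    pub ignore: bool,
    pub case_fold: bool,
    pub lump: bool,
    pub strip_control_codes: bool,
}

/// Hypothetical helper: options for building a case-insensitive comparison key.
pub fn comparison_options() -> TransformOptions {
    // From another crate, both `TransformOptions { .. }` literals and
    // `..Default::default()` update syntax are rejected for a non-exhaustive
    // struct, so start from the default value and assign fields instead.
    let mut opts = TransformOptions::default();
    opts.case_fold = true;
    opts.ignore = true;
    opts
}
```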
§unassigned_codepoint_handling: UnassignedCodepointHandling

Specify how to handle unassigned codepoints.

By default, this is set to UnassignedCodepointHandling::Forbid.

§ignore: bool

Strip “default ignorable characters” such as SOFT HYPHEN (U+00AD) or ZERO WIDTH SPACE (U+200B).

This is equivalent to the UTF8PROC_IGNORE option in the C library.

§case_fold: bool

Apply Unicode case-folding, to be able to do a case-insensitive string comparison.

This is equivalent to the UTF8PROC_CASEFOLD option in the C library.

§grapheme_boundary_markers: bool

Insert a marker value at the beginning of each sequence that represents a single grapheme cluster (see UAX #29).

This is usable only in the advanced interface, because it produces invalid UTF-8 or invalid codepoints. Using this option in the simple interface will panic.

The same functionality is also available through the crate::grapheme module.

This is equivalent to the UTF8PROC_CHARBOUND option in the C library.

§compat: bool

Replace certain characters with their compatibility decomposition.

This is used to implement NFKD and NFKC Unicode normalization.

This is equivalent to the UTF8PROC_COMPAT option in the C library.

§composition: Option<CompositionOptions>

If not None, enables composition or decomposition of characters.

Use CompositionOptions::compose and CompositionOptions::decompose for default compose/decompose options.

Equivalent to either UTF8PROC_COMPOSE or UTF8PROC_DECOMPOSE in the C library, depending on the CompositionDirection.
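
As an illustration of how compat and composition combine, the sketch below builds NFKC-like and NFKD-like option sets. All types are local stand-ins. The no-argument signatures of CompositionOptions::compose and CompositionOptions::decompose and the direction field are assumptions, and the helper functions are hypothetical. The stable flag is set because the C library's own NFKC/NFKD helpers also pass UTF8PROC_STABLE.

```rust
// Local stand-ins; the real CompositionOptions may carry more settings.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CompositionDirection {
    Compose,
    Decompose,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct CompositionOptions {
    pub direction: CompositionDirection, // assumed field name
}

impl CompositionOptions {
    // Assumed signatures for the constructors named in the text.
    pub fn compose() -> Self {
        Self { direction: CompositionDirection::Compose }
    }
    pub fn decompose() -> Self {
        Self { direction: CompositionDirection::Decompose }
    }
}

#[derive(Clone, Debug, Default)]
pub struct TransformOptions {
    pub compat: bool,
    pub composition: Option<CompositionOptions>,
    pub stable: bool,
}

/// Hypothetical helper: compatibility normalization in the given direction.
fn compat_normalization(composition: CompositionOptions) -> TransformOptions {
    let mut opts = TransformOptions::default();
    opts.compat = true; // compatibility decomposition (the "K" in NFKC/NFKD)
    opts.stable = true; // mirrors UTF8PROC_STABLE in the C helpers
    opts.composition = Some(composition);
    opts
}

/// Hypothetical helper: options corresponding to NFKC.
pub fn nfkc_like() -> TransformOptions {
    compat_normalization(CompositionOptions::compose())
}

/// Hypothetical helper: options corresponding to NFKD.
pub fn nfkd_like() -> TransformOptions {
    compat_normalization(CompositionOptions::decompose())
}
```

Leaving compat at false with the same composition values would correspond to NFC and NFD instead.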

§lump: bool

Lump certain characters together.

For example, HYPHEN (U+2010) and MINUS SIGN (U+2212) are converted to ASCII “-”. The full mapping is documented in lump.md in the utf8proc repository (link valid as of v2.10.0).

If the nlf_conversion option is set, this includes a transformation of paragraph and line separators to ASCII line-feed (LF).
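
To make the effect concrete, here is a deliberately incomplete sketch of lumping that covers only the mappings named above. lump_char is a hypothetical function, not the crate's implementation. The real table in lump.md is much larger.

```rust
/// Simplified stand-in for the lump step. Only the characters mentioned in
/// the text are handled; everything else is passed through unchanged.
pub fn lump_char(c: char, nlf_conversion_set: bool) -> char {
    match c {
        // HYPHEN and MINUS SIGN become ASCII hyphen-minus.
        '\u{2010}' | '\u{2212}' => '-',
        // LINE SEPARATOR and PARAGRAPH SEPARATOR become LF, but only when
        // an NLF conversion is also requested.
        '\u{2028}' | '\u{2029}' if nlf_conversion_set => '\n',
        other => other,
    }
}
```

Applied per character, for example with "a\u{2010}b".chars().map(|c| lump_char(c, false)), this yields the string "a-b".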

§nlf_conversion: Option<NlfConversionMode>

Customize the conversion of NLF-sequences (LF, CRLF, CR, NEL).

If this is None, no conversions are applied. When set, it also determines what the strip_control_codes option replaces NLF-sequences with.

§strip_control_codes: bool

Strip and/or convert control characters.

NLF-sequences are transformed into spaces, unless the nlf_conversion option is specified. HorizontalTab (HT) and FormFeed (FF) are treated as an NLF-sequence in this case. All other control characters are simply removed.
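
The rules above can be sketched in plain Rust. This follows the description in the text rather than the C implementation, so details may differ. The variant names of NlfConversionMode are assumptions, and strip_control_codes here is a hypothetical free function, not the crate's API.

```rust
// Local stand-in; the variant names are assumed, not taken from the crate.
#[derive(Clone, Copy, Debug)]
pub enum NlfConversionMode {
    LineFeed,
    LineSeparator,
    ParagraphSeparator,
}

/// Simplified stand-in for strip_control_codes combined with nlf_conversion.
pub fn strip_control_codes(input: &str, nlf: Option<NlfConversionMode>) -> String {
    // What every NLF-sequence (and HT/FF) is replaced with.
    let replacement = match nlf {
        None => ' ',
        Some(NlfConversionMode::LineFeed) => '\n',
        Some(NlfConversionMode::LineSeparator) => '\u{2028}',
        Some(NlfConversionMode::ParagraphSeparator) => '\u{2029}',
    };
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            '\r' => {
                // CRLF counts as a single NLF-sequence.
                if chars.peek() == Some(&'\n') {
                    chars.next();
                }
                out.push(replacement);
            }
            // LF, NEL, and (per the text) HT and FF.
            '\n' | '\u{0085}' | '\t' | '\u{000C}' => out.push(replacement),
            // All other control characters are dropped.
            c if c.is_control() => {}
            c => out.push(c),
        }
    }
    out
}
```

With no conversion mode, the input "a\r\nb\tc\u{7}d" becomes "a b cd": the CRLF and the tab each turn into one space and the BEL character is removed.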

§stable: bool

Prohibit combining characters that would violate Unicode versioning stability.

Trait Implementations§

impl Clone for TransformOptions

fn clone(&self) -> TransformOptions

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for TransformOptions

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for TransformOptions

fn default() -> TransformOptions

Returns the “default value” for a type.
