pub struct UConverter(/* private fields */);

The converter type that provides conversion to/from UTF-16.

A converter can convert whole strings in a single call (using UConverter::convert_to_uchars and UConverter::convert_from_uchars) or process a stream of text incrementally (using UConverter::feed_to_uchars and UConverter::feed_from_uchars, called repeatedly as more input data arrives).

Each conversion direction keeps its own state, so a single converter can interleave feed_to_uchars and feed_from_uchars calls to process two streams at once on the same thread. Note that this type isn’t Sync.

§Examples

§Single-string conversion

The single-string conversion functions are straightforward to use.

// Assumption: UConverter is exported by the rust_icu_ucnv crate.
use rust_icu_ucnv::UConverter;

let mut converter = UConverter::open("UTF-8").unwrap();

let utf8_string = "スーパー";
let utf16_string: Vec<u16> = converter.convert_to_uchars(utf8_string.as_bytes()).unwrap();

assert_eq!(
    utf8_string,
    std::str::from_utf8(&converter.convert_from_uchars(&utf16_string).unwrap()).unwrap()
);
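As a cross-check that needs no ICU installation, the same UTF-8 to UTF-16 round trip can be sketched with the standard library alone. The helper names `to_utf16` and `from_utf16` below are hypothetical and are not part of this crate; the sketch only illustrates what the `Vec<u16>` in the example above is expected to hold (four code units for "スーパー", since every character is in the Basic Multilingual Plane).

```rust
/// Encodes a `&str` as UTF-16 code units using only the standard library.
/// (Hypothetical helper, not part of the converter API.)
fn to_utf16(s: &str) -> Vec<u16> {
    s.encode_utf16().collect()
}

/// Decodes UTF-16 code units back into a `String`.
/// Returns `None` if the input contains an unpaired surrogate.
fn from_utf16(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}
```

Comparing the output of `convert_to_uchars` on UTF-8 input against `to_utf16` is one simple way to sanity-check a converter setup in a test.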

§Streaming conversion

The feeding/streaming functions take a mutable slice as the destination buffer and an immutable slice as the source buffer. They consume the source buffer and write the converted text into the destination buffer until one of the two has been fully consumed or a conversion error occurs. They return the error (if any) together with how much of the destination and source buffers was consumed. The idea is that once one of the buffers has been fully consumed, you grab another buffer chunk (source or destination) and call the function again. A processing loop might therefore look like this:

use rust_icu_common as common;
use rust_icu_sys as sys;
// Assumption: UConverter is exported by the rust_icu_ucnv crate.
use rust_icu_ucnv::UConverter;

let mut converter = UConverter::open("UTF-8").unwrap();

// get_dst_chunk and get_src_chunk are placeholders for the caller's own buffer sources.
let mut dst: &mut [u16] = get_dst_chunk().unwrap();
let mut src: &[u8] = get_src_chunk().unwrap();

// reset any previous state
converter.reset_to_uchars();
loop {
    let res = converter.feed_to_uchars(dst, src);
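    match res.result {
        // A full destination buffer is not fatal: advance both slices and keep going.
        Ok(_) | Err(common::Error::Sys(sys::UErrorCode::U_BUFFER_OVERFLOW_ERROR)) => {
            dst = dst.split_at_mut(res.dst_consumed).1;
            src = src.split_at(res.src_consumed).1;
        }
        // Any other error aborts the conversion.
        _ => panic!("conversion error"),
    }

    // Swap in a fresh destination chunk once the current one is full.
    if dst.is_empty() {
        dst = get_dst_chunk().unwrap();
    }
    // Fetch the next source chunk, or stop when the input is exhausted.
    if src.is_empty() {
        src = match get_src_chunk() {
            None => break,
            Some(src) => src,
        };
    }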
}
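To make the buffer contract concrete without depending on ICU, here is a minimal std-only sketch. `ToyFeed`, `toy_feed_to_uchars` and `decode_chunks` are hypothetical stand-ins, not this crate's API: the toy decodes ISO-8859-1, where every source byte maps to exactly one UTF-16 code unit, so no state has to be carried between calls. It only models the "consume until one buffer is exhausted, then call again" pattern described above.

```rust
/// Toy stand-in for `FeedResult`: how much of each buffer was used.
struct ToyFeed {
    dst_consumed: usize,
    src_consumed: usize,
    /// True when the destination filled up before the source ran out.
    overflow: bool,
}

/// Toy stand-in for `feed_to_uchars`, decoding ISO-8859-1
/// (each byte value is the code point, hence one UTF-16 code unit).
fn toy_feed_to_uchars(dst: &mut [u16], src: &[u8]) -> ToyFeed {
    let n = dst.len().min(src.len());
    // `zip` stops at the shorter of the two buffers, i.e. after `n` items.
    for (d, s) in dst.iter_mut().zip(src) {
        *d = u16::from(*s);
    }
    ToyFeed {
        dst_consumed: n,
        src_consumed: n,
        overflow: src.len() > n,
    }
}

/// Pushes every source chunk through one small, reused destination buffer
/// and collects the converted code units.
fn decode_chunks(chunks: &[&[u8]], dst_len: usize) -> Vec<u16> {
    assert!(dst_len > 0, "destination buffer must not be empty");
    let mut out = Vec::new();
    let mut buf = vec![0u16; dst_len];
    for &chunk in chunks {
        let mut src = chunk;
        while !src.is_empty() {
            let res = toy_feed_to_uchars(&mut buf, src);
            // Drain what was written, then advance the source slice.
            out.extend_from_slice(&buf[..res.dst_consumed]);
            src = &src[res.src_consumed..];
            // Overflow simply means there is source left for another call.
            debug_assert_eq!(res.overflow, !src.is_empty());
        }
    }
    out
}
```

Unlike the loop above, this sketch drains and reuses a single destination buffer instead of fetching a new destination chunk; either approach fits the same contract. A real multi-byte encoding additionally needs the converter's internal state to handle sequences split across chunks.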

Implementations§

impl UConverter

pub fn open(name: &str) -> Result<Self, Error>

Attempts to open a converter with the given encoding name.

This function wraps around ucnv_open.

pub fn try_clone(&self) -> Result<Self, Error>

Attempts to clone a given converter.

This function wraps around ucnv_safeClone.

pub fn has_ambiguous_mappings(&self) -> bool

Determines whether the converter contains ambiguous mappings of the same character.

This function wraps around ucnv_isAmbiguous.

pub fn name(&self) -> Result<&str, Error>

Attempts to get the canonical name of the converter.

This function wraps around ucnv_getName.

pub fn reset(&mut self)

Resets the converter to a default state.

This is equivalent to calling both UConverter::reset_to_uchars and UConverter::reset_from_uchars.

This function wraps around ucnv_reset.

pub fn reset_to_uchars(&mut self)

Resets the *_to_uchars part of the converter to a default state.

It is necessary to call this function when you want to start processing a new data stream using UConverter::feed_to_uchars.

This function wraps around ucnv_resetToUnicode.
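Why a reset matters can be illustrated with a toy stateful decoder built on the standard library. This is not how the ICU converter behaves internally, and `ToyDecoder` is a hypothetical type introduced only for illustration; the sketch assumes that a streaming decoder may carry an incomplete trailing byte sequence from one call to the next, so leftovers from an old stream can interfere with a new one unless they are cleared.

```rust
/// Toy stateful UTF-8 decoder (NOT the ICU converter): it buffers an
/// incomplete trailing sequence between calls.
#[derive(Default)]
struct ToyDecoder {
    pending: Vec<u8>,
}

impl ToyDecoder {
    /// Decodes the longest valid UTF-8 prefix of `pending + src`
    /// and keeps whatever follows it for the next call.
    fn feed(&mut self, src: &[u8]) -> String {
        self.pending.extend_from_slice(src);
        let valid = match std::str::from_utf8(&self.pending) {
            Ok(s) => s.len(),
            Err(e) => e.valid_up_to(),
        };
        let out = String::from_utf8_lossy(&self.pending[..valid]).into_owned();
        self.pending.drain(..valid);
        out
    }

    /// Drops any carried-over bytes, analogous in spirit to `reset_to_uchars`.
    fn reset(&mut self) {
        self.pending.clear();
    }
}
```

In this toy, feeding the first two bytes of "ス" and then starting a new stream with "abc" yields nothing, because the stale bytes make the buffered input invalid; calling `reset` first lets "abc" decode normally.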

pub fn reset_from_uchars(&mut self)

Resets the *_from_uchars part of the converter to a default state.

It is necessary to call this function when you want to start processing a new data stream using UConverter::feed_from_uchars.

This function wraps around ucnv_resetFromUnicode.

pub fn feed_to_uchars(&mut self, dst: &mut [UChar], src: &[u8]) -> FeedResult

Feeds more encoded data to be decoded to UTF-16 and put in the provided destination buffer.

Make sure to call UConverter::reset_to_uchars before processing a new data stream.

This function wraps around ucnv_toUnicode.

pub fn feed_from_uchars(&mut self, dst: &mut [u8], src: &[UChar]) -> FeedResult

Feeds more UTF-16 to be encoded and put in the provided destination buffer.

Make sure to call UConverter::reset_from_uchars before processing a new data stream.

This function wraps around ucnv_fromUnicode.

pub fn convert_to_uchars(&mut self, src: &[u8]) -> Result<Vec<UChar>, Error>

Performs single-string conversion from an encoded string to a UTF-16 string.

Note that this function resets the *_to_uchars state before conversion.

pub fn convert_from_uchars(&mut self, src: &[UChar]) -> Result<Vec<u8>, Error>

Performs single-string conversion from a UTF-16 string to an encoded string.

Note that this function resets the *_from_uchars state before conversion.

Trait Implementations§

impl Debug for UConverter

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Send for UConverter

Auto Trait Implementations§

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.