Struct rust_icu_ucnv::UConverter
pub struct UConverter(/* private fields */);
The converter type that provides conversion to/from UTF-16.
This object can perform conversion of single strings (using the UConverter::convert_to_uchars and UConverter::convert_from_uchars functions) or of a stream of text data (using the UConverter::feed_to_uchars and UConverter::feed_from_uchars functions to feed and process more input data).
Each conversion direction has separate state, which means you can use the feed_to_uchars function at the same time as the feed_from_uchars function to process two streams simultaneously in the same thread (note that this type isn’t Sync).
§Examples
§Single-string conversion
The single-string conversion functions are straightforward to use.
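use rust_icu_ucnv::UConverter;

let mut converter = UConverter::open("UTF-8").unwrap();
let utf8_string = "スーパー";
// Decode the UTF-8 bytes into UTF-16 code units.
let utf16_string: Vec<u16> = converter.convert_to_uchars(utf8_string.as_bytes()).unwrap();
// Encoding the UTF-16 back with the same converter yields the original text.
assert_eq!(
    utf8_string,
    std::str::from_utf8(&converter.convert_from_uchars(&utf16_string).unwrap()).unwrap()
);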
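As a sanity check for the round trip above, the following sketch uses only the Rust standard library (not rust_icu) to show the UTF-16 code units one would expect for the same input. The helper names to_utf16 and from_utf16 are hypothetical and exist only for this illustration.

```rust
/// Encodes `text` as UTF-16 code units using only the standard library.
/// (Hypothetical helper; it does not call into ICU.)
pub fn to_utf16(text: &str) -> Vec<u16> {
    text.encode_utf16().collect()
}

/// Decodes UTF-16 code units back into a `String`, or returns `None` if the
/// input is not well-formed UTF-16 (for example, an unpaired surrogate).
pub fn from_utf16(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}
```

For "スーパー" this gives four code units (U+30B9, U+30FC, U+30D1, U+30FC), which is what a UTF-8 converter's convert_to_uchars would be expected to return for the 12 input bytes.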
§Streaming conversion
The feeding/streaming functions take a mutable slice as the destination buffer and an immutable slice as the source buffer. They consume the source buffer and write the converted text into the destination buffer until one of the two has been fully consumed or a conversion error occurs. They return the error (if any) and how much of the destination and source buffers has been consumed. The idea is that once one of the buffers has been fully consumed, you grab another buffer chunk (whether source or destination) and call the function again. Hence, a processing loop might look like this:
use rust_icu_common as common;
use rust_icu_sys as sys;
use rust_icu_ucnv::UConverter;

// get_dst_chunk() and get_src_chunk() are placeholders for however the
// caller obtains the next destination and source buffers.
let mut converter = UConverter::open("UTF-8").unwrap();
let mut dst: &mut [u16] = get_dst_chunk().unwrap();
let mut src: &[u8] = get_src_chunk().unwrap();
// reset any previous state
converter.reset_to_uchars();
loop {
    let res = converter.feed_to_uchars(dst, src);
    match res.result {
        // A buffer overflow only means that dst is full, so keep going.
        Ok(_) | Err(common::Error::Sys(sys::UErrorCode::U_BUFFER_OVERFLOW_ERROR)) => {
            // Skip past the parts of both buffers that were consumed.
            dst = dst.split_at_mut(res.dst_consumed).1;
            src = src.split_at(res.src_consumed).1;
        }
        _ => panic!("conversion error"),
    }
    if dst.is_empty() {
        dst = get_dst_chunk().unwrap();
    }
    if src.is_empty() {
        src = match get_src_chunk() {
            None => break,
            Some(src) => src,
        };
    }
}
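The buffer bookkeeping in the loop above can be tried without ICU. The sketch below is a self-contained stand-in, not rust_icu code: feed_widen is a hypothetical converter that simply widens bytes to UTF-16 code units (correct for ASCII/Latin-1 input only), and FeedOutcome and convert_chunked are likewise invented for the illustration. It is only meant to show how the dst_consumed and src_consumed counts drive the refilling of each buffer.

```rust
/// Stand-in for a feed result: how much of each buffer one call used.
pub struct FeedOutcome {
    pub dst_consumed: usize,
    pub src_consumed: usize,
}

/// Hypothetical stand-in converter: widens bytes to UTF-16 code units
/// (correct for ASCII/Latin-1 only) until either buffer runs out.
pub fn feed_widen(dst: &mut [u16], src: &[u8]) -> FeedOutcome {
    let n = dst.len().min(src.len());
    for (d, s) in dst.iter_mut().zip(src) {
        *d = u16::from(*s);
    }
    FeedOutcome { dst_consumed: n, src_consumed: n }
}

/// Drives the stand-in converter over `input`, reading `src_chunk` bytes at a
/// time and writing through a scratch buffer of `dst_chunk` code units.
pub fn convert_chunked(input: &[u8], src_chunk: usize, dst_chunk: usize) -> Vec<u16> {
    assert!(src_chunk > 0 && dst_chunk > 0, "chunk sizes must be non-zero");
    let mut out = Vec::with_capacity(input.len());
    let mut scratch = vec![0u16; dst_chunk];
    let mut filled = 0; // code units currently held in `scratch`

    for chunk in input.chunks(src_chunk) {
        let mut src = chunk;
        while !src.is_empty() {
            let res = feed_widen(&mut scratch[filled..], src);
            // Advance both buffers by exactly what the call reported.
            filled += res.dst_consumed;
            src = &src[res.src_consumed..];
            if filled == scratch.len() {
                // Destination chunk is full: hand it off and start a new one.
                out.extend_from_slice(&scratch);
                filled = 0;
            }
        }
    }
    // Flush whatever is left in the final, partially filled chunk.
    out.extend_from_slice(&scratch[..filled]);
    out
}
```

Because the source and destination chunk sizes are independent, either buffer may run out first; the loop handles both cases the same way, by advancing each buffer by the reported count and replacing whichever one is exhausted.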
Implementations§
impl UConverter
pub fn open(name: &str) -> Result<Self, Error>
Attempts to open a converter with the given encoding name.
This function wraps around ucnv_open.
pub fn try_clone(&self) -> Result<Self, Error>
Attempts to clone a given converter.
This function wraps around ucnv_safeClone.
pub fn has_ambiguous_mappings(&self) -> bool
Determines whether the converter contains ambiguous mappings of the same character.
This function wraps around ucnv_isAmbiguous.
pub fn name(&self) -> Result<&str, Error>
Attempts to get the canonical name of the converter.
This function wraps around ucnv_getName.
pub fn reset(&mut self)
Resets the converter to a default state.
This is equivalent to calling both UConverter::reset_to_uchars and UConverter::reset_from_uchars.
This function wraps around ucnv_reset.
pub fn reset_to_uchars(&mut self)
Resets the *_to_uchars part of the converter to a default state.
It is necessary to call this function when you want to start processing a new data stream using UConverter::feed_to_uchars.
This function wraps around ucnv_resetToUnicode.
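To illustrate why a reset matters before a new stream, here is a minimal std-only sketch of a stateful decoder. It is an assumption-laden stand-in and not how rust_icu or ICU is implemented: StreamDecoder, feed, and reset are hypothetical names, and the sketch assumes the kind of carried-over state one would expect from a streaming converter, namely the leading bytes of a multi-byte character that was split across two chunks. It does not report errors; malformed input simply stays pending.

```rust
/// Hypothetical stand-in for one conversion direction of a stateful converter.
#[derive(Default)]
pub struct StreamDecoder {
    pending: Vec<u8>, // bytes of a character split across two feeds
}

impl StreamDecoder {
    /// Decodes as much UTF-8 as possible and keeps any trailing bytes that
    /// do not yet form a complete character for the next call.
    pub fn feed(&mut self, src: &[u8]) -> String {
        self.pending.extend_from_slice(src);
        // Length of the longest prefix that is valid UTF-8.
        let valid = match std::str::from_utf8(&self.pending) {
            Ok(s) => s.len(),
            Err(e) => e.valid_up_to(),
        };
        let out = String::from_utf8_lossy(&self.pending[..valid]).into_owned();
        self.pending.drain(..valid);
        out
    }

    /// Drops any carried-over bytes so the next feed starts a fresh stream.
    pub fn reset(&mut self) {
        self.pending.clear();
    }
}
```

If a previous stream ended in the middle of a character, the leftover bytes would otherwise be prepended to the first bytes of the next stream; calling reset first avoids that.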
pub fn reset_from_uchars(&mut self)
Resets the *_from_uchars part of the converter to a default state.
It is necessary to call this function when you want to start processing a new data stream using UConverter::feed_from_uchars.
This function wraps around ucnv_resetFromUnicode.
pub fn feed_to_uchars(&mut self, dst: &mut [UChar], src: &[u8]) -> FeedResult
Feeds more encoded data to be decoded to UTF-16 and put in the provided destination buffer.
Make sure to call UConverter::reset_to_uchars before processing a new data stream.
This function wraps around ucnv_toUnicode.
pub fn feed_from_uchars(&mut self, dst: &mut [u8], src: &[UChar]) -> FeedResult
Feeds more UTF-16 to be encoded and put in the provided destination buffer.
Make sure to call UConverter::reset_from_uchars before processing a new data stream.
This function wraps around ucnv_fromUnicode.