encoding_rs_io

Struct DecodeReaderBytesBuilder

pub struct DecodeReaderBytesBuilder { /* private fields */ }

A builder for constructing a byte oriented transcoder to UTF-8.

Implementations

impl DecodeReaderBytesBuilder

pub fn new() -> DecodeReaderBytesBuilder

Create a new decoder builder with a default configuration.

By default, no explicit encoding is set. If a UTF-8 or UTF-16 BOM is found, the corresponding encoding is selected automatically and the input is transcoded, with invalid sequences mapped to the Unicode replacement codepoint.

pub fn build<R: Read>(&self, rdr: R) -> DecodeReaderBytes<R, Vec<u8>>

Build a new decoder that wraps the given reader.

pub fn build_with_buffer<R: Read, B: AsMut<[u8]>>(&self, rdr: R, buffer: B) -> Result<DecodeReaderBytes<R, B>>

Build a new decoder that wraps the given reader and uses the given buffer internally for transcoding.

This is useful for cases where it is advantageous to amortize allocation. Namely, this method permits reusing a buffer for subsequent decoders.

This returns an error if the buffer is smaller than 4 bytes (which is too small to hold the maximum size of a single UTF-8 encoded codepoint).

pub fn encoding(&mut self, encoding: Option<&'static Encoding>) -> &mut DecodeReaderBytesBuilder

Set an explicit encoding to be used by this decoder.

When an explicit encoding is set, BOM sniffing is disabled and the encoding provided will be used unconditionally. Errors in the encoded bytes are replaced by the Unicode replacement codepoint.

By default, no explicit encoding is set.

pub fn utf8_passthru(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Enable UTF-8 passthru, even when a UTF-8 BOM is observed.

When an explicit encoding is not set (thereby invoking automatic encoding detection via BOM sniffing), then a UTF-8 BOM will cause UTF-8 transcoding to occur. In particular, if the source contains invalid UTF-8 sequences, then they are replaced with the Unicode replacement codepoint.

This transcoding may not be desirable. For example, the caller may already have its own UTF-8 handling where invalid UTF-8 is appropriately handled, in which case an additional transcoding step is unnecessary work. Enabling this option prevents that extra transcoding step from occurring. In this case, the bytes emitted by the reader are passed through unchanged (including the BOM) and the caller is responsible for handling any invalid UTF-8.

Example

This example demonstrates the effect of enabling this option on data that begins with a UTF-8 BOM but also contains invalid UTF-8.

extern crate encoding_rs;
extern crate encoding_rs_io;

use std::error::Error;
use std::io::Read;

use encoding_rs_io::DecodeReaderBytesBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let source_data = &b"\xEF\xBB\xBFfoo\xFFbar"[..];
    let mut decoder = DecodeReaderBytesBuilder::new()
        .utf8_passthru(true)
        .build(source_data);

    let mut dest = vec![];
    decoder.read_to_end(&mut dest)?;
    // Without the passthru option, you'd get "foo\u{FFFD}bar".
    assert_eq!(dest, b"\xEF\xBB\xBFfoo\xFFbar");
    Ok(())
}

pub fn strip_bom(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Whether or not to always strip a BOM if one is found.

When this is enabled, if a BOM is found at the beginning of a stream, then it is ignored. This applies even when utf8_passthru is enabled or if bom_sniffing is disabled.

This is disabled by default.

Example

This example shows how to remove the BOM if it’s present even when utf8_passthru is enabled.

extern crate encoding_rs;
extern crate encoding_rs_io;

use std::error::Error;
use std::io::Read;

use encoding_rs_io::DecodeReaderBytesBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let source_data = &b"\xEF\xBB\xBFfoo\xFFbar"[..];
    let mut decoder = DecodeReaderBytesBuilder::new()
        .utf8_passthru(true)
        .strip_bom(true)
        .build(source_data);

    let mut dest = vec![];
    decoder.read_to_end(&mut dest)?;
    // If `strip_bom` wasn't enabled, then this would include the BOM.
    assert_eq!(dest, b"foo\xFFbar");
    Ok(())
}

pub fn bom_override(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Give the highest precedence to the BOM, if one is found.

When this is enabled, and if a BOM is found, then the encoding indicated by that BOM is used even if an explicit encoding has been set via the encoding method.

This does not override utf8_passthru.

This is disabled by default.

pub fn bom_sniffing(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Enable BOM sniffing.

When this is enabled and an explicit encoding is not set, the decoder will try to detect the encoding from a BOM.

When this is disabled and an explicit encoding is not set, the decoder will treat the input as raw bytes. The bytes will be passed through unchanged, including any BOM that may be present.

This is enabled by default.

Trait Implementations

impl Clone for DecodeReaderBytesBuilder

fn clone(&self) -> DecodeReaderBytesBuilder

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for DecodeReaderBytesBuilder

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Default for DecodeReaderBytesBuilder

fn default() -> DecodeReaderBytesBuilder

Returns the “default value” for a type.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dst: *mut T)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.