Struct encoding_rs_io::DecodeReaderBytesBuilder

pub struct DecodeReaderBytesBuilder { /* fields omitted */ }

A builder for constructing a byte oriented transcoder to UTF-8.

Methods

impl DecodeReaderBytesBuilder

pub fn new() -> DecodeReaderBytesBuilder

Create a new decoder builder with a default configuration.

By default, no explicit encoding is used. If a UTF-8 or UTF-16 BOM is detected, the corresponding encoding is selected automatically and the input is transcoded to UTF-8, with invalid sequences mapped to the Unicode replacement codepoint.
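
As an illustration of what BOM detection involves, the following is a minimal std-only sketch and not this crate's implementation. The `BomEncoding` enum and the `sniff_bom` function are hypothetical names introduced here; only the BOM byte sequences themselves are standard.

```rust
/// Encodings that a leading byte order mark can identify.
/// (Illustrative type, not part of encoding_rs_io.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BomEncoding {
    Utf8,
    Utf16Le,
    Utf16Be,
}

/// Returns the encoding indicated by a BOM at the start of `bytes`,
/// together with the length of that BOM in bytes.
fn sniff_bom(bytes: &[u8]) -> Option<(BomEncoding, usize)> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some((BomEncoding::Utf8, 3))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some((BomEncoding::Utf16Le, 2))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some((BomEncoding::Utf16Be, 2))
    } else {
        None
    }
}

fn example() {
    // UTF-8 BOM followed by "foo".
    let utf8 = [0xEF, 0xBB, 0xBF, b'f', b'o', b'o'];
    assert_eq!(sniff_bom(&utf8), Some((BomEncoding::Utf8, 3)));
    // No BOM at all.
    assert_eq!(sniff_bom(b"foo"), None);
}
```

A real decoder would typically need to buffer the first few bytes of the stream before it can make this decision.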

pub fn build<R: Read>(&self, rdr: R) -> DecodeReaderBytes<R, Vec<u8>>

Build a new decoder that wraps the given reader.

pub fn build_with_buffer<R: Read, B: AsMut<[u8]>>(
    &self,
    rdr: R,
    buffer: B
) -> Result<DecodeReaderBytes<R, B>>

Build a new decoder that wraps the given reader and uses the given buffer internally for transcoding.

This is useful for cases where it is advantageous to amortize allocation. Namely, this method permits reusing a buffer for subsequent decoders.

This returns an error if the buffer is smaller than 4 bytes, which is too small to hold the largest possible UTF-8 encoding of a single codepoint.
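
The 4-byte minimum follows from UTF-8 itself, which can be checked with the standard library alone. This is a small sketch for illustration; the constant name `MAX_UTF8_LEN` is introduced here and is not part of this crate.

```rust
/// Size in bytes of the longest UTF-8 encoding of a single codepoint.
/// (Illustrative constant, not part of encoding_rs_io.)
const MAX_UTF8_LEN: usize = 4;

fn example() {
    // The largest Unicode scalar value, U+10FFFF, needs four bytes.
    assert_eq!(char::MAX.len_utf8(), MAX_UTF8_LEN);

    // A 4-byte buffer is therefore enough to encode any single codepoint.
    let mut buf = [0u8; MAX_UTF8_LEN];
    let encoded = char::MAX.encode_utf8(&mut buf);
    assert_eq!(encoded.as_bytes(), &[0xF4, 0x8F, 0xBF, 0xBF]);

    // The replacement codepoint U+FFFD takes three bytes.
    assert_eq!('\u{FFFD}'.len_utf8(), 3);
}
```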

pub fn encoding(
    &mut self,
    encoding: Option<&'static Encoding>
) -> &mut DecodeReaderBytesBuilder

Set an explicit encoding to be used by this decoder.

When an explicit encoding is set, BOM sniffing is disabled and the encoding provided will be used unconditionally. Errors in the encoded bytes are replaced by the Unicode replacement codepoint.

By default, no explicit encoding is set.

pub fn utf8_passthru(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Enable UTF-8 passthru, even when a UTF-8 BOM is observed.

When an explicit encoding is not set (thereby invoking automatic encoding detection via BOM sniffing), then a UTF-8 BOM will cause UTF-8 transcoding to occur. In particular, if the source contains invalid UTF-8 sequences, then they are replaced with the Unicode replacement codepoint.

This transcoding may not be desirable. For example, the caller may already have its own UTF-8 handling that deals with invalid UTF-8 appropriately, in which case the extra transcoding step is unnecessary work. Enabling this option prevents that step from occurring. In this case, the bytes emitted by the reader are passed through unchanged (including the BOM) and the caller is responsible for handling any invalid UTF-8.
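
To make the trade-off concrete, the sketch below uses only the standard library, not this crate. `String::from_utf8_lossy` performs the same kind of replacement described above, while `std::str::from_utf8` stands in for a caller that receives the raw bytes and handles invalid UTF-8 itself.

```rust
use std::borrow::Cow;

fn example() {
    let source: &[u8] = b"foo\xFFbar";

    // Lossy decoding: the invalid byte 0xFF is replaced with U+FFFD.
    let lossy: Cow<str> = String::from_utf8_lossy(source);
    assert_eq!(lossy, "foo\u{FFFD}bar");

    // Raw bytes: the caller sees the invalid byte and decides what to do.
    let err = std::str::from_utf8(source).unwrap_err();
    // The first three bytes ("foo") are valid; the error starts at index 3.
    assert_eq!(err.valid_up_to(), 3);
}
```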

Example

This example demonstrates the effect of enabling this option on data that begins with a UTF-8 BOM but also contains invalid UTF-8 later on.

extern crate encoding_rs;
extern crate encoding_rs_io;

use std::error::Error;
use std::io::Read;

use encoding_rs_io::DecodeReaderBytesBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let source_data = &b"\xEF\xBB\xBFfoo\xFFbar"[..];
    let mut decoder = DecodeReaderBytesBuilder::new()
        .utf8_passthru(true)
        .build(source_data);

    let mut dest = vec![];
    decoder.read_to_end(&mut dest)?;
    // Without the passthru option, you'd get "foo\u{FFFD}bar".
    assert_eq!(dest, b"\xEF\xBB\xBFfoo\xFFbar");
    Ok(())
}

pub fn strip_bom(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Whether or not to always strip a BOM if one is found.

When this is enabled, if a BOM is found at the beginning of a stream, then it is stripped from the output. This applies even when utf8_passthru is enabled or bom_sniffing is disabled.

This is disabled by default.

Example

This example shows how to remove the BOM if it's present even when utf8_passthru is enabled.

extern crate encoding_rs;
extern crate encoding_rs_io;

use std::error::Error;
use std::io::Read;

use encoding_rs_io::DecodeReaderBytesBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let source_data = &b"\xEF\xBB\xBFfoo\xFFbar"[..];
    let mut decoder = DecodeReaderBytesBuilder::new()
        .utf8_passthru(true)
        .strip_bom(true)
        .build(source_data);

    let mut dest = vec![];
    decoder.read_to_end(&mut dest)?;
    // If `strip_bom` wasn't enabled, then this would include the BOM.
    assert_eq!(dest, b"foo\xFFbar");
    Ok(())
}

pub fn bom_override(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Give the highest precedence to the BOM, if one is found.

When this is enabled, and if a BOM is found, then the encoding indicated by that BOM is used even if an explicit encoding has been set via the encoding method.

This does not override utf8_passthru.

This is disabled by default.
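
The precedence rules described for encoding and bom_override can be modelled in a few lines. This is only a sketch of the documented behavior, not the crate's code: `choose_encoding` is a hypothetical function, encodings are represented by plain string labels, BOM sniffing is assumed to be enabled, and utf8_passthru is left out.

```rust
/// Picks the encoding to transcode from, given an optional explicit
/// encoding and an optional encoding indicated by a BOM.
/// (Hypothetical helper; labels are plain strings for illustration.)
fn choose_encoding<'a>(
    explicit: Option<&'a str>,
    bom: Option<&'a str>,
    bom_override: bool,
) -> Option<&'a str> {
    match (explicit, bom) {
        // With bom_override, a BOM wins over the explicit encoding.
        (Some(_), Some(b)) if bom_override => Some(b),
        // Otherwise an explicit encoding is used unconditionally.
        (Some(e), _) => Some(e),
        // No explicit encoding: fall back to whatever the BOM indicates.
        (None, b) => b,
    }
}

fn example() {
    let explicit = Some("windows-1252");
    let bom = Some("utf-16le");
    assert_eq!(choose_encoding(explicit, bom, false), Some("windows-1252"));
    assert_eq!(choose_encoding(explicit, bom, true), Some("utf-16le"));
    assert_eq!(choose_encoding(None, bom, false), Some("utf-16le"));
}
```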

pub fn bom_sniffing(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Enable BOM sniffing.

When this is enabled and an explicit encoding is not set, the decoder will try to detect the encoding from a BOM at the start of the input.

When this is disabled and an explicit encoding is not set, the decoder will treat the input as raw bytes. The bytes will be passed through unchanged, including any BOM that may be present.

This is enabled by default.

Trait Implementations

impl Clone for DecodeReaderBytesBuilder

impl Debug for DecodeReaderBytesBuilder

impl Default for DecodeReaderBytesBuilder

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized

impl<T> Borrow<T> for T where
    T: ?Sized

impl<T> BorrowMut<T> for T where
    T: ?Sized

impl<T> From<T> for T

impl<T, U> Into<U> for T where
    U: From<T>,

impl<T> ToOwned for T where
    T: Clone

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.