Struct encoding_rs_io::DecodeReaderBytesBuilder

pub struct DecodeReaderBytesBuilder { /* fields omitted */ }

A builder for constructing a byte oriented transcoder to UTF-8.

Methods

impl DecodeReaderBytesBuilder

pub fn new() -> DecodeReaderBytesBuilder

Create a new decoder builder with a default configuration.

By default, no explicit encoding is used. If a UTF-8 or UTF-16 BOM is found, the corresponding encoding is selected automatically and the input is transcoded to UTF-8, with invalid sequences mapped to the Unicode replacement codepoint (U+FFFD).
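To make the BOM sniffing described above concrete, here is a minimal std-only sketch of how a byte order mark can be recognized. The function name `sniff_bom` and its return shape are hypothetical and chosen only for illustration; this is not the crate's internal implementation. Only the byte patterns themselves (EF BB BF for UTF-8, FF FE for UTF-16LE, FE FF for UTF-16BE) are standard.

```rust
/// Hypothetical helper (not part of encoding_rs_io): report which BOM,
/// if any, starts `bytes`, together with the BOM's length in bytes.
fn sniff_bom(bytes: &[u8]) -> Option<(&'static str, usize)> {
    if bytes.starts_with(&[0xEF, 0xBB, 0xBF]) {
        Some(("UTF-8", 3))
    } else if bytes.starts_with(&[0xFF, 0xFE]) {
        Some(("UTF-16LE", 2))
    } else if bytes.starts_with(&[0xFE, 0xFF]) {
        Some(("UTF-16BE", 2))
    } else {
        // No recognized BOM at the start of the input.
        None
    }
}
```

For instance, `sniff_bom(b"\xEF\xBB\xBFfoo")` yields `Some(("UTF-8", 3))`, while input with no BOM yields `None`.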

pub fn build<R: io::Read>(&self, rdr: R) -> DecodeReaderBytes<R, Vec<u8>>

Build a new decoder that wraps the given reader.

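pub fn build_with_buffer<R: io::Read, B: AsMut<[u8]>>(&self, rdr: R, buffer: B) -> io::Result<DecodeReaderBytes<R, B>>

Build a new decoder that wraps the given reader and uses the given buffer internally for transcoding.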

This is useful when it is advantageous to amortize allocation: the same buffer can be reused for subsequent decoders.

This returns an error if the buffer is smaller than 4 bytes, which is too small to hold the largest UTF-8 encoding of a single codepoint.
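The 4-byte minimum follows from UTF-8 itself: every Unicode scalar value encodes to between 1 and 4 bytes. The std-only sketch below illustrates that bound. The names `MAX_UTF8_LEN`, `buffer_is_large_enough`, and `utf8_len` are hypothetical and are not part of this crate; the check is an assumption about what such a validation could look like, not the crate's actual code.

```rust
/// The most bytes any single Unicode scalar value occupies in UTF-8.
const MAX_UTF8_LEN: usize = 4;

/// Hypothetical check mirroring the documented size requirement.
fn buffer_is_large_enough(buf: &[u8]) -> bool {
    buf.len() >= MAX_UTF8_LEN
}

/// Encode `c` into a 4-byte scratch buffer and return how many bytes it used.
fn utf8_len(c: char) -> usize {
    let mut scratch = [0u8; MAX_UTF8_LEN];
    c.encode_utf8(&mut scratch).len()
}
```

Even the highest scalar value, `'\u{10FFFF}'`, fits in the 4-byte scratch buffer, whereas a 3-byte buffer could not hold it.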

pub fn encoding(&mut self, encoding: Option<&'static Encoding>) -> &mut DecodeReaderBytesBuilder

Set an explicit encoding to be used by this decoder.

When an explicit encoding is set, BOM sniffing is disabled and the encoding provided will be used unconditionally. Errors in the encoded bytes are replaced by the Unicode replacement codepoint.

By default, no explicit encoding is set.
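As an illustration of what decoding with a fixed encoding and replacement on error means, here is a minimal std-only sketch for UTF-16LE. The helper `decode_utf16le_lossy` is hypothetical and does not use this crate; it never inspects the input to guess an encoding, and it substitutes U+FFFD wherever the input is not valid UTF-16.

```rust
/// Hypothetical helper (not part of encoding_rs_io): decode `bytes` as
/// UTF-16LE unconditionally, substituting U+FFFD for invalid input.
fn decode_utf16le_lossy(bytes: &[u8]) -> String {
    let pairs = bytes.chunks_exact(2);
    // A trailing odd byte cannot form a complete 16-bit code unit.
    let has_dangling_byte = !pairs.remainder().is_empty();

    // Assemble little-endian code units, then decode them; unpaired
    // surrogates become the replacement character.
    let units = pairs.map(|p| u16::from_le_bytes([p[0], p[1]]));
    let mut out: String = char::decode_utf16(units)
        .map(|r| r.unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect();

    if has_dangling_byte {
        out.push(char::REPLACEMENT_CHARACTER);
    }
    out
}
```

Well-formed input such as `b"h\x00i\x00"` decodes to `"hi"`, while an unpaired surrogate or a dangling byte shows up as U+FFFD in the output rather than as an error.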

pub fn utf8_passthru(&mut self, yes: bool) -> &mut DecodeReaderBytesBuilder

Enable UTF-8 passthru, even when a UTF-8 BOM is observed.

When an explicit encoding is not set (thereby invoking automatic encoding detection via BOM sniffing), then a UTF-8 BOM will cause UTF-8 transcoding to occur. In particular, if the source contains invalid UTF-8 sequences, then they are replaced with the Unicode replacement codepoint.

This transcoding may not be desirable. For example, the caller may already handle invalid UTF-8 appropriately on its own, in which case the extra transcoding step is unnecessary work. Enabling this option prevents that step: the bytes emitted by the reader are passed through unchanged (including the BOM), and the caller is responsible for handling any invalid UTF-8.
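The transcoding step that this option skips can be approximated with the standard library alone. The sketch below is an assumption-laden model, not the crate's implementation: the hypothetical helper `transcode_utf8_lossy` always treats its input as UTF-8, drops a leading BOM if one is present, and replaces each invalid sequence with U+FFFD.

```rust
/// Hypothetical helper (not part of encoding_rs_io): drop a leading UTF-8
/// BOM, then decode the rest, replacing invalid sequences with U+FFFD.
fn transcode_utf8_lossy(bytes: &[u8]) -> String {
    const UTF8_BOM: &[u8] = &[0xEF, 0xBB, 0xBF];
    let body = if bytes.starts_with(UTF8_BOM) {
        &bytes[UTF8_BOM.len()..]
    } else {
        bytes
    };
    String::from_utf8_lossy(body).into_owned()
}
```

With passthru enabled, none of this happens: the BOM and the invalid byte both reach the caller untouched.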

Example

This example demonstrates the effect of enabling this option on data that includes a UTF-8 BOM but also, interestingly enough, subsequently includes invalid UTF-8.

extern crate encoding_rs;
extern crate encoding_rs_io;

use std::error::Error;
use std::io::Read;

use encoding_rs_io::DecodeReaderBytesBuilder;

fn example() -> Result<(), Box<dyn Error>> {
    let source_data = &b"\xEF\xBB\xBFfoo\xFFbar"[..];
    let mut decoder = DecodeReaderBytesBuilder::new()
        .utf8_passthru(true)
        .build(source_data);

    let mut dest = vec![];
    decoder.read_to_end(&mut dest)?;
    // Without the passthru option, you'd get "foo\u{FFFD}bar".
    assert_eq!(dest, b"\xEF\xBB\xBFfoo\xFFbar");
    Ok(())
}
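Since passthru leaves validation to the caller, one possible follow-up is to locate the first invalid sequence in the bytes that were read. This std-only sketch uses the hypothetical helper `first_invalid_offset`, which is not part of this crate; it is one way a caller might begin its own handling, under the assumption that the whole output is already in memory.

```rust
/// Hypothetical helper (not part of encoding_rs_io): byte offset of the
/// first invalid UTF-8 sequence, or `None` if `bytes` is valid UTF-8.
fn first_invalid_offset(bytes: &[u8]) -> Option<usize> {
    std::str::from_utf8(bytes).err().map(|e| e.valid_up_to())
}
```

Applied to the `dest` bytes from the example above, this reports offset 6: the 3-byte BOM and `foo` are valid, and the `\xFF` byte that follows is not.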

Trait Implementations

impl Clone for DecodeReaderBytesBuilder

Returns a copy of the value.

Performs copy-assignment from source.

impl Debug for DecodeReaderBytesBuilder

Formats the value using the given formatter.

impl Default for DecodeReaderBytesBuilder

Returns the "default value" for a type.

Auto Trait Implementations