Trait Content

pub trait Content: Debug {
    type Literal<'a>: IntoBuf
       where Self: 'a;

    // Required methods
    fn literal<'a>(&'a self) -> Self::Literal<'a>;
    fn is_escaped(&self) -> bool;
    fn unescaped<'a>(&'a self) -> Unescaped<Self::Literal<'a>>;

    // Provided method
    fn literal_len(&self) -> usize { ... }
}

Text content of a JSON token.

Contains the actual textual content of the JSON token read from the JSON text. This is in contrast to Token, which only indicates the type of the token.

For example, consider the following JSON text:

"foo"

The above JSON text contains one token whose type is Token::Str and whose content is "foo".

Required Associated Types

type Literal<'a>: IntoBuf where Self: 'a

Type that contains the literal string of the token exactly as it appears in the JSON text.

Required Methods

fn literal<'a>(&'a self) -> Self::Literal<'a>

Returns the literal content of the token exactly as it appears in the JSON text.

§Static content tokens

For token types with a static text content, e.g., Token::NameSep, the value returned is the static content, e.g., :.

§Numbers

For number tokens, the value returned is the literal text of the number token.

§Strings

For string tokens, the value returned is the literal text of the string token including the opening and closing double quote (") characters. Therefore, every string token has length at least two and the unquoted value can be extracted by dropping the first and last characters.
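
As a minimal sketch of the extraction described above, the helper below strips the surrounding quotes from a string token's literal text. The function name `unquote` is hypothetical and is not part of the crate; it works on a plain `&str` rather than on `Self::Literal`, and it does not expand escape sequences.

```rust
/// Hypothetical helper (not part of the crate): strips the opening and
/// closing double quotes from the literal text of a JSON string token.
/// Returns `None` if the text is not wrapped in a pair of double quotes.
fn unquote(literal: &str) -> Option<&str> {
    literal.strip_prefix('"')?.strip_suffix('"')
}
```

Given the literal `"foo"` (five characters including the quotes), this returns the three-character slice `foo`. Any escape sequences inside the quotes are returned unchanged.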

Because the return value contains the entire literal string token as it appears in the JSON text, any escape sequences the string may contain are not expanded. This supports two use cases: lexical analyzer implementations can minimize or eliminate allocations when returning token values, and applications can observe or edit a stream of JSON tokens without making unintended changes to the raw JSON input.

Some applications need escape sequences expanded in order to work with normalized strings. For example, it is hard to do a reliable dictionary lookup for the name "foo" if the literal value might be "fo\u006f", "f\u006f\u006f", "\u0066oo", etc. To check whether the string contains an escape sequence, use is_escaped; to obtain the normalized value with all escape sequences expanded, use unescaped.

§Whitespace

For whitespace tokens, the value returned is the literal string of whitespace characters.

§End of file

For the pseudo-token Token::Eof, the value is the empty string.

fn is_escaped(&self) -> bool

Indicates whether the token content contains escape sequences.

This method must always return false for all token types except Token::Str. For Token::Str, it must return true if the literal text of the string token contains at least one escape sequence, and false otherwise.
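
The rule above could be sketched as follows. This is an illustration only: the `Kind` enum is a hypothetical stand-in for the crate's `Token` type, and the backslash check assumes the literal text is valid JSON, where a backslash inside a string can only begin an escape sequence.

```rust
/// Hypothetical stand-in for the token type; not the crate's `Token` enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Kind {
    Str,
    Other,
}

/// Only string tokens can be escaped, and a string token counts as escaped
/// when its literal text contains at least one backslash.
/// Assumes `literal` is valid JSON token text.
fn is_escaped(kind: Kind, literal: &str) -> bool {
    kind == Kind::Str && literal.contains('\\')
}
```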

fn unescaped<'a>(&'a self) -> Unescaped<Self::Literal<'a>>

Returns a normalized version of literal with all escape sequences in the JSON text fully expanded.

For non-string tokens, and string tokens for which is_escaped returns false, this method returns an Unescaped::Literal containing the same value returned by literal.

For string tokens with one or more escape sequences, this method returns an Unescaped::Expanded containing a normalized version of the string value with the escape sequences expanded. This expansion allocates, so it may be preferable to cache the returned value rather than call this method repeatedly on the same content.

As described in the JSON spec, the following escape sequence expansions are done:

| Sequence | Expands to |
|----------|------------|
| `\"` | Quotation mark, `"`, U+0022 |
| `\\` | Reverse solidus, `\`, U+005c |
| `\/` | Solidus, `/`, U+002f |
| `\b` | Backspace, U+0008 |
| `\f` | Form feed, U+000c |
| `\n` | Line feed, U+000a |
| `\r` | Carriage return, U+000d |
| `\t` | Horizontal tab, U+0009 |
| `\uXXXX` | Any Unicode character in basic multilingual plane, U+0000 through U+ffff |
| `\uHHHH\uLLLL` | Unicode characters outside the basic multilingual plane represented as a high/low surrogate pair |
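
The expansions listed above can be sketched with a small standalone function. This is not the crate's implementation: the names `expand` and `hex4` are hypothetical, the function works on the unquoted body of a string as a plain `&str`, and it uses `std::borrow::Cow` as a rough analogy for the borrowed versus allocated distinction between Unescaped::Literal and Unescaped::Expanded. Returning `None` for malformed escapes and unpaired surrogates is an assumption of this sketch.

```rust
use std::borrow::Cow;

/// Reads four hex digits from `chars` as one UTF-16 code unit.
fn hex4(chars: &mut std::str::Chars<'_>) -> Option<u32> {
    let mut unit = 0;
    for _ in 0..4 {
        unit = unit * 16 + chars.next()?.to_digit(16)?;
    }
    Some(unit)
}

/// Expands the escape sequences in the unquoted body of a JSON string.
/// Borrows the input when there is nothing to expand, allocates otherwise.
/// Returns `None` for malformed escapes or unpaired surrogates (a choice
/// made for this sketch).
fn expand(body: &str) -> Option<Cow<'_, str>> {
    if !body.contains('\\') {
        return Some(Cow::Borrowed(body));
    }
    let mut out = String::with_capacity(body.len());
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        let expanded = match chars.next()? {
            '"' => '"',
            '\\' => '\\',
            '/' => '/',
            'b' => '\u{0008}',
            'f' => '\u{000c}',
            'n' => '\n',
            'r' => '\r',
            't' => '\t',
            'u' => {
                let hi = hex4(&mut chars)?;
                if (0xd800..0xdc00).contains(&hi) {
                    // High surrogate: a `\uLLLL` low surrogate must follow.
                    if chars.next()? != '\\' || chars.next()? != 'u' {
                        return None;
                    }
                    let lo = hex4(&mut chars)?;
                    if !(0xdc00..0xe000).contains(&lo) {
                        return None;
                    }
                    char::from_u32(0x10000 + ((hi - 0xd800) << 10) + (lo - 0xdc00))?
                } else {
                    // `from_u32` rejects a lone low surrogate.
                    char::from_u32(hi)?
                }
            }
            _ => return None,
        };
        out.push(expanded);
    }
    Some(Cow::Owned(out))
}
```

With this sketch, the bodies `fo\u006f`, `f\u006f\u006f`, and `\u0066oo` all expand to `foo`, while a body with no backslash is returned borrowed without allocating.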