Crate lexical_core

Fast lexical conversion routines for a no_std environment.

lexical-core is a high-performance library for number-to-string and string-to-number conversions, without requiring a system allocator. If you would like to use a library that writes to String, look at lexical instead. In addition to high performance, it’s also highly configurable, supporting nearly every float and integer format available.

lexical-core is well-tested, has been downloaded more than 50 million times, and currently has no known correctness errors. It prioritizes performance above all else, and is competitive with or faster than any other float or integer parser and writer.

In addition, despite having a large number of features, configurability, and a focus on performance, it also aims to have fast compile times. Recent versions also add support for smaller binary sizes, making it well suited for embedded or web environments, where executable bloat can be much more detrimental than slower performance.

§Getting Started

§Parse API

The main parsing functions are parse and parse_partial. For example, to parse a number from bytes, validating that the entire input is a number:

// String to number using Rust slices.
// The argument is the byte string parsed.
let f: f32 = lexical_core::parse(b"3.5").unwrap();   // 3.5
let i: i32 = lexical_core::parse(b"15").unwrap();    // 15

All lexical-core parsers are validating: they check that the entire input is correct, and stop parsing on invalid data, numerical overflow, or other errors:

let r = lexical_core::parse::<u8>(b"256"); // Err(Error::Overflow(..))
let r = lexical_core::parse::<u8>(b"1a5"); // Err(Error::InvalidDigit(..))

For streaming APIs or those incrementally parsing data fed to a parser, where the input is known to start with a number but where the number ends is not yet known, the partial parsers return both the value they were able to parse and the number of bytes processed:

let r = lexical_core::parse_partial::<i8>(b"3a5"); // Ok((3, 1))
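
To make the difference between complete and partial parsing concrete, here is a minimal sketch using only the standard library. The names `parse_partial_u32` and `parse_complete_u32` are hypothetical and this is not how lexical-core is implemented; it only mirrors the "value plus bytes consumed" contract described above, under the assumption of unsigned decimal input.

```rust
/// Hypothetical helper (not part of lexical-core): parse leading ASCII
/// decimal digits, returning the value and the number of bytes consumed.
fn parse_partial_u32(bytes: &[u8]) -> Option<(u32, usize)> {
    let mut value: u32 = 0;
    let mut count = 0;
    for &byte in bytes {
        if !byte.is_ascii_digit() {
            break; // stop at the first byte that cannot extend the number
        }
        let digit = u32::from(byte - b'0');
        // Checked arithmetic reports overflow instead of wrapping.
        value = value.checked_mul(10)?.checked_add(digit)?;
        count += 1;
    }
    // At least one digit is required for a successful parse.
    if count == 0 {
        None
    } else {
        Some((value, count))
    }
}

/// A complete parse is a partial parse that consumed every byte.
fn parse_complete_u32(bytes: &[u8]) -> Option<u32> {
    match parse_partial_u32(bytes)? {
        (value, count) if count == bytes.len() => Some(value),
        _ => None,
    }
}
```

With this sketch, `parse_partial_u32(b"3a5")` yields `Some((3, 1))`, while `parse_complete_u32(b"3a5")` yields `None` because the trailing bytes are not part of the number.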

§Write API

The main writing API is write. For example, to write a number to an existing buffer:

use lexical_core::FormattedSize;

let mut buf = [b'0'; f64::FORMATTED_SIZE];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");

If a buffer of an insufficient size is provided, the writer will panic:

let mut buf = [b'0'; 1];
let digits = lexical_core::write::<i64>(15, &mut buf);

In order to guarantee the buffer is large enough, always ensure there are at least T::FORMATTED_SIZE_DECIMAL bytes, which requires the FormattedSize trait to be in scope.

use lexical_core::FormattedSize;

let mut buf = [b'0'; f64::FORMATTED_SIZE_DECIMAL];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");
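
As a rough illustration of the contract described above (a caller-provided buffer, a returned subslice, and a panic when the buffer is too short), here is a minimal sketch using only the standard library. `write_u64_decimal` is a hypothetical name and the code is not lexical-core's implementation.

```rust
/// Hypothetical helper (not lexical-core's implementation): write `value`
/// in decimal into `buf` and return the written prefix. Panics if `buf`
/// is too small, mirroring the behavior described above.
fn write_u64_decimal(mut value: u64, buf: &mut [u8]) -> &[u8] {
    // u64::MAX has 20 decimal digits, so 20 bytes always suffice.
    let mut scratch = [0u8; 20];
    let mut start = scratch.len();
    loop {
        start -= 1;
        scratch[start] = b'0' + (value % 10) as u8;
        value /= 10;
        if value == 0 {
            break;
        }
    }
    let digits = &scratch[start..];
    // Slice indexing panics here when `buf` is shorter than the output.
    buf[..digits.len()].copy_from_slice(digits);
    &buf[..digits.len()]
}
```

The fixed upper bound of 20 bytes plays the same role here as the FormattedSize constants: sizing the buffer from a per-type maximum means the write can never run out of space.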

§Conversion API

This writes and parses numbers to and from a format identical to Rust’s parse and write.

  • write: Write a number to string.
  • parse: Parse a number from string validating the complete string is a number.
  • parse_partial: Parse a number from string returning the number and the number of bytes it was able to parse.

use core::str;
use lexical_core::FormattedSize;

// parse
let f: f64 = lexical_core::parse(b"3.5").unwrap();
assert_eq!(f, 3.5);

let (f, count): (f64, usize) = lexical_core::parse_partial(b"3.5").unwrap();
assert_eq!(f, 3.5);
assert_eq!(count, 3);

// write
let mut buffer = [0u8; f64::FORMATTED_SIZE_DECIMAL];
let digits = lexical_core::write(f, &mut buffer);
assert_eq!(str::from_utf8(digits), Ok("3.5"));

§Options/Formatting API

Each number parser and writer offers extensive formatting control through options and format specifications. This includes digit separator support (that is, numbers such as 1_2__3.4_5), whether integral, fractional, or any significant digits are required, whether parsing or writing of non-finite values is disabled, whether + signs are invalid or required, and much more.

  • write_with_options: Write a number to string using custom formatting options.
  • parse_with_options: Parse a number from string using custom formatting options, validating the complete string is a number.
  • parse_partial_with_options: Parse a number from string using custom formatting options, returning the number and the number of bytes it was able to parse.

Some options, such as custom string representations of non-finite floats (such as NaN), are available without the format feature. For more comprehensive examples, see the format and Comprehensive Configuration sections below.

use core::str;
use lexical_core::{format, parse_float_options, write_float_options};

// parse
let f: f64 = lexical_core::parse_with_options::<_, { format::JSON }>(
    b"3.5",
    &parse_float_options::JSON
).unwrap();

// write
const BUFFER_SIZE: usize = write_float_options::
    JSON.buffer_size_const::<f64, { format::JSON }>();
let mut buffer = [0u8; BUFFER_SIZE];
let digits = lexical_core::write_with_options::<_, { format::JSON }>(
    f,
    &mut buffer,
    &write_float_options::JSON
);
assert_eq!(str::from_utf8(digits), Ok("3.5"));

§Features

In accordance with the Rust ethos, all features are additive: the crate may be built with --all-features without issue. The following features are available; those enabled by default are marked (Default):

  • write-integers (Default) - Enable writing of integers.
  • write-floats (Default) - Enable writing of floats.
  • parse-integers (Default) - Enable parsing of integers.
  • parse-floats (Default) - Enable parsing of floats.
  • power-of-two - Add support for parsing and writing power-of-two number strings.
  • radix - Add support for strings of any radix.
  • compact - Reduce code size at the cost of performance.
  • format - Add support for custom number formatting.
  • f16 - Enable support for half-precision f16 and bf16 floats.
  • std (Default) - Disable to allow use in a no_std environment.
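
As a sketch of how these features might be selected, the following Cargo.toml fragment disables the defaults and re-enables only float parsing and writing. The feature names come from the list above; the version requirement is an assumption and should be replaced with the release you actually depend on.

```toml
[dependencies.lexical-core]
# Assumption: placeholder version requirement, not taken from this page.
version = "1"
# Turns off std and the four default parse/write features.
default-features = false
# Re-enable only what is needed.
features = ["parse-floats", "write-floats"]
```

Because the features are additive, another crate in the dependency graph enabling more of them should not break this configuration.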

A complete description of each supported feature follows:

§write-integers

Enable support for writing integers to string.

use core::str;
use lexical_core::FormattedSize;

let mut buffer = [0u8; i64::FORMATTED_SIZE_DECIMAL];
let digits = lexical_core::write(1234, &mut buffer);
assert_eq!(str::from_utf8(digits), Ok("1234"));

§write-floats

Enable support for writing floating-point numbers to string.

use core::str;
use lexical_core::FormattedSize;

let mut buffer = [0u8; f64::FORMATTED_SIZE_DECIMAL];
let digits = lexical_core::write(1.234, &mut buffer);
assert_eq!(str::from_utf8(digits), Ok("1.234"));

§parse-integers

Enable support for parsing integers from string.

let f: i64 = lexical_core::parse(b"1234").unwrap();
assert_eq!(f, 1234);

§parse-floats

Enable support for parsing floating-point numbers from string.

let f: f64 = lexical_core::parse(b"1.234").unwrap();
assert_eq!(f, 1.234);

§format

Adds support for the entire format API. This allows extensive configurability for parsing and writing numbers in custom formats, with different valid syntax requirements.

§JSON

For example, in JSON, the following floats are valid or invalid:

-1          // valid
+1          // invalid
1           // valid
1.          // invalid
.1          // invalid
0.1         // valid
nan         // invalid
inf         // invalid
Infinity    // invalid

All of the finite numbers are valid in Rust, and Rust supports non-finite floats. In order to parse standard-conforming JSON floats using lexical-core, you may use the following approach:

use lexical_core::{format, parse_float_options, parse_with_options, Result};

fn parse_json_float<Bytes: AsRef<[u8]>>(bytes: Bytes) -> Result<f64> {
    parse_with_options::<_, { format::JSON }>(bytes.as_ref(), &parse_float_options::JSON)
}

Enabling the format API significantly increases compile times. In exchange, it enables a large amount of customization in how floats are parsed and written.
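
To show what the JSON validity table above amounts to, here is a minimal standard-library recognizer for the JSON number grammar from RFC 8259. `is_json_number` is a hypothetical helper written for illustration; it only accepts or rejects input and says nothing about how lexical-core's format::JSON is implemented.

```rust
/// Hypothetical recognizer for the JSON number grammar (RFC 8259):
/// `-? (0 | [1-9][0-9]*) (\. [0-9]+)? ([eE] [+-]? [0-9]+)?`
fn is_json_number(bytes: &[u8]) -> bool {
    // Consume a run of ASCII digits starting at `i`; return the new index.
    fn digits(bytes: &[u8], mut i: usize) -> usize {
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        i
    }

    let mut i = 0;
    if bytes.first() == Some(&b'-') {
        i += 1; // optional minus sign; a leading `+` is not allowed
    }
    // Integer part: a lone `0`, or a non-zero digit followed by digits.
    match bytes.get(i) {
        Some(b'0') => i += 1,
        Some(b'1'..=b'9') => i = digits(bytes, i),
        _ => return false,
    }
    // Optional fraction: `.` followed by at least one digit.
    if bytes.get(i) == Some(&b'.') {
        let end = digits(bytes, i + 1);
        if end == i + 1 {
            return false;
        }
        i = end;
    }
    // Optional exponent: `e` or `E`, optional sign, at least one digit.
    if matches!(bytes.get(i), Some(b'e' | b'E')) {
        i += 1;
        if matches!(bytes.get(i), Some(b'+' | b'-')) {
            i += 1;
        }
        let end = digits(bytes, i);
        if end == i {
            return false;
        }
        i = end;
    }
    // Valid only if the grammar consumed the whole input.
    i == bytes.len()
}
```

Each "invalid" row of the table fails at a specific step: `+1` and `.1` fail at the integer part, `1.` fails the fraction check, and `nan`, `inf`, and `Infinity` never match a digit.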

§power-of-two

Enable numeric conversions to and from strings with radixes that are powers of two, that is, 2, 4, 8, 16, and 32. This avoids most of the overhead and binary bloat of the radix feature, while enabling support for the most commonly-used radixes.

use core::str;
use lexical_core::{
    ParseFloatOptions,
    WriteFloatOptions,
    FormattedSize,
    NumberFormatBuilder
};

// parse
const BINARY: u128 = NumberFormatBuilder::binary();
let value = "1.0011101111100111011011001000101101000011100101011";
let f: f64 = lexical_core::parse_with_options::<_, { BINARY }>(
    value.as_bytes(),
    &ParseFloatOptions::new()
).unwrap();

// write
let mut buffer = [0u8; f64::FORMATTED_SIZE];
let digits = lexical_core::write_with_options::<_, { BINARY }>(
    f,
    &mut buffer,
    &WriteFloatOptions::new()
);
assert_eq!(str::from_utf8(digits), Ok(value));
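
One way to see why power-of-two radixes can be handled cheaply (an observation added here, not a claim from the crate's documentation) is that every digit maps to a whole number of bits. The sketch below evaluates a binary string exactly with shifts and a single division by a power of two. `parse_binary_f64` is a hypothetical helper that handles no sign or exponent and at most 53 digits; it is not lexical-core's algorithm.

```rust
/// Hypothetical helper (not part of lexical-core): evaluate a binary string
/// such as "1.0011" exactly. Supports at most 53 digits in total, the
/// precision of an f64 significand, and no sign or exponent.
fn parse_binary_f64(text: &str) -> Option<f64> {
    let mut mantissa: u64 = 0;
    let mut digits: u32 = 0;
    let mut fraction_digits: u32 = 0;
    let mut seen_point = false;
    for byte in text.bytes() {
        match byte {
            b'0' | b'1' => {
                digits += 1;
                if digits > 53 {
                    return None; // would no longer fit an f64 significand exactly
                }
                // Each binary digit is exactly one bit of the mantissa.
                mantissa = (mantissa << 1) | u64::from(byte - b'0');
                if seen_point {
                    fraction_digits += 1;
                }
            }
            b'.' if !seen_point => seen_point = true,
            _ => return None,
        }
    }
    if digits == 0 {
        return None;
    }
    // The integer-to-float cast and the division by a power of two are exact.
    let scale = (1u64 << fraction_digits) as f64;
    Some(mantissa as f64 / scale)
}
```

For instance, `parse_binary_f64("10.01")` accumulates the mantissa `0b1001` (9) with two fraction digits and returns `Some(2.25)`.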

§radix

Enable doing numeric conversions to and from strings for all radixes. This requires more static storage than power-of-two, and increases compile times, but can be quite useful for esoteric programming languages which use duodecimal floats, for example.

use core::str;
use lexical_core::{
    ParseFloatOptions,
    WriteFloatOptions,
    FormattedSize,
    NumberFormatBuilder
};

// parse
const FORMAT: u128 = NumberFormatBuilder::from_radix(12);
let value = "1.29842830A44BAA2";
let f: f64 = lexical_core::parse_with_options::<_, { FORMAT }>(
    value.as_bytes(),
    &ParseFloatOptions::new()
).unwrap();

// write
let mut buffer = [0u8; f64::FORMATTED_SIZE];
let digits = lexical_core::write_with_options::<_, { FORMAT }>(
    f,
    &mut buffer,
    &WriteFloatOptions::new()
);
assert_eq!(str::from_utf8(digits), Ok(value));
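
For contrast, here is what a naive standard-library approach to an arbitrary radix might look like. `naive_parse_radix` is a hypothetical helper for unsigned input without an exponent. It accumulates digits in floating point, so the result is not guaranteed to be correctly rounded; it is a sketch of the problem, not of lexical-core's algorithm.

```rust
/// Hypothetical helper (not lexical-core's algorithm): naively evaluate an
/// unsigned number such as "1.6" in the given radix (2 to 36).
/// The result is NOT guaranteed to be correctly rounded.
fn naive_parse_radix(text: &str, radix: u32) -> Option<f64> {
    // `char::to_digit` panics for radixes above 36, so reject them early.
    if !(2..=36).contains(&radix) {
        return None;
    }
    let (integer, fraction) = match text.split_once('.') {
        Some((integer, fraction)) => (integer, fraction),
        None => (text, ""),
    };
    if integer.is_empty() && fraction.is_empty() {
        return None;
    }
    let base = f64::from(radix);
    let mut value = 0.0;
    for c in integer.chars() {
        value = value * base + f64::from(c.to_digit(radix)?);
    }
    let mut scale = 1.0;
    for c in fraction.chars() {
        // For radix 12, 1/12 is not representable in binary, so this rounds.
        scale /= base;
        value += f64::from(c.to_digit(radix)?) * scale;
    }
    Some(value)
}
```

Integer-only input such as `"A4"` in radix 12 evaluates exactly (10 * 12 + 4 = 124), but fractional digits can accumulate rounding error at every step, so a round trip like the one in the example above is not guaranteed with this approach.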

§compact

Reduce the generated code size at the cost of performance. This minimizes the number of static tables, inlining, and generics used, drastically reducing the size of the generated binaries.

§std

Enable use of the standard library. Currently, the standard library is not used, and may be disabled without any change in functionality on stable.

§Comprehensive Configuration

lexical-core provides two main levels of configuration:

  • The NumberFormatBuilder, creating a packed struct with custom formatting options.
  • The Options API.

§Number Format

The number format class provides numerous flags to specify number parsing or writing. When the power-of-two feature is enabled, additional flags are added:

  • The radix for the significant digits (default 10).
  • The radix for the exponent base (default 10).
  • The radix for the exponent digits (default 10).

When the format feature is enabled, numerous other syntax and digit separator flags are enabled, including:

  • A digit separator character, to group digits for increased legibility.
  • Whether leading, trailing, internal, and consecutive digit separators are allowed.
  • Toggling required float components, such as digits before the decimal point.
  • Toggling whether special floats are allowed or are case-sensitive.
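
The digit separator flags above can be pictured with a small standard-library sketch. `SeparatorRules` and `strip_separators` are hypothetical names, loosely modeled on the listed flags and restricted to unsigned integers; the sketch allocates a Vec for simplicity and does not reflect how lexical-core encodes or applies these flags.

```rust
/// Hypothetical rule set, loosely modeled on the flags listed above.
#[derive(Clone, Copy)]
struct SeparatorRules {
    separator: u8,
    leading: bool,
    trailing: bool,
    internal: bool,
    consecutive: bool,
}

/// Strip digit separators from an unsigned integer string, returning the
/// bare digits, or `None` if a separator appears where the rules forbid it.
fn strip_separators(bytes: &[u8], rules: SeparatorRules) -> Option<Vec<u8>> {
    let is_sep = |b: u8| b == rules.separator;
    // Positions of the first and last non-separator bytes; `None` if there are none.
    let first_digit = bytes.iter().position(|&b| !is_sep(b))?;
    let last_digit = bytes.iter().rposition(|&b| !is_sep(b))?;
    let mut digits = Vec::with_capacity(bytes.len());
    for (index, &byte) in bytes.iter().enumerate() {
        if !is_sep(byte) {
            if !byte.is_ascii_digit() {
                return None;
            }
            digits.push(byte);
            continue;
        }
        // Classify the separator by where it sits relative to the digits.
        let allowed = if index < first_digit {
            rules.leading
        } else if index > last_digit {
            rules.trailing
        } else {
            rules.internal
        };
        let repeated = index > 0 && is_sep(bytes[index - 1]);
        if !allowed || (repeated && !rules.consecutive) {
            return None;
        }
    }
    Some(digits)
}
```

With internal and consecutive separators allowed, `1_2__3` reduces to `123`; turning off `consecutive` rejects the same input because of the doubled underscore.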

Many pre-defined constants therefore exist in the format module to simplify common use-cases, for example format::JSON.

For a list of all supported fields, see Fields.

§Options API

The Options API provides high-level options to specify number parsing or writing, options not intrinsically tied to a number format. For example, the Options API provides:

  • The exponent character (defaults to b'e' or b'^', depending on the radix).
  • The decimal point character (defaults to b'.').
  • Custom NaN and Infinity string representations.
  • Whether to trim the fraction component from integral floats.
  • The exponent break-point for scientific notation.
  • The maximum and minimum number of significant digits to write.
  • The rounding mode when truncating significant digits while writing.

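The available options are:

  • ParseFloatOptions
  • ParseIntegerOptions
  • WriteFloatOptions
  • WriteIntegerOptions

In addition, pre-defined constants for each category of options may be found in their respective modules, for example, parse_float_options::JSON.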

§Examples

An example of creating your own options to parse European-style numbers (which use commas as decimal points, and periods as digit separators) is as follows:

use core::num;

// This creates a format to parse a European-style float number.
// The decimal point is a comma, and the digit separators (optional)
// are periods. Digit separators require the `format` feature.
const EUROPEAN: u128 = lexical_core::NumberFormatBuilder::new()
    .digit_separator(num::NonZeroU8::new(b'.'))
    .build_strict();
const COMMA_OPTIONS: lexical_core::ParseFloatOptions = lexical_core::ParseFloatOptions::builder()
    .decimal_point(b',')
    .build_strict();
assert_eq!(
    lexical_core::parse_with_options::<f32, EUROPEAN>(b"300,10", &COMMA_OPTIONS),
    Ok(300.10)
);

// Another example, using a pre-defined constant for JSON.
const JSON: u128 = lexical_core::format::JSON;
const JSON_OPTIONS: lexical_core::ParseFloatOptions = lexical_core::ParseFloatOptions::new();
assert_eq!(
    lexical_core::parse_with_options::<f32, JSON>(b"0e1", &JSON_OPTIONS),
    Ok(0.0)
);
assert_eq!(
    lexical_core::parse_with_options::<f32, JSON>(b"1E+2", &JSON_OPTIONS),
    Ok(100.0)
);
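
For comparison, a standard-library workaround for European-style input could rewrite the text into Rust's native syntax before parsing. `parse_european` is a hypothetical helper, not part of lexical-core. It accepts separators in any position and needs a heap allocation plus an extra pass over the input, both of which the options-based approach above avoids.

```rust
/// Hypothetical helper (not part of lexical-core): parse a European-style
/// decimal such as "1.234,5" by rewriting it into Rust's native syntax.
fn parse_european(text: &str) -> Option<f64> {
    // More than one comma cannot be a single decimal point.
    if text.matches(',').count() > 1 {
        return None;
    }
    let normalized: String = text
        .chars()
        .filter(|&c| c != '.') // drop the digit separators
        .map(|c| if c == ',' { '.' } else { c }) // comma becomes the decimal point
        .collect();
    // Delegate to the standard library's f64 parser.
    normalized.parse().ok()
}
```

Here `parse_european("1.234,5")` is normalized to `"1234.5"` before being handed to the standard parser.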

§Version Support

The minimum supported Rust version is 1.63.0, which is required for const generic support. Older versions of lexical support older Rust versions.

§Algorithms

§Benchmarks

A comprehensive analysis of lexical commits and their performance can be found in benchmarks.

§Design

§Safety Guarantees

There is no non-trivial unsafe behavior in lexical-core itself. However, any incorrect safety invariants in our parsers and writers (lexical-parse-float, lexical-parse-integer, lexical-write-float, and lexical-write-integer) could cause those safety invariants to be broken.

Modules§

format
The creation and processing of number format packed structs.
parse_float_options (requires parse-floats)
Configuration options for parsing floats.
parse_integer_options (requires parse-integers)
Configuration options for parsing integers.
write_float_options (requires write-floats)
Configuration options for writing floats.
write_integer_options (requires write-integers)
Configuration options for writing integers.

Structs§

NumberFormat
Helper to access features from the packed format struct.
NumberFormatBuilder
Validating builder for NumberFormat from the provided specifications.
ParseFloatOptions (requires parse-floats)
Options to customize parsing floats.
ParseFloatOptionsBuilder (requires parse-floats)
Builder for Options.
ParseIntegerOptions (requires parse-integers)
Options to customize parsing integers.
ParseIntegerOptionsBuilder (requires parse-integers)
Builder for Options.
WriteFloatOptions (requires write-floats)
Options to customize writing floats.
WriteFloatOptionsBuilder (requires write-floats)
Builder for Options.
WriteIntegerOptions (requires write-integers)
Immutable options to customize writing integers.
WriteIntegerOptionsBuilder (requires write-integers)
Builder for Options.
bf16 (requires f16)
A 16-bit floating point type implementing the bfloat16 format.
f16 (requires f16)
A 16-bit floating point type implementing the IEEE 754-2008 standard binary16 a.k.a “half” format.

Enums§

Error
Error code during parsing, indicating failure type.

Constants§

BUFFER_SIZE (requires write-floats or write-integers)
Maximum number of bytes required to serialize any number with default options to string.

Traits§

FormattedSize (requires write-floats or write-integers)
The size, in bytes, of formatted values.
FromLexical (requires parse-floats or parse-integers)
Trait for numerical types that can be parsed from bytes.
FromLexicalWithOptions (requires parse-floats or parse-integers)
Trait for numerical types that can be parsed from bytes with custom options.
ParseOptions (requires parse-floats or parse-integers)
Shared trait for all parser options.
ToLexical (requires write-floats or write-integers)
Trait for numerical types that can be serialized to bytes.
ToLexicalWithOptions (requires write-floats or write-integers)
Trait for numerical types that can be serialized to bytes with custom options.
WriteOptions (requires write-floats or write-integers)
Shared trait for all writer options.

Functions§

format_error
Get the error type from the format packed struct.
format_is_valid
Determine if the format packed struct is valid.
parse (requires parse-floats or parse-integers)
Parse complete number from string.
parse_partial (requires parse-floats or parse-integers)
Parse partial number from string.
parse_partial_with_options (requires parse-floats or parse-integers)
Parse partial number from string with custom parsing options.
parse_with_options (requires parse-floats or parse-integers)
Parse complete number from string with custom parsing options.
write (requires write-floats or write-integers)
Write number to string.
write_with_options (requires write-floats or write-integers)
Write number to string with custom options.

Type Aliases§

Result
A specialized Result type for lexical operations.