Fast lexical conversion routines for a no_std
environment.
lexical-core
is a high-performance library for number-to-string and
string-to-number conversions, without requiring a system
allocator. If you would like to use a library that writes to String
,
look at lexical instead. In addition
to high performance, it’s also highly configurable, supporting nearly
every float and integer format available.
lexical-core
is well-tested, has been downloaded more than 50 million
times, and currently has no known correctness errors. lexical-core
prioritizes performance above all else, and is competitive with or faster
than any other float or integer parser and writer.
In addition, despite having a large number of features, configurability,
and a focus on performance, it also aims to have fast compile times.
Recent versions also add support
for smaller binary sizes, which is
ideal for embedded or web environments, where executable bloat can
be much more detrimental than performance.
§Getting Started
§Parse API
The main parsing API is parse
and parse_partial
. For example,
to parse a number from bytes, validating the entire input is a number:
// String to number using Rust slices.
// The argument is the byte string parsed.
let f: f32 = lexical_core::parse(b"3.5").unwrap(); // 3.5
let i: i32 = lexical_core::parse(b"15").unwrap(); // 15
All lexical-core
parsers are validating: they check that the entire
input data is correct, and stop parsing when they encounter invalid data,
numerical overflow, or other errors:
let r = lexical_core::parse::<u8>(b"256"); // Err(ErrorCode::Overflow.into())
let r = lexical_core::parse::<u8>(b"1a5"); // Err(ErrorCode::InvalidDigit.into())
For streaming APIs or those incrementally parsing data fed to a parser, where the input is known to start with a number but where that number ends is not yet known, the partial parsers return both the data they were able to parse and the number of bytes processed:
let r = lexical_core::parse_partial::<i8>(b"3a5"); // Ok((3, 1))
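The difference between complete and partial parsing can be illustrated with a minimal, standard-library-only sketch. It is not lexical-core's implementation: `SketchError`, `parse_partial_u8`, and `parse_complete_u8` are hypothetical names used only here, and the sketch handles unsigned decimal digits only.

```rust
/// Hypothetical error type for this sketch only; lexical-core defines its own error type.
#[derive(Debug, PartialEq)]
enum SketchError {
    /// No digits were found at the start of the input.
    Empty,
    /// A non-digit byte was found at this index.
    InvalidDigit(usize),
    /// The value no longer fits in the target type at this index.
    Overflow(usize),
}

/// Partial parsing: consume as many leading ASCII digits as possible and
/// return the value together with the number of bytes consumed.
fn parse_partial_u8(bytes: &[u8]) -> Result<(u8, usize), SketchError> {
    let mut value: u8 = 0;
    let mut count = 0;
    for &b in bytes {
        if !b.is_ascii_digit() {
            break; // stop at the first byte that is not a digit
        }
        // Checked arithmetic turns numerical overflow into an error.
        value = value
            .checked_mul(10)
            .and_then(|v| v.checked_add(b - b'0'))
            .ok_or(SketchError::Overflow(count))?;
        count += 1;
    }
    if count == 0 {
        return Err(SketchError::Empty);
    }
    Ok((value, count))
}

/// Complete parsing: the same scan, but any unconsumed trailing byte is an error.
fn parse_complete_u8(bytes: &[u8]) -> Result<u8, SketchError> {
    let (value, count) = parse_partial_u8(bytes)?;
    if count != bytes.len() {
        return Err(SketchError::InvalidDigit(count));
    }
    Ok(value)
}
```

In this sketch the complete parser is a thin wrapper over the partial one, which is why the partial form suits streaming input: the caller can resume from the returned byte count.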
§Write API
The main writing API is write
. For example, to write a number to an
existing buffer:
use lexical_core::FormattedSize;
let mut buf = [b'0'; f64::FORMATTED_SIZE];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");
If a buffer of insufficient size is provided, the writer will panic:
let mut buf = [b'0'; 1];
let digits = lexical_core::write::<i64>(15, &mut buf);
In order to guarantee the buffer is large enough, always ensure there
are at least T::FORMATTED_SIZE_DECIMAL
bytes, which requires the
FormattedSize
trait to be in scope.
use lexical_core::FormattedSize;
let mut buf = [b'0'; f64::FORMATTED_SIZE_DECIMAL];
let slc = lexical_core::write::<f64>(15.1, &mut buf);
assert_eq!(slc, b"15.1");
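To show why the buffer size matters, here is a minimal, standard-library-only sketch of an allocation-free integer writer that fills a caller-provided buffer and returns the written prefix. It is an illustration under stated assumptions, not lexical-core's algorithm: `write_u32` and `MAX_DIGITS` are hypothetical names, and only `u32` in decimal is handled.

```rust
/// Write `value` in decimal into `buf` and return the written prefix.
/// Panics if `buf` is shorter than the number of digits required.
fn write_u32(mut value: u32, buf: &mut [u8]) -> &[u8] {
    // Hypothetical constant for this sketch: u32::MAX ("4294967295") has 10 digits.
    const MAX_DIGITS: usize = 10;
    let mut tmp = [0u8; MAX_DIGITS];
    let mut start = MAX_DIGITS;
    // Generate digits from least to most significant, filling `tmp` from the end.
    loop {
        start -= 1;
        tmp[start] = b'0' + (value % 10) as u8;
        value /= 10;
        if value == 0 {
            break;
        }
    }
    let len = MAX_DIGITS - start;
    // Slice indexing panics here when `buf.len() < len`.
    buf[..len].copy_from_slice(&tmp[start..]);
    &buf[..len]
}
```

Sizing the buffer from a per-type constant (10 bytes in this sketch) means the bounds check can never fail, whatever value is written.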
§Conversion API
This writes and parses numbers to and from a format identical to
Rust’s parse
and write
.
- write: Write a number to string.
- parse: Parse a number from string, validating the complete string is a number.
- parse_partial: Parse a number from string, returning the number and the number of bytes it was able to parse.
use core::str;
use lexical_core::FormattedSize;
// parse
let f: f64 = lexical_core::parse(b"3.5").unwrap();
assert_eq!(f, 3.5);
let (f, count): (f64, usize) = lexical_core::parse_partial(b"3.5").unwrap();
assert_eq!(f, 3.5);
assert_eq!(count, 3);
// write
let mut buffer = [0u8; f64::FORMATTED_SIZE_DECIMAL];
let digits = lexical_core::write(f, &mut buffer);
assert_eq!(str::from_utf8(digits), Ok("3.5"));
§Options/Formatting API
Each number parser and writer provides extensive formatting control
through options and format specifications. These include digit separator
support (that is, numbers such as 1_2__3.4_5), whether integral,
fractional, or any significant digits are required, whether parsing or
writing of non-finite values is disabled, whether + signs are
invalid or required, and much more.
- write_with_options: Write a number to string using custom formatting options.
- parse_with_options: Parse a number from string using custom formatting options, validating the complete string is a number.
- parse_partial_with_options: Parse a number from string using custom formatting options, returning the number and the number of digits it was able to parse.
Some options, such as custom string representations of non-finite
floats (such as NaN
), are available without the
format
feature. For more comprehensive examples, see the
format
and Comprehensive Configuration sections
below.
use core::str;
use lexical_core::{format, parse_float_options, write_float_options, FormattedSize};
// parse
let f: f64 = lexical_core::parse_with_options::<_, { format::JSON }>(
b"3.5",
&parse_float_options::JSON
).unwrap();
// write
const BUFFER_SIZE: usize = write_float_options::
JSON.buffer_size_const::<f64, { format::JSON }>();
let mut buffer = [0u8; BUFFER_SIZE];
let digits = lexical_core::write_with_options::<_, { format::JSON }>(
f,
&mut buffer,
&write_float_options::JSON
);
assert_eq!(str::from_utf8(digits), Ok("3.5"));
§Features
In accordance with the Rust ethos, all features are additive: the crate
may be built with --all-features
without issue. The following features
are available, with those enabled by default marked (Default):
- write-integers (Default) - Enable writing of integers.
- write-floats (Default) - Enable writing of floats.
- parse-integers (Default) - Enable parsing of integers.
- parse-floats (Default) - Enable parsing of floats.
- power-of-two - Add support for parsing and writing power-of-two number strings.
- radix - Add support for strings of any radix.
- compact - Reduce code size at the cost of performance.
- format - Add support for custom number formatting.
- f16 - Enable support for half-precision f16 and bf16 floats.
- std (Default) - Disable to allow use in a no_std environment.
A complete description of supported features includes:
§write-integers
Enable support for writing integers to string.
use core::str;
use lexical_core::FormattedSize;
let mut buffer = [0u8; i64::FORMATTED_SIZE_DECIMAL];
let digits = lexical_core::write(1234, &mut buffer);
assert_eq!(str::from_utf8(digits), Ok("1234"));
§write-floats
Enable support for writing floating-point numbers to string.
use core::str;
use lexical_core::FormattedSize;
let mut buffer = [0u8; f64::FORMATTED_SIZE_DECIMAL];
let digits = lexical_core::write(1.234, &mut buffer);
assert_eq!(str::from_utf8(digits), Ok("1.234"));
§parse-integers
Enable support for parsing integers from string.
let f: i64 = lexical_core::parse(b"1234").unwrap();
assert_eq!(f, 1234);
§parse-floats
Enable support for parsing floating-point numbers from string.
let f: f64 = lexical_core::parse(b"1.234").unwrap();
assert_eq!(f, 1.234);
§format
Adds support for the entire format API. This allows extensive configurability for parsing and writing numbers in custom formats, with different valid syntax requirements.
§JSON
For example, in JSON, the following floats are valid or invalid:
-1 // valid
+1 // invalid
1 // valid
1. // invalid
.1 // invalid
0.1 // valid
nan // invalid
inf // invalid
Infinity // invalid
All of the finite numbers are valid in Rust, and Rust supports non-finite
floats. In order to parse standard-conforming JSON floats using
lexical-core
, you may use the following approach:
use lexical_core::{format, parse_float_options, parse_with_options, Result};
fn parse_json_float<Bytes: AsRef<[u8]>>(bytes: Bytes) -> Result<f64> {
parse_with_options::<_, { format::JSON }>(bytes.as_ref(), &parse_float_options::JSON)
}
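The valid and invalid cases listed above follow from the JSON number grammar. The sketch below is a standard-library-only validator for that grammar as defined in RFC 8259, offered as an illustration rather than as how lexical-core checks its JSON format; `is_json_number` is a hypothetical helper name.

```rust
/// Return true if `bytes` is a number under the JSON grammar (RFC 8259):
/// `-? (0 | [1-9][0-9]*) ('.' [0-9]+)? ([eE] [+-]? [0-9]+)?`
fn is_json_number(bytes: &[u8]) -> bool {
    // Consume a run of ASCII digits starting at `i`; return the index after it.
    fn digits(bytes: &[u8], mut i: usize) -> usize {
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        i
    }

    let mut i = 0;
    // Optional minus sign; a leading `+` is not allowed.
    if bytes.first() == Some(&b'-') {
        i += 1;
    }
    // Integer part: required, and no leading zeros (so `.1` and `01` fail).
    let int_end = digits(bytes, i);
    if int_end == i || (bytes[i] == b'0' && int_end - i > 1) {
        return false;
    }
    i = int_end;
    // Optional fraction: a `.` must be followed by at least one digit (so `1.` fails).
    if i < bytes.len() && bytes[i] == b'.' {
        let frac_end = digits(bytes, i + 1);
        if frac_end == i + 1 {
            return false;
        }
        i = frac_end;
    }
    // Optional exponent: `e` or `E`, an optional sign, then at least one digit.
    if i < bytes.len() && (bytes[i] == b'e' || bytes[i] == b'E') {
        let mut j = i + 1;
        if j < bytes.len() && (bytes[j] == b'+' || bytes[j] == b'-') {
            j += 1;
        }
        let exp_end = digits(bytes, j);
        if exp_end == j {
            return false;
        }
        i = exp_end;
    }
    // Anything left over (such as `nan` or `Infinity`) is invalid.
    i == bytes.len()
}
```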
Enabling the format
API significantly increases compile
times. In exchange, it enables a large amount of customization in how floats are
parsed and written.
§power-of-two
Enable numeric conversions to and from strings for radixes that are powers
of two, that is, 2, 4, 8, 16, and 32. This avoids most of the
overhead and binary bloat of the radix
feature, while enabling
support for the most commonly-used radixes.
use core::str;
use lexical_core::{
ParseFloatOptions,
WriteFloatOptions,
FormattedSize,
NumberFormatBuilder
};
// parse
const BINARY: u128 = NumberFormatBuilder::binary();
let value = "1.0011101111100111011011001000101101000011100101011";
let f: f64 = lexical_core::parse_with_options::<_, { BINARY }>(
value.as_bytes(),
&ParseFloatOptions::new()
).unwrap();
// write
let mut buffer = [0u8; f64::FORMATTED_SIZE];
let digits = lexical_core::write_with_options::<_, { BINARY }>(
f,
&mut buffer,
&WriteFloatOptions::new()
);
assert_eq!(str::from_utf8(digits), Ok(value));
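To make concrete what a string such as the binary value above denotes, here is a minimal, standard-library-only sketch that evaluates a fixed-point string in a power-of-two radix. It is not lexical-core's algorithm: `parse_pow2_float` is a hypothetical helper, it ignores signs and exponents, and its result is only exact while the digits fit in an `f64` significand (53 bits).

```rust
/// Evaluate an unsigned fixed-point string such as "10.01" in a
/// power-of-two radix (2, 4, 8, 16, or 32). Returns `None` on invalid input.
fn parse_pow2_float(text: &str, radix: u32) -> Option<f64> {
    if !matches!(radix, 2 | 4 | 8 | 16 | 32) {
        return None;
    }
    // Split into integer and fraction digits around the optional point.
    let (int_part, frac_part) = match text.split_once('.') {
        Some((i, f)) => (i, f),
        None => (text, ""),
    };
    if int_part.is_empty() && frac_part.is_empty() {
        return None;
    }
    let base = radix as f64;
    let mut value = 0.0;
    // Integer digits: multiply the running value by the radix and add each digit.
    for c in int_part.chars() {
        value = value * base + c.to_digit(radix)? as f64;
    }
    // Fraction digits: each digit is weighted by a successive negative power of the radix.
    let mut scale = 1.0 / base;
    for c in frac_part.chars() {
        value += c.to_digit(radix)? as f64 * scale;
        scale /= base;
    }
    Some(value)
}
```

Because every weight is a power of two, each digit maps to whole bits of the result, which is one reason these radixes are cheaper to support than arbitrary ones.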
§radix
Enable doing numeric conversions to and from strings for all radixes.
This requires more static storage than power-of-two
,
and increases compile times, but can be quite useful
for esoteric programming languages which use duodecimal floats, for
example.
use core::str;
use lexical_core::{
ParseFloatOptions,
WriteFloatOptions,
FormattedSize,
NumberFormatBuilder
};
// parse
const FORMAT: u128 = NumberFormatBuilder::from_radix(12);
let value = "1.29842830A44BAA2";
let f: f64 = lexical_core::parse_with_options::<_, { FORMAT }>(
value.as_bytes(),
&ParseFloatOptions::new()
).unwrap();
// write
let mut buffer = [0u8; f64::FORMATTED_SIZE];
let digits = lexical_core::write_with_options::<_, { FORMAT }>(
f,
&mut buffer,
&WriteFloatOptions::new()
);
assert_eq!(str::from_utf8(digits), Ok(value));
§compact
Reduce the generated code size at the cost of performance. This minimizes the number of static tables, inlining, and generics used, drastically reducing the size of the generated binaries.
§std
Enable use of the standard library. Currently, the standard library is not used, and may be disabled without any change in functionality on stable.
§Comprehensive Configuration
lexical-core
provides two main levels of configuration:
- The NumberFormatBuilder, creating a packed struct with custom formatting options.
- The Options API.
§Number Format
The number format class provides numerous flags to specify number parsing or
writing. When the power-of-two
feature is
enabled, additional flags are added:
- The radix for the significant digits (default 10).
- The radix for the exponent base (default 10).
- The radix for the exponent digits (default 10).
When the format
feature is enabled, numerous other syntax and
digit separator flags are enabled, including:
- A digit separator character, to group digits for increased legibility.
- Whether leading, trailing, internal, and consecutive digit separators are allowed.
- Toggling required float components, such as digits before the decimal point.
- Toggling whether special floats are allowed or are case-sensitive.
Many pre-defined constants therefore exist to simplify common use-cases, including:
- JSON, XML, TOML, YAML, SQLite, and many more.
- Rust, Python, C#, FORTRAN, COBOL literals and strings, and many more.
For a list of all supported fields, see Fields.
§Options API
The Options API provides high-level options to specify number parsing or writing, options not intrinsically tied to a number format. For example, the Options API provides:
- The exponent character (defaults to b'e' or b'^', depending on the radix).
- The decimal point character (defaults to b'.').
- Custom NaN and Infinity string representations.
- Whether to trim the fraction component from integral floats.
- The exponent break-point for scientific notation.
- The maximum and minimum number of significant digits to write.
- The rounding mode when truncating significant digits while writing.
The available options are:
- ParseFloatOptions
- ParseIntegerOptions
- WriteFloatOptions
- WriteIntegerOptions
In addition, pre-defined constants for each category of options may
be found in their respective modules, for example, JSON
.
§Examples
An example of creating your own options to parse European-style numbers (which use commas as decimal points, and periods as digit separators) is as follows:
use core::num;
// This creates a format to parse a European-style float number.
// The decimal point is a comma, and the digit separators (optional)
// are periods.
const EUROPEAN: u128 = lexical_core::NumberFormatBuilder::new()
.digit_separator(num::NonZeroU8::new(b'.'))
.build_strict();
const COMMA_OPTIONS: lexical_core::ParseFloatOptions = lexical_core::ParseFloatOptions::builder()
.decimal_point(b',')
.build_strict();
assert_eq!(
lexical_core::parse_with_options::<f32, EUROPEAN>(b"300,10", &COMMA_OPTIONS),
Ok(300.10)
);
// Another example, using a pre-defined constant for JSON.
const JSON: u128 = lexical_core::format::JSON;
const JSON_OPTIONS: lexical_core::ParseFloatOptions = lexical_core::ParseFloatOptions::new();
assert_eq!(
lexical_core::parse_with_options::<f32, JSON>(b"0e1", &JSON_OPTIONS),
Ok(0.0)
);
assert_eq!(
lexical_core::parse_with_options::<f32, JSON>(b"1E+2", &JSON_OPTIONS),
Ok(100.0)
);
§Version Support
The minimum supported Rust version is 1.63.0, which is required for
const generic support. Older versions of lexical support older Rust
versions.
§Algorithms
§Benchmarks
A comprehensive analysis of lexical commits and their performance can be found in benchmarks.
§Design
§Safety Guarantees
There is no non-trivial unsafe behavior in lexical-core itself.
However, any incorrect safety invariants in our parsers and writers
(lexical-parse-float, lexical-parse-integer,
lexical-write-float, and lexical-write-integer) could cause those
safety invariants to be broken.
Modules§
- format: The creation and processing of number format packed structs.
- parse_float_options (parse-floats): Configuration options for parsing floats.
- parse_integer_options (parse-integers): Configuration options for parsing integers.
- write_float_options (write-floats): Configuration options for writing floats.
- write_integer_options (write-integers): Configuration options for writing integers.
Structs§
- NumberFormat: Helper to access features from the packed format struct.
- NumberFormatBuilder: Validating builder for NumberFormat from the provided specifications.
- ParseFloatOptions (parse-floats): Options to customize parsing floats.
- ParseFloatOptionsBuilder (parse-floats): Builder for Options.
- ParseIntegerOptions (parse-integers): Options to customize parsing integers.
- ParseIntegerOptionsBuilder (parse-integers): Builder for Options.
- WriteFloatOptions (write-floats): Options to customize writing floats.
- WriteFloatOptionsBuilder (write-floats): Builder for Options.
- WriteIntegerOptions (write-integers): Immutable options to customize writing integers.
- WriteIntegerOptionsBuilder (write-integers): Builder for Options.
- bf16 (f16): A 16-bit floating point type implementing the bfloat16 format.
- f16 (f16): A 16-bit floating point type implementing the IEEE 754-2008 standard binary16 a.k.a “half” format.
Enums§
- Error: Error code during parsing, indicating failure type.
Constants§
- BUFFER_SIZE (write-floats or write-integers): Maximum number of bytes required to serialize any number with default options to string.
Traits§
- FormattedSize (write-floats or write-integers): The size, in bytes, of formatted values.
- FromLexical (parse-floats or parse-integers): Trait for numerical types that can be parsed from bytes.
- FromLexicalWithOptions (parse-floats or parse-integers): Trait for numerical types that can be parsed from bytes with custom options.
- ParseOptions (parse-floats or parse-integers): Shared trait for all parser options.
- ToLexical (write-floats or write-integers): Trait for numerical types that can be serialized to bytes.
- ToLexicalWithOptions (write-floats or write-integers): Trait for numerical types that can be serialized to bytes with custom options.
- WriteOptions (write-floats or write-integers): Shared trait for all writer options.
Functions§
- format_error: Get the error type from the format packed struct.
- format_is_valid: Determine if the format packed struct is valid.
- parse (parse-floats or parse-integers): Parse complete number from string.
- parse_partial (parse-floats or parse-integers): Parse partial number from string.
- parse_partial_with_options (parse-floats or parse-integers): Parse partial number from string with custom parsing options.
- parse_with_options (parse-floats or parse-integers): Parse complete number from string with custom parsing options.
- write (write-floats or write-integers): Write number to string.
- write_with_options (write-floats or write-integers): Write number to string with custom options.