//! Parse and work with Fortran format specifications.
//!
//! # Crate structure
//!
//! Most users should focus on the following four modules:
//!
//! - [`format_specs`]: handles parsing the format strings and provides
//! types to represent each one in Rust. If you just need to get information
//! about a format string, try this module.
//! - [`ser`], [`de`]: handle serializing and deserializing data in Fortran
//! fixed format using the [`serde`] crate. Requires the `serde` feature
//! to be enabled.
//! - [`dataframes`]: an extension of the `serde` functionality that allows
//! deserializing multiple rows of data into a [`polars`] DataFrame. Requires
//! the `dataframes` feature to be enabled.
//!
//! Many of the public functions are also available in the crate root.
//!
//! # What are format specifications?
//!
//! Fortran programs can write and read data using fixed format.
//! In fixed format, each piece of data is written to a text or binary
//! file with a specific number of bytes. When reading or writing
//! text, the formatting for each value is given by a format string,
//! which will look something like:
//!
//! ```text
//! (a12,i5,f12.4,e11.5)
//! ```
//!
//! The above string can be interpreted as:
//!
//! - a string with 12 characters (`a12`),
//! - an integer that has 5 characters, including a negative sign if needed (`i5`),
//! - a float with 12 characters (including a decimal point and possibly a negative
//! sign), and 4 digits after the decimal point, and
//! - a float written in engineering/scientific notation (e.g. `6.022E+23`) using 11
//! characters and with 5 digits after the decimal place.
//!
//! What makes such data tricky to read in more modern programming languages is
//! that these fields can and do abut. The following is a perfectly valid string
//! with the above format:
//!
//! ```text
//! Hello,world!123459999999.99996.02214E+23
//! ```
//!
//! The four values are actually `Hello,world!`, `12345`, `9999999.9999`, and `6.02214E+23`,
//! but without knowing the format string, it can be difficult to separate out the
//! values with no delimiting characters.
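//!
//! As a rough illustration of why the format string matters, the sketch below slices a
//! line into fields using only the widths taken from the format string. The helper
//! `split_fixed_width` is hypothetical and not part of this crate's API; it assumes
//! the field widths are already known and that the input is ASCII.
//!
//! ```rust
//! /// Hypothetical helper: cut `line` into consecutive fields of the given byte widths.
//! /// Returns `None` if the line is too short or a cut lands inside a multi-byte character.
//! fn split_fixed_width<'a>(line: &'a str, widths: &[usize]) -> Option<Vec<&'a str>> {
//!     let mut start = 0;
//!     let mut fields = Vec::with_capacity(widths.len());
//!     for &width in widths {
//!         let end = start + width;
//!         // `str::get` returns `None` instead of panicking on a bad range.
//!         fields.push(line.get(start..end)?);
//!         start = end;
//!     }
//!     Some(fields)
//! }
//! ```
//!
//! With the widths 12, 5, 12, and 11 from `(a12,i5,f12.4,e11.5)`, the example line above
//! splits into the four values listed.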
//!
//! # A brief overview of format specs
//!
//! Because Fortran is such an old language, it can be difficult to find information about its format
//! syntax. Here is a short summary of the syntax as implemented in this crate (and tested against
//! gfortran-generated output).
//!
//! ## Basic syntax
//!
//! A format string starts with a `(`, followed by one or more fields separated by
//! commas, and ends with a `)`. (Alternate formats accepted by Fortran are not yet implemented.)
//!
//! A field consists of at least a letter indicating the type to be written and how it is to be formatted.
//! It may be preceded by a number indicating how many times to repeat it, and followed by one or more
//! numbers (usually with a decimal point separator) that indicate its width and precision. A field
//! may also start with a modifier, which affects all following fields until another modifier of the
//! same type is given.
//!
//! ## Field types
//!
//! - `a` = string/character type. `a` by itself means a single character. Strings are indicated
//! by `aN`, e.g. `a5`, `a12`, or `a128`. The number gives the maximum number of characters in
//! the string.
//! - `i`, `o`, and `z` = integer type. Must have a width, that is, `i5` is valid but `i` alone is not.
//! If a second number is given, as in `i5.3`, the second number indicates the minimum number of digits
//! to write. If the integer has fewer digits than this, it is zero-padded. For example, `42` formatted as
//! `i5.3` would be written as "  042": 5 total width, 3 digits required. Fortran does not distinguish between
//! signed and unsigned integers. A negative sign takes up one of the available characters given by the
//! width. `o` and `z` write the integer in octal and hexadecimal, respectively.
//! - `f`, `e`, `d`, and `g` = real/float type. Must have both a width and precision, i.e. `f8.3` is
//! valid, but neither `f8` nor `f` is. In these types, the number after the decimal point (3 in `f8.3`)
//! indicates the number of digits written after the decimal place. `f` will always write out numbers
//! normally while `e` will use scientific/engineering notation (e.g. `6.022E+23`). `d` is similar to
//! `e`, but is intended for 64-bit floats, and represents this with a `D` in place of the `E`: `6.022D+23`.
//! `g` will choose the format based on the magnitude of the value.
//! - `x` = a *positional* specifier. This does not correspond to a value; it merely "positions" the next
//! value. `x` represents a single space, and is the most common positional specifier.
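//!
//! To make the `iW.M` rule concrete, here is a minimal sketch of how such a field could be
//! rendered. `format_integer` is a hypothetical helper, not this crate's implementation. It
//! fills the field with asterisks when the value does not fit, which is how Fortran signals
//! an overflowing field, and it does not cover every edge case (for example, a minimum digit
//! count of zero).
//!
//! ```rust
//! /// Hypothetical helper: render `value` as a Fortran `iW.M` field, where
//! /// `width` is W and `min_digits` is M.
//! fn format_integer(value: i64, width: usize, min_digits: usize) -> String {
//!     // Zero-pad the magnitude to at least `min_digits` digits.
//!     let digits = format!("{:0min$}", value.unsigned_abs(), min = min_digits);
//!     // The negative sign uses up one character of the field width.
//!     let body = if value < 0 { format!("-{}", digits) } else { digits };
//!     if body.len() > width {
//!         // The value does not fit, so fill the whole field with asterisks.
//!         "*".repeat(width)
//!     } else {
//!         // Right-align within the field, padding with spaces on the left.
//!         format!("{:>w$}", body, w = width)
//!     }
//! }
//! ```
//!
//! Calling `format_integer(42, 5, 3)` reproduces the `i5.3` example from the list above.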
//!
//! ## Repeats and subgroups
//!
//! A given specifier may be repeated by prefixing it with a number. For example, the string `(4i5.3)`
//! indicates that four 5-character-wide integers will be written.
//!
//! Multiple specifiers may be repeated by grouping them with parentheses. For example, the string
//! `(a12,3(i5,e13.5))` means one 12-character string will be written, followed by 3 sets consisting of
//! a 5-character integer and a 13-character float. Fully expanded, this will be:
//!
//! ```text
//! (a12,i5,e13.5,i5,e13.5,i5,e13.5)
//! ```
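//!
//! The expansion can be sketched as a small recursive routine. The functions below are
//! hypothetical and are not how this crate parses format strings (it uses a `pest` grammar);
//! they also assume there are no `p` modifiers, since a leading number such as the `2` in
//! `2pf7.3` would be mistaken for a repeat count.
//!
//! ```rust
//! /// Hypothetical helper: expand repeat counts and groups into a flat field list.
//! fn expand_repeats(fmt: &str) -> Option<Vec<String>> {
//!     let inner = fmt.trim().strip_prefix('(')?.strip_suffix(')')?;
//!     expand_list(inner)
//! }
//!
//! /// Expand a comma-separated list of fields and groups.
//! fn expand_list(list: &str) -> Option<Vec<String>> {
//!     let mut out = Vec::new();
//!     for item in split_top_level(list)? {
//!         // Leading ASCII digits, if any, are the repeat count.
//!         let digits = item.chars().take_while(|c| c.is_ascii_digit()).count();
//!         let (count, rest) = item.split_at(digits);
//!         let n: usize = if count.is_empty() { 1 } else { count.parse().ok()? };
//!         let expanded = match rest.strip_prefix('(') {
//!             // A parenthesized group is expanded recursively.
//!             Some(group) => expand_list(group.strip_suffix(')')?)?,
//!             None if rest.is_empty() => return None,
//!             None => vec![rest.to_string()],
//!         };
//!         for _ in 0..n {
//!             out.extend(expanded.iter().cloned());
//!         }
//!     }
//!     Some(out)
//! }
//!
//! /// Split on commas that are not nested inside parentheses.
//! fn split_top_level(list: &str) -> Option<Vec<&str>> {
//!     let mut items = Vec::new();
//!     let (mut depth, mut start) = (0usize, 0usize);
//!     for (i, c) in list.char_indices() {
//!         match c {
//!             '(' => depth += 1,
//!             ')' => depth = depth.checked_sub(1)?,
//!             ',' if depth == 0 => {
//!                 items.push(list[start..i].trim());
//!                 start = i + 1;
//!             }
//!             _ => {}
//!         }
//!     }
//!     if depth != 0 {
//!         return None;
//!     }
//!     items.push(list[start..].trim());
//!     Some(items)
//! }
//! ```
//!
//! Under these assumptions, `expand_repeats("(a12,3(i5,e13.5))")` yields the seven fields
//! of the fully expanded string shown above.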
//!
//! ## Modifiers
//!
//! - `p` = scale a real/float number before writing it. This is always written as `Np`, where `N` is
//!   a positive or negative integer. This has slightly different effects for `f` versus `e`/`d` formats.
//!   - For `f` formats, the number is multiplied by 10^N, so given a format `2pf7.3`, the
//!     number `3.14` would be written as `314.000`. Conversely, `-2pf7.3` would write it out as `  0.031`.
//!     In both cases, the number of digits after the decimal is unchanged at 3.
//!   - For `e` or `d` formats, the decimal place shifts when N > 1, so `2pe9.3` would print 3.14 as `31.40E-01`.
//!     For N < 0, the digits instead shift to the right behind leading zeros, so `-2pe9.3` would print 3.14
//!     as `0.003E+03`. N = 1 is a bit of a special case, in that it shifts the digits to write the value as
//!     e.g. `3.140E+00` instead of the default `0.314E+01`.
//!   - Note that `f` types actually have their value changed, while `e` and `d` types print the same value, just
//!     with different exponents.
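//!
//! The effect of a scale factor on an `f` field can be sketched in a few lines. The function
//! `format_f_scaled` is hypothetical and only illustrates the arithmetic described above; it
//! does not handle overflowing fields or the `e`/`d` behaviour.
//!
//! ```rust
//! /// Hypothetical helper: render `value` as `NpfW.D`, where `scale` is N,
//! /// `width` is W, and `precision` is D.
//! fn format_f_scaled(value: f64, scale: i32, width: usize, precision: usize) -> String {
//!     // For `f` fields the written value really is multiplied by 10^N.
//!     let scaled = value * 10f64.powi(scale);
//!     // Right-align in the field with a fixed number of digits after the decimal point.
//!     format!("{:>w$.p$}", scaled, w = width, p = precision)
//! }
//! ```
//!
//! Here `format_f_scaled(3.14, 2, 7, 3)` corresponds to writing `3.14` with `2pf7.3`, and
//! a scale of `-2` corresponds to `-2pf7.3`.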
//!
//! Modifiers don't just affect the field they are attached to, but affect all relevant fields later in the format
//! string (until the next instance of the same modifier). So in a string like:
//!
//! ```text
//! (f5.3,1pe12.5,e12.5,f6.3,-2pf7.5,f7.5)
//! ```
//!
//! - the first `f5.3` is unaffected,
//! - the next `e12.5,e12.5,f6.3` are all affected by the `1p` modifier, and
//! - the final `f7.5,f7.5` are both affected by the `-2p` modifier.
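//!
//! This "sticky" behaviour can be sketched by carrying the current scale factor along while
//! walking the fields. `active_scales` is a hypothetical helper, not part of this crate; it
//! assumes the format string has already been split into individual fields and that `p` only
//! ever appears as a scale modifier.
//!
//! ```rust
//! /// Hypothetical helper: pair each field with the scale factor in effect for it,
//! /// stripping any leading `Np` modifier from the field itself.
//! fn active_scales(fields: &[&str]) -> Vec<(i32, String)> {
//!     // No scaling applies until the first `p` modifier is seen.
//!     let mut scale = 0;
//!     fields
//!         .iter()
//!         .map(|&field| {
//!             let spec = match field.split_once('p') {
//!                 Some((n, rest)) => match n.parse::<i32>() {
//!                     // A new modifier replaces the previous scale factor.
//!                     Ok(n) => {
//!                         scale = n;
//!                         rest
//!                     }
//!                     Err(_) => field,
//!                 },
//!                 None => field,
//!             };
//!             (scale, spec.to_string())
//!         })
//!         .collect()
//! }
//! ```
//!
//! The scale is reported for every field, but it only matters for the real/float types;
//! integer and string fields ignore it.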
extern crate pest;
extern crate pest_derive;
pub
pub use ;
pub use FError;
pub use ;
pub use ;
pub use ;
pub use read_to_dataframe;