idx_parser 0.3.0

Parse IDX files such as the ones used in MNIST database files.
Repository: j-mcavoy/idx_parser
Owner: j-mcavoy

IDX Parser

IDX data file parser written in Rust.

THE IDX FILE FORMAT

The IDX file format is a simple format for vectors and multidimensional matrices of various numerical types.

The basic format is:

magic number
size in dimension 0
size in dimension 1
size in dimension 2
.....
size in dimension N
data

The magic number is a 4-byte integer (MSB first). Its first 2 bytes are always 0.

The third byte codes the type of the data:

0x08: unsigned byte
0x09: signed byte
0x0B: short (2 bytes)
0x0C: int (4 bytes)
0x0D: float (4 bytes)
0x0E: double (8 bytes)
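The table above maps naturally onto a Rust enum. The following is a minimal sketch using only the standard library; the names `IdxType`, `from_code`, and `size_in_bytes` are made up for illustration and are not necessarily the types this crate exposes.

```rust
/// Element types an IDX file can hold, keyed by the third magic byte.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IdxType {
    U8,
    I8,
    I16,
    I32,
    F32,
    F64,
}

impl IdxType {
    /// Maps the type code byte to an element type.
    /// Returns `None` for codes the format does not define (e.g. 0x0A).
    pub fn from_code(code: u8) -> Option<Self> {
        match code {
            0x08 => Some(Self::U8),
            0x09 => Some(Self::I8),
            0x0B => Some(Self::I16),
            0x0C => Some(Self::I32),
            0x0D => Some(Self::F32),
            0x0E => Some(Self::F64),
            _ => None,
        }
    }

    /// Width of one element in bytes, as listed in the table.
    pub fn size_in_bytes(self) -> usize {
        match self {
            Self::U8 | Self::I8 => 1,
            Self::I16 => 2,
            Self::I32 | Self::F32 => 4,
            Self::F64 => 8,
        }
    }
}
```

Returning `Option` rather than panicking lets a caller report an unknown type code as a malformed-file error.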

The fourth byte codes the number of dimensions of the vector/matrix: 1 for vectors, 2 for matrices, and so on.
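Putting the four magic bytes together, a reader could split them as sketched below. `parse_magic` is a hypothetical helper written for this example, assuming the caller has already read the first four bytes of the file; it is not claimed to match this crate's API.

```rust
/// Splits the 4-byte magic number into (type code, number of dimensions).
/// Hypothetical helper for illustration only.
pub fn parse_magic(magic: [u8; 4]) -> Result<(u8, u8), String> {
    let [b0, b1, type_code, ndim] = magic;
    // The format requires the first two bytes to be zero.
    if b0 != 0 || b1 != 0 {
        return Err(format!(
            "first two magic bytes must be 0, got {:#04x} {:#04x}",
            b0, b1
        ));
    }
    Ok((type_code, ndim))
}
```

For instance, the bytes `00 00 08 03` would yield type code `0x08` (unsigned byte) and 3 dimensions.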

The sizes in each dimension are 4-byte integers (MSB first, i.e. big-endian, like in most non-Intel processors).
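Because the sizes are big-endian, they can be decoded with `u32::from_be_bytes` regardless of the host's byte order. The sketch below assumes `bytes` starts immediately after the magic number; `read_dims` is an illustrative name, not a function taken from this crate.

```rust
/// Decodes `ndim` big-endian 4-byte dimension sizes from the start of `bytes`.
/// Returns `None` if the slice is too short. Hypothetical helper.
pub fn read_dims(bytes: &[u8], ndim: usize) -> Option<Vec<u32>> {
    let needed = ndim.checked_mul(4)?;
    let header = bytes.get(..needed)?;
    Some(
        header
            .chunks_exact(4)
            // MSB first: from_be_bytes handles the byte order explicitly.
            .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}
```

As an example, the bytes `00 00 EA 60` decode to 60000 and `00 00 00 1C` decode to 28.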

The data is stored like in a C array, i.e. the index in the last dimension changes the fastest.

(taken from http://yann.lecun.com/exdb/mnist/ )
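The C-array (row-major) layout described above means a multidimensional index can be turned into an offset into the flat data section. The following is a minimal sketch of that calculation; `flat_index` is a hypothetical helper, and the offset it returns counts elements, not bytes.

```rust
/// Converts a multidimensional index into a flat element offset,
/// with the last dimension varying fastest (row-major order).
/// Returns `None` on a rank mismatch or an out-of-range index.
/// Hypothetical helper for illustration only.
pub fn flat_index(dims: &[usize], index: &[usize]) -> Option<usize> {
    if dims.len() != index.len() {
        return None;
    }
    let mut flat = 0usize;
    for (&size, &i) in dims.iter().zip(index) {
        if i >= size {
            return None;
        }
        // Horner-style accumulation: each step shifts by the current dimension size.
        flat = flat * size + i;
    }
    Some(flat)
}
```

With dimensions `[2, 3, 4]`, the index `[1, 2, 3]` maps to offset 23, the last element. Multiplying the offset by the element width gives the byte position within the data section.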