orc-format 0.3.0

Unofficial implementation of Apache ORC spec in safe Rust

Read Apache ORC from Rust


Read Apache ORC in Rust.

This repository is similar to parquet2 and Avro-schema, providing a toolkit to:

  • Read ORC files (proto structures)
  • Read stripes (the conversion from proto metadata to memory regions)
  • Decode stripes (the math of decoding stripes into e.g. booleans, runs of RLE, etc.)
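
To give a feel for what "the math of decoding" involves, here is a minimal, self-contained sketch of ORC's byte run-length encoding and the boolean bit-unpacking built on top of it, as described in the ORC specification. It is not this crate's API: the function names `decode_byte_rle` and `decode_booleans` are hypothetical, and the sketch assumes the input has already been decompressed.

```rust
/// Decode ORC byte run-length encoding (hypothetical helper, not part of this crate).
///
/// Each chunk starts with a signed header byte:
///   * 0..=127  -> a run: the next byte is repeated `header + 3` times
///   * -128..=-1 -> literals: the next `-header` bytes are copied as-is
fn decode_byte_rle(mut input: &[u8]) -> Result<Vec<u8>, String> {
    let mut out = Vec::new();
    while let Some((&header, rest)) = input.split_first() {
        let header = header as i8;
        if header >= 0 {
            // Run of 3 to 130 copies of a single value.
            let (&value, rest) = rest.split_first().ok_or("truncated run")?;
            let len = header as usize + 3;
            out.extend(std::iter::repeat(value).take(len));
            input = rest;
        } else {
            // 1 to 128 literal bytes.
            let len = header.unsigned_abs() as usize;
            if rest.len() < len {
                return Err("truncated literals".to_string());
            }
            let (literals, rest) = rest.split_at(len);
            out.extend_from_slice(literals);
            input = rest;
        }
    }
    Ok(out)
}

/// Decode `len` booleans (hypothetical helper, not part of this crate).
///
/// Booleans are bit-packed, most significant bit first, and the resulting
/// bytes are stored with byte run-length encoding.
fn decode_booleans(input: &[u8], len: usize) -> Result<Vec<bool>, String> {
    let bytes = decode_byte_rle(input)?;
    if bytes.len() * 8 < len {
        return Err("not enough bits for the requested length".to_string());
    }
    Ok((0..len)
        .map(|i| (bytes[i / 8] & (0x80u8 >> (i % 8))) != 0)
        .collect())
}
```

With these helpers, the two bytes `[0x61, 0x00]` expand to one hundred zero bytes, and `[0xff, 0x80]` read as eight booleans gives one `true` followed by seven `false` values; both cases are taken from the examples in the ORC specification.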

It currently reads the following (logical) types:

  • booleans
  • strings
  • integers
  • floats

What is not yet implemented:

  • Snappy, LZO decompression
  • RLE v2 Patched Base decoding
  • RLE v1 decoding
  • Utility functions to decode non-native logical types:
    • decimal
    • timestamp
    • struct
    • list
    • union

Run tests

python3 -m venv venv
venv/bin/pip install -U pip
venv/bin/pip install -U pyorc
venv/bin/python write.py
cargo test