Crate taco_format

§TACO (Trajectory and Compressed Observables) Format

TACO is a custom binary format tailored for molecular dynamics (MD) data that:

  • Uses delta encoding for positions, velocities, and forces
  • Leverages temporal correlation for inter-frame compression
  • Stores data in tensors with optional lossy or lossless compression
  • Embeds metadata and simulation parameters using an internal schema
  • Supports random frame access and chunked reading

§File Structure

[Header]
- Format version
- Simulation parameters (time step, temperature, etc.)
- Atom metadata (masses, names, etc.)
- Compression settings

[Frame Index Table]
- Byte offsets to each frame for random access

[Data Blocks]
- Chunked frames:
  - ΔPosition tensors (Nx3)
  - ΔVelocity tensors (Nx3)
  - ΔForce tensors (Nx3)
  - Box dimensions (if needed)
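
To illustrate how a frame index table of byte offsets can support random access, here is a minimal sketch using only the standard library. It does not use the crate's `Reader`; the function name `read_frame_bytes`, the `offsets` slice, and the `data_end` parameter are hypothetical, and the actual on-disk layout may differ.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Read the raw bytes of one frame, given a table of byte offsets.
///
/// Hypothetical helper: `offsets[i]` is assumed to be the start of frame `i`,
/// and `data_end` the byte position just past the last frame.
fn read_frame_bytes<R: Read + Seek>(
    reader: &mut R,
    offsets: &[u64],
    data_end: u64,
    frame: usize,
) -> io::Result<Vec<u8>> {
    let start = *offsets.get(frame).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "frame index out of range")
    })?;
    // A frame ends where the next one starts, or at the end of the data.
    let end = offsets.get(frame + 1).copied().unwrap_or(data_end);
    let len = end.checked_sub(start).ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidData, "offsets are not increasing")
    })?;

    let mut buf = vec![0u8; len as usize];
    // Jump straight to the frame without reading the ones before it.
    reader.seek(SeekFrom::Start(start))?;
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

Any `Read + Seek` source works here, for example a `std::fs::File` or an in-memory `std::io::Cursor`.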

§Key Features

§Delta Encoding

TACO stores differences between consecutive frames to reduce data size.
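
As a rough illustration of the idea, the sketch below delta-encodes a sequence of frames and decodes them again. It is not the crate's implementation: the function names are hypothetical, and it assumes coordinates have already been converted to integers so that the round trip is exact.

```rust
/// Keep the first frame as-is and replace every later frame by its
/// element-wise difference from the previous frame.
/// Assumes all frames have the same length (hypothetical helper).
fn delta_encode(frames: &[Vec<i32>]) -> Vec<Vec<i32>> {
    let mut out = Vec::with_capacity(frames.len());
    for (i, frame) in frames.iter().enumerate() {
        if i == 0 {
            out.push(frame.clone());
        } else {
            let prev = &frames[i - 1];
            out.push(frame.iter().zip(prev).map(|(c, p)| c - p).collect());
        }
    }
    out
}

/// Rebuild the original frames by accumulating the differences.
fn delta_decode(deltas: &[Vec<i32>]) -> Vec<Vec<i32>> {
    let mut out: Vec<Vec<i32>> = Vec::with_capacity(deltas.len());
    for delta in deltas {
        let frame = match out.last() {
            None => delta.clone(), // first entry is a full frame
            Some(prev) => delta.iter().zip(prev).map(|(d, p)| p + d).collect(),
        };
        out.push(frame);
    }
    out
}
```

When atoms move only slightly between frames, the differences are small numbers clustered around zero, which a general-purpose compressor can typically pack more tightly than the absolute values.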

§Tensor Storage

All data blocks are stored as tensors, enabling SIMD-friendly operations and direct use in GPU/ML pipelines.
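
One way to picture an Nx3 tensor is a single contiguous buffer with three values per atom. The sketch below assumes a row-major layout and uses a hypothetical `PositionTensor` type; it is not one of the crate's types, and the crate's actual memory layout is not specified here.

```rust
/// An N x 3 tensor stored as one flat, contiguous buffer.
/// Hypothetical type; row-major layout is an assumption.
struct PositionTensor {
    n_atoms: usize,
    data: Vec<f32>, // [x0, y0, z0, x1, y1, z1, ...]
}

impl PositionTensor {
    /// Flatten per-atom [x, y, z] rows into one buffer.
    fn from_rows(rows: &[[f32; 3]]) -> Self {
        let data: Vec<f32> = rows.iter().flatten().copied().collect();
        Self { n_atoms: rows.len(), data }
    }

    /// Borrow the three coordinates of atom `i`, or `None` if out of range.
    fn atom(&self, i: usize) -> Option<&[f32]> {
        self.data.get(i * 3..i * 3 + 3)
    }
}
```

A flat buffer like this can be handed to vectorized loops or copied to a device in one transfer, without chasing per-atom allocations.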

§Hybrid Compression

TACO supports lossless compression for forces and energies, and lossy compression with configurable precision for positions and velocities.
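
A common way to make precision configurable is fixed-step quantization, sketched below. This is an assumption about how lossy compression could work, not the behavior of `compress_tensor` or `PrecisionMode`; the function names and the `precision` parameter are hypothetical.

```rust
/// Round each value to the nearest multiple of `precision` and store
/// the multiple as an integer. `precision` must be positive.
fn quantize(values: &[f32], precision: f32) -> Vec<i32> {
    values
        .iter()
        .map(|&v| (v / precision).round() as i32)
        .collect()
}

/// Map the stored integers back to floating-point values.
fn dequantize(quantized: &[i32], precision: f32) -> Vec<f32> {
    quantized.iter().map(|&q| q as f32 * precision).collect()
}
```

With this scheme each reconstructed value differs from the original by at most half of `precision`, so a smaller step gives higher fidelity at the cost of larger integers to store.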

§Smart Chunking

Each chunk contains a configurable number of frames with a mini index and compressed blocks of positions, velocities, and forces.

Core library for the TACO format, providing APIs to read, write, and manipulate molecular dynamics trajectory data.
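
For the chunk layout described under Smart Chunking, locating a frame reduces to integer arithmetic once the number of frames per chunk is known. The sketch below uses hypothetical helper names and assumes every chunk except possibly the last holds exactly `chunk_size` frames; it does not read the crate's `DEFAULT_CHUNK_SIZE` constant.

```rust
/// Return (chunk index, position within that chunk) for a global frame index.
/// Hypothetical helper; assumes fixed-size chunks.
fn locate_frame(frame: usize, chunk_size: usize) -> (usize, usize) {
    assert!(chunk_size > 0, "chunk_size must be positive");
    (frame / chunk_size, frame % chunk_size)
}

/// Number of chunks needed to hold `num_frames` frames (last chunk may be partial).
fn chunk_count(num_frames: usize, chunk_size: usize) -> usize {
    assert!(chunk_size > 0, "chunk_size must be positive");
    (num_frames + chunk_size - 1) / chunk_size
}
```

A reader built this way would consult the chunk's mini index for the in-chunk position rather than scanning the whole file.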

Modules§

cli
CLI functionality shared between the binary and Python interface
python
Python interface for taco_format using PyO3.

Structs§

AsyncReader
Async wrapper around the sync Reader
AsyncWriter
Async wrapper around the sync Writer
AtomMetadata
Metadata about atoms in the simulation
CompressionSettings
Settings for compression of trajectory data
Frame
A single frame in a molecular dynamics trajectory
FrameData
Coordinate data (positions, velocities, forces) for a frame
Header
File format header containing metadata about the trajectory
Reader
Reader for TACO trajectory files
SimulationMetadata
Metadata about the molecular dynamics simulation
Writer
Writer for TACO trajectory files

Enums§

Error
Custom error types for TACO operations
ExtraArray
Enum for storing extra arrays of different dimensions
PrecisionMode
Precision modes for compression

Constants§

DEFAULT_CHUNK_SIZE
Default chunk size (number of frames per chunk)
MAGIC
Magic number used to identify TACO format files
VERSION
Version number of the TACO format implementation

Functions§

compress_tensor
Compress data using the specified precision mode
decompress_tensor
Decompress data

Type Aliases§

Result
Result type for TACO operations