§TACO (Trajectory and Compressed Observables) Format
TACO is a custom binary format tailored for molecular dynamics (MD) data that:
- Uses delta encoding for positions, velocities, and forces
- Leverages temporal correlation for inter-frame compression
- Stores data in tensors with optional lossy or lossless compression
- Embeds metadata and simulation parameters using an internal schema
- Supports random frame access and chunked reading
§File Structure
[Header]
- Format version
- Simulation parameters (time step, temperature, etc.)
- Atom metadata (masses, names, etc.)
- Compression settings
[Frame Index Table]
- Byte offsets to each frame for random access
[Data Blocks]
- Chunked frames:
  - ΔPosition tensors (N×3)
  - ΔVelocity tensors (N×3)
  - ΔForce tensors (N×3)
- Box dimensions (if needed)
§Key Features
§Delta Encoding
TACO stores differences between consecutive frames to reduce data size.
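The idea can be illustrated with a minimal, self-contained sketch. It is not the crate's implementation: `delta_encode` and `delta_decode` are hypothetical helper names, and each frame is assumed to be a flat `Vec<f32>` of coordinates. The first frame is kept as-is and every later frame is stored as its difference from the previous one.

```rust
/// Hypothetical helper (not part of the taco_format API): keep the first
/// frame unchanged and store each later frame as a difference from the
/// frame before it.
fn delta_encode(frames: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let mut out = Vec::with_capacity(frames.len());
    for (i, frame) in frames.iter().enumerate() {
        if i == 0 {
            out.push(frame.clone());
        } else {
            let prev = &frames[i - 1];
            let delta: Vec<f32> = frame.iter().zip(prev).map(|(c, p)| c - p).collect();
            out.push(delta);
        }
    }
    out
}

/// Hypothetical inverse: rebuild absolute frames by accumulating deltas.
fn delta_decode(deltas: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let mut out: Vec<Vec<f32>> = Vec::with_capacity(deltas.len());
    for (i, delta) in deltas.iter().enumerate() {
        if i == 0 {
            out.push(delta.clone());
        } else {
            let prev = &out[i - 1];
            let frame: Vec<f32> = delta.iter().zip(prev).map(|(d, p)| d + p).collect();
            out.push(frame);
        }
    }
    out
}
```

Because atoms move little between consecutive frames, the deltas tend to be small numbers that compress well. Note that subtracting and re-adding floating-point values is not exact in general, so a real encoder would likely compute deltas on quantized integers when an exact round trip is required.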
§Tensor Storage
All data blocks are stored as tensors, enabling SIMD-friendly operations and direct use in GPU/ML pipelines.
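As a rough picture of what an N×3 tensor can look like in memory, the sketch below stores coordinates in one contiguous, row-major buffer. `CoordTensor` and its methods are hypothetical names chosen for illustration; the crate's actual tensor type may differ.

```rust
/// Hypothetical N×3 tensor (not the crate's type): coordinates live in a
/// single contiguous buffer laid out as x0, y0, z0, x1, y1, z1, ...
struct CoordTensor {
    n_atoms: usize,
    data: Vec<f32>,
}

impl CoordTensor {
    /// Flatten a slice of [x, y, z] rows into the contiguous buffer.
    fn from_rows(rows: &[[f32; 3]]) -> Self {
        let data = rows.iter().flatten().copied().collect();
        CoordTensor { n_atoms: rows.len(), data }
    }

    /// Number of atoms (rows) in the tensor.
    fn len(&self) -> usize {
        self.n_atoms
    }

    /// Return the [x, y, z] triple for atom `i`, or None if out of range.
    fn atom(&self, i: usize) -> Option<[f32; 3]> {
        let s = self.data.get(3 * i..3 * i + 3)?;
        Some([s[0], s[1], s[2]])
    }
}
```

A flat buffer like this can be handed to vectorized loops or copied to a GPU in one transfer, which is the usual motivation for tensor-style storage.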
§Hybrid Compression
TACO supports lossless compression for forces and energies, and lossy compression with configurable precision for positions and velocities.
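One common way to get lossy compression with configurable precision is fixed-step quantization, sketched below. This is an assumption for illustration only: the source does not say which scheme the crate's `PrecisionMode` uses, and `quantize` and `dequantize` are hypothetical names. Each value is rounded to the nearest multiple of `step`, so the reconstruction error is at most `step / 2`.

```rust
/// Hypothetical lossy step: map each value to the nearest integer
/// multiple of `step`. Smaller steps keep more precision.
fn quantize(values: &[f32], step: f32) -> Vec<i32> {
    values.iter().map(|v| (v / step).round() as i32).collect()
}

/// Hypothetical inverse: recover approximate values from the integers.
fn dequantize(quantized: &[i32], step: f32) -> Vec<f32> {
    quantized.iter().map(|&n| n as f32 * step).collect()
}
```

For example, with `step = 0.25` the inputs `[1.0, 1.1, -0.6]` quantize to `[4, 4, -2]` and reconstruct as `[1.0, 1.0, -0.5]`. The resulting integers are small and repetitive, which suits a lossless back-end compressor.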
§Smart Chunking
Each chunk contains a configurable number of frames with a mini index and compressed blocks of positions, velocities, and forces.
Core library for the TACO format, providing APIs to read, write, and manipulate molecular dynamics trajectory data.
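To make the chunk layout described under Smart Chunking more concrete, the sketch below shows how a frame number could be mapped to a chunk and a position inside it, and how a byte offset could be read from a frame index table. Both functions are hypothetical and assume fixed-size chunks and one `u64` offset per frame; the on-disk layout used by the crate may differ.

```rust
/// Hypothetical lookup: given a global frame number and the number of
/// frames per chunk, return (chunk number, position within that chunk).
/// Returns None for a chunk size of zero.
fn locate_frame(frame: usize, frames_per_chunk: usize) -> Option<(usize, usize)> {
    if frames_per_chunk == 0 {
        return None;
    }
    Some((frame / frames_per_chunk, frame % frames_per_chunk))
}

/// Hypothetical random-access helper: look up the byte offset of a frame
/// in an in-memory copy of the frame index table.
fn frame_offset(index: &[u64], frame: usize) -> Option<u64> {
    index.get(frame).copied()
}
```

With 10 frames per chunk, frame 25 falls in chunk 2 at position 5, so a reader would only need to decompress that one chunk to serve the request.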
Modules§
- cli
- CLI functionality shared between the binary and Python interface
- python
- Python interface for taco_format using PyO3.
Structs§
- AsyncReader
- Async wrapper around the sync Reader
- AsyncWriter
- Async wrapper around the sync Writer
- AtomMetadata
- Metadata about atoms in the simulation
- CompressionSettings
- Settings for compression of trajectory data
- Frame
- A single frame in a molecular dynamics trajectory
- FrameData
- Coordinate data (positions, velocities, forces) for a frame
- Header
- File format header containing metadata about the trajectory
- Reader
- Reader for TACO trajectory files
- SimulationMetadata
- Metadata about the molecular dynamics simulation
- Writer
- Writer for TACO trajectory files
Enums§
- Error
- Custom error types for TACO operations
- ExtraArray
- Enum for storing extra arrays of different dimensions
- PrecisionMode
- Precision modes for compression
Constants§
- DEFAULT_CHUNK_SIZE
- Default chunk size (number of frames per chunk)
- MAGIC
- Magic number used to identify TACO format files
- VERSION
- Version number of the TACO format implementation
Functions§
- compress_tensor
- Compress data using the specified precision mode
- decompress_tensor
- Decompress data
Type Aliases§
- Result
- Result type for TACO operations