onecode-rs
Rust bindings for ONEcode, a simple and efficient data representation format for genomic data.
Overview
ONEcode is a data representation framework designed primarily for genomic data, providing both human-readable ASCII and compressed binary file versions with strongly typed data.
This library provides safe, idiomatic Rust bindings to the ONEcode C library.
Features
- ✅ Read and write ONE files in both ASCII and binary formats
- ✅ Schema validation and creation
- ✅ Provenance and reference tracking
- ✅ Type-safe access to fields (integers, reals, characters, strings, lists)
- ✅ File navigation and statistics
- ✅ Sequence name extraction from embedded GDB in alignment files
- ✅ RAII-based resource management
- ✅ Fully thread-safe - concurrent operations supported
Requirements
System Dependencies
This library uses bindgen to generate Rust bindings from C headers, which requires clang/libclang. Install it through your platform's package manager (Ubuntu/Debian, Fedora/RHEL, macOS, Arch Linux).
For more details, see the bindgen requirements documentation.
Installation
Add this to your Cargo.toml:
[dependencies]
onecode = { git = "https://github.com/pangenome/onecode-rs" }
Usage
Reading a ONE file
use OneFile;
Writing a ONE file
use ;
Creating schemas from text
use OneSchema;
Getting file statistics
use OneFile;
Working with alignment files (.1aln) and sequence names
Alignment files can contain embedded genome database (GDB) information, mapping sequence IDs to names:
use OneFile;
Or look up individual names on-demand:
let mut file = OneFile::open_read(/* arguments elided in the source */)?;
// Get a specific sequence name by ID
if let Some(name) = file.get_sequence_name(/* sequence ID */) {
    // use `name` here
}
API Documentation
Full API documentation can be generated with cargo doc.
Key types:
- OneFile - Main file handle for reading/writing ONE files
- OneSchema - Schema definition and validation
- OneError - Error types
- OneType - Field type enumeration
Building
The library uses bindgen to automatically generate bindings from the C headers and cc to compile the C library.
Testing
All tests pass with full concurrent execution.
Test suite includes:
- 9 basic functionality tests
- 3 sequence name extraction tests
- 4 thread-safety stress tests (10-50 concurrent threads)
- 2 doc tests
Thread Safety
✅ Fully thread-safe! The library supports concurrent operations without any restrictions.
The upstream ONEcode C library has been updated with thread-local storage for all global state, making it safe for concurrent use from multiple threads. All operations including schema creation, file reading, and error handling work correctly under concurrent load.
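To illustrate why thread-local state removes the need for locking, here is a minimal Rust sketch. It is not the ONEcode implementation and does not use this crate; `LAST_ERROR`, `record_error`, `last_error`, and `errors_are_isolated` are hypothetical names chosen for the example. Each thread writes its own error message, waits until every other thread has written too, and then checks that it still sees only its own value.

```rust
use std::cell::RefCell;
use std::sync::{Arc, Barrier};
use std::thread;

thread_local! {
    // Hypothetical per-thread error slot, standing in for state that a
    // C library might otherwise keep in a single global variable.
    static LAST_ERROR: RefCell<Option<String>> = RefCell::new(None);
}

// Store an error message in the calling thread's own slot.
fn record_error(msg: &str) {
    LAST_ERROR.with(|slot| *slot.borrow_mut() = Some(msg.to_string()));
}

// Read back the calling thread's most recent error, if any.
fn last_error() -> Option<String> {
    LAST_ERROR.with(|slot| slot.borrow().clone())
}

// Spawn `n_threads` threads; each records a distinct error, waits at a
// barrier until all threads have written, then checks its own slot.
fn errors_are_isolated(n_threads: usize) -> bool {
    let barrier = Arc::new(Barrier::new(n_threads));
    let handles: Vec<thread::JoinHandle<bool>> = (0..n_threads)
        .map(|i| {
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                let msg = format!("error from thread {}", i);
                record_error(&msg);
                barrier.wait();
                last_error() == Some(msg)
            })
        })
        .collect();
    // Join every thread before combining the results.
    let results: Vec<bool> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    results.iter().all(|&ok| ok)
}

fn main() {
    assert!(errors_are_isolated(10));
    // The main thread never recorded an error, so its slot is still empty.
    assert_eq!(last_error(), None);
    println!("each thread saw only its own error state");
}
```

With a single shared global in place of the thread-local slot, the check after the barrier would fail for all but the last writer unless access were serialized.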
Architecture
The library is organized into several modules:
- ffi - Raw FFI bindings generated by bindgen
- error - Rust error types and Result wrapper
- types - Rust-friendly type definitions
- file - Safe OneFile wrapper with RAII resource management
- schema - OneSchema management and validation
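The RAII pattern behind a safe file wrapper can be sketched in plain Rust as follows. This is an illustration only, not the crate's actual code: `RawHandle`, `raw_open`, `raw_close`, and `SafeHandle` are hypothetical stand-ins for a C handle and its wrapper, and the `CLOSED` counter exists solely so the cleanup can be observed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts calls to the stand-in close function so the behaviour is observable.
static CLOSED: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for a handle owned by a C library.
struct RawHandle {
    id: u32,
}

// Stand-in for a C "open" call that returns a raw pointer.
fn raw_open(id: u32) -> *mut RawHandle {
    Box::into_raw(Box::new(RawHandle { id }))
}

// Stand-in for a C "close" call. The caller must pass a pointer obtained
// from `raw_open` that has not been closed already.
unsafe fn raw_close(ptr: *mut RawHandle) {
    unsafe {
        drop(Box::from_raw(ptr));
    }
    CLOSED.fetch_add(1, Ordering::SeqCst);
}

// Safe wrapper: owns the raw pointer and releases it exactly once.
struct SafeHandle {
    ptr: *mut RawHandle,
}

impl SafeHandle {
    fn open(id: u32) -> Self {
        SafeHandle { ptr: raw_open(id) }
    }

    fn id(&self) -> u32 {
        // The pointer stays valid for as long as `self` is alive.
        unsafe { (*self.ptr).id }
    }
}

impl Drop for SafeHandle {
    fn drop(&mut self) {
        // Runs automatically when the wrapper goes out of scope.
        unsafe { raw_close(self.ptr) }
    }
}

fn closed_count() -> usize {
    CLOSED.load(Ordering::SeqCst)
}

fn main() {
    let before = closed_count();
    {
        let handle = SafeHandle::open(7);
        assert_eq!(handle.id(), 7);
    } // `handle` is dropped here, which calls `raw_close`.
    assert_eq!(closed_count(), before + 1);
}
```

Because cleanup lives in `Drop`, early returns and `?` error propagation release the handle without any explicit close call at the call site.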
Integration with ONEcode
The C library is included as a git subtree in the ONEcode/ directory and compiled automatically during the build process.
To update the ONEcode subtree:
Performance
- Zero-copy access to data where possible
- Supports parallel reading/writing with configurable thread count
- Binary format provides efficient compression
- Thread-safe without synchronization overhead
License
This Rust wrapper is licensed under MIT OR Apache-2.0.
The ONEcode C library has its own license - see ONEcode/ for details.
Contributing
Contributions are welcome! Please ensure tests pass before submitting PRs.
Acknowledgments
ONEcode was developed by Gene Myers and Richard Durbin. This Rust wrapper builds on their excellent work to provide safe, idiomatic Rust bindings.