libradicl is a crate for reading (and writing) RAD (Reduced Alignment Data) format
files. The RAD format is a binary format designed to encode alignment information
about sequencing reads and how they map to a set of targets (a genome, metagenome,
transcriptome, etc.). The format is “reduced” because it is allowed to contain sparser
information than e.g. a SAM format file.
While the eventual goal of this crate is to provide a generic API to read and write RAD
files that may be designed for any purpose, it is driven mostly by our (the COMBINE-lab’s)
needs within the tools we produce that use the RAD format (e.g. alevin-fry and
piscem-infer). Thus, features are generally
developed and added in the order that is most urgent to the development of these tools.
However, we welcome external contributions via pull requests, and are happy to discuss
your potential use cases for the RAD format, and how they might be supported.
This crate is broken into several components that cover the various parts of RAD files, including the type tag system, the header, and the main data chunks. The module names are fairly self-explanatory.
Modules§
- Types and functions that primarily deal with the reading and writing of data Chunks in the RAD file.
- Constants relevant for the RAD format
- This module contains custom exit codes that designate specific error conditions.
- This module contains types, functions and traits to deal with RAD file headers, and also top-level functionality to encapsulate a RAD prelude (which consists of the header, and the initial TagSections; basically everything up to the first chunk).
- Free functions to help with reading and writing specific libradicl types into and out of primitive types.
- This module contains the relevant structures and traits for most of the core RAD types. This includes the integer, numeric and composite types, as well as other relevant types built from them (e.g. TagSections). It also contains the types and traits related to parsing and writing values of specific types.
- This module contains types and traits that provide a high-level interface to reading and parsing RAD files. Additionally, it provides types that give an interface for parsing RAD chunks in parallel for improved processing performance.
- This module contains types and traits related to RAD records, including the traits for MappedRecords and RecordContexts. It also defines concrete types implementing these traits for alevin-fry and piscem-infer.
- This module contains basic type-related information that doesn’t fit cleanly into the other, more focused modules.
- This module contains some utility constants and functions that are helpful in processing RAD information.
Macros§
- Convert from an underlying newtype (e.g. a crate::libradicl::io::NewU8, crate::libradicl::io::NewU16, crate::libradicl::io::NewU32, crate::libradicl::io::NewU64, crate::libradicl::io::NewU128) into a native u64. Note that conversion from a crate::libradicl::io::NewU128 will panic! as the underlying native type is too narrow to hold the contents of the integer.
- Convert from an underlying newtype (e.g. a crate::libradicl::io::NewU8, crate::libradicl::io::NewU16, crate::libradicl::io::NewU32, crate::libradicl::io::NewU64, crate::libradicl::io::NewU128) into a native u128.
- Try to convert from an underlying newtype (e.g. a crate::libradicl::io::NewU8, crate::libradicl::io::NewU16, crate::libradicl::io::NewU32, crate::libradicl::io::NewU64, crate::libradicl::io::NewU128) into a native u64. If the conversion is successful, we produce an Ok(u64), otherwise we produce an std::result::Result::Err.
- Try to convert from an underlying newtype (e.g. a crate::libradicl::io::NewU8, crate::libradicl::io::NewU16, crate::libradicl::io::NewU32, crate::libradicl::io::NewU64, crate::libradicl::io::NewU128) into a native u128. If the conversion is successful, we produce an Ok(u128), otherwise we produce an std::result::Result::Err.
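The difference between the panicking and the fallible conversions described above can be illustrated with a minimal sketch. It does not use the crate's own macros or newtypes: `WideInt`, `to_u128` and `try_to_u64` are hypothetical names introduced only for this example. It also assumes that narrowing fails only when the stored value does not fit; the crate's macros may instead reject the 128-bit variant regardless of its value.

```rust
use std::convert::TryFrom;
use std::num::TryFromIntError;

/// Hypothetical stand-in for a family of fixed-width integer newtypes.
/// This is NOT the crate's own type; it exists only for illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WideInt {
    U8(u8),
    U16(u16),
    U32(u32),
    U64(u64),
    U128(u128),
}

impl WideInt {
    /// Lossless widening: every variant fits into a native u128.
    pub fn to_u128(self) -> u128 {
        match self {
            WideInt::U8(v) => u128::from(v),
            WideInt::U16(v) => u128::from(v),
            WideInt::U32(v) => u128::from(v),
            WideInt::U64(v) => u128::from(v),
            WideInt::U128(v) => v,
        }
    }

    /// Fallible narrowing: returns Err when the value does not fit into
    /// a u64 (assumption: the check is on the value, not on the variant).
    pub fn try_to_u64(self) -> Result<u64, TryFromIntError> {
        u64::try_from(self.to_u128())
    }
}
```

With this sketch, `WideInt::U32(42).try_to_u64()` yields `Ok(42)`, while `WideInt::U128(u128::MAX).try_to_u64()` yields an `Err` that the caller can handle instead of panicking.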
Structs§
- Represents a temporary bucket of barcodes whose records will be written together and then collated later in memory.
Functions§
- Given a BufReader<T> from which to read a set of records that should reside in the same collated bucket, this function will collate the records by cell barcode, filling them into a chunk of memory exactly as they will reside on disk. If compress is true the collated chunk will be compressed, and then the result will be written to the output guarded by owriter.
- Read an input chunk from reader and write the resulting records to the corresponding in-memory buffers local_buffers. As soon as any buffer reaches flush_limit, flush the buffer by writing it to the output_cache.
- Read an input chunk from reader and write the resulting records to the corresponding in-memory buffers local_buffers. As soon as any buffer reaches flush_limit, flush the buffer by writing it to the output_cache.
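The buffer-and-flush behaviour and the collation step described above can be sketched roughly as follows. This is not the crate's implementation: `Record`, `route_records` and `collate_by_barcode` are hypothetical names, the record layout (an 8-byte barcode followed by a payload) is an assumption, and buckets are chosen by a simple modulo. Compression and the locking around the output writer are left out.

```rust
use std::io::{self, Write};

/// Hypothetical record: a cell barcode plus an opaque payload.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Record {
    pub barcode: u64,
    pub payload: Vec<u8>,
}

/// Append each record to the in-memory buffer of its bucket. As soon as a
/// buffer holds at least `flush_limit` bytes, write it to the matching
/// output and clear it. Assumes `local_buffers` and `outputs` are non-empty
/// and have the same length.
pub fn route_records<W: Write>(
    records: &[Record],
    flush_limit: usize,
    local_buffers: &mut [Vec<u8>],
    outputs: &mut [W],
) -> io::Result<()> {
    let num_buckets = local_buffers.len() as u64;
    for rec in records {
        // Assumption: bucket assignment by barcode modulo bucket count.
        let b = (rec.barcode % num_buckets) as usize;
        let buf = &mut local_buffers[b];
        buf.extend_from_slice(&rec.barcode.to_le_bytes());
        buf.extend_from_slice(&rec.payload);
        if buf.len() >= flush_limit {
            outputs[b].write_all(buf.as_slice())?;
            buf.clear();
        }
    }
    Ok(())
}

/// Collate records so that all records sharing a barcode are contiguous.
/// A stable sort keeps the original order within each barcode.
pub fn collate_by_barcode(mut records: Vec<Record>) -> Vec<Record> {
    records.sort_by_key(|r| r.barcode);
    records
}
```

Any bytes still sitting in `local_buffers` after the last input chunk would need one final flush; the sketch leaves that to the caller.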