pub struct GAFBase { /* private fields */ }
A database connection to a GAF-base database.
This structure stores a database connection and some header information.
In multi-threaded applications, each thread should have its own connection.
A set of alignments overlapping with a subgraph can be extracted using the crate::ReadSet structure.
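The one-connection-per-thread advice above can be sketched with the standard library alone. In this minimal sketch, `Connection`, its `open` and `alignments` methods, and `count_in_parallel` are hypothetical stand-ins for illustration; they are not part of this crate. With the real type, each worker would presumably call `GAFBase::open` on the same database file instead.

```rust
use std::thread;

/// Hypothetical stand-in for a database connection such as `GAFBase`.
struct Connection {
    alignments: usize,
}

impl Connection {
    /// Stand-in for opening a connection; a real one would open `_filename`.
    fn open(_filename: &str) -> Result<Self, String> {
        Ok(Connection { alignments: 0 })
    }

    /// Stand-in for a header statistic; always the placeholder value 0 here.
    fn alignments(&self) -> usize {
        self.alignments
    }
}

/// Spawns `threads` workers. Each worker opens its own connection
/// instead of sharing one connection across threads.
fn count_in_parallel(filename: &str, threads: usize) -> Result<Vec<usize>, String> {
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let name = filename.to_string();
            thread::spawn(move || -> Result<usize, String> {
                let db = Connection::open(&name)?;
                Ok(db.alignments())
            })
        })
        .collect();

    // Join the workers and pass through the first error, if any.
    handles
        .into_iter()
        .map(|handle| {
            handle
                .join()
                .map_err(|_| String::from("worker thread panicked"))
                .and_then(|result| result)
        })
        .collect()
}
```

Only the filename crosses the thread boundary; the connection itself is created and dropped inside the worker.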
§Examples
use gbz_base::{GAFBase, GAFBaseParams, GraphReference};
use gbz_base::utils;
use simple_sds::serialize;
let gaf_file = utils::get_test_data("micb-kir3dl1_HG003.gaf.gz");
let gbwt_file = None; // Build a new GBWT index.
let db_file = serialize::temp_file_name("gaf-base");
// Create a database that requires a reference graph to use.
let graph = GraphReference::None;
let params = GAFBaseParams::default();
let db = GAFBase::create_from_files(&gaf_file, gbwt_file, &db_file, graph, &params);
assert!(db.is_ok());
// Now open it and check some statistics.
let db = GAFBase::open(&db_file);
assert!(db.is_ok());
let db = db.unwrap();
assert_eq!(db.nodes(), 2291);
assert_eq!(db.alignments(), 12439);
assert!(!db.bidirectional_gbwt());
drop(db);
let _ = std::fs::remove_file(&db_file);
Implementations
impl GAFBase
Using the database.
pub fn open<P: AsRef<Path>>(filename: P) -> Result<Self, String>
Opens a connection to the database in the given file.
Reads the header information and passes through any database errors.
pub fn filename(&self) -> Option<&str>
Returns the filename of the database, or None if there is no filename.
pub fn file_size(&self) -> Option<String>
Returns the size of the database file in a human-readable format.
pub fn alignments(&self) -> usize
Returns the number of alignments in the database.
pub fn blocks(&self) -> usize
Returns the number of database rows storing the alignments.
Each row corresponds to an AlignmentBlock.
pub fn bidirectional_gbwt(&self) -> bool
Returns true if the paths are stored in a bidirectional GBWT.
Returns all tags stored in the database.
pub fn graph_name(&self) -> Result<GraphName, String>
Returns the stable graph name (pggname) for the graph used as the reference for the alignments.
Returns an error if the tags cannot be parsed.
impl GAFBase
Creating the database.
pub fn create_from_files(
    gaf_file: &Path,
    gbwt_file: Option<&Path>,
    db_file: &Path,
    graph: GraphReference<'_, '_>,
    params: &GAFBaseParams,
) -> Result<(), String>
Creates a new database from the alignments in file gaf_file and stores the database in file db_file.
A unidirectional or bidirectional GBWT index can be provided in file gbwt_file.
Path i in the GBWT index corresponds to line i in the GAF file.
If a GBWT file is not provided, a new unidirectional index will be built in the background.
This will use a significant amount of memory, but it should not be the bottleneck of the construction.
If the parameters indicate that node sequences should be stored in the database, a GBZ-compatible graph must be provided.
§Arguments
- gaf_file: GAF file storing the alignments. Can be gzip-compressed.
- gbwt_file: An optional GBWT file storing the target paths.
- db_file: Output database file.
- graph: A GBZ-compatible graph for building a reference-free database, or GraphReference::None for a reference-based one.
- params: Construction parameters.
§Errors
Returns an error if:
- The GAF file does not exist.
- The database already exists.
- Trying to build a reference-free GAF-base without a graph.
- The graph is not a valid reference for the alignments.
Passes through any I/O, database, and construction errors.
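Two of the error conditions listed above (a missing GAF file and an existing database file) can be checked by the caller before starting a long construction. The following is a minimal sketch under that assumption; `preflight` is a hypothetical helper, not part of this crate, and its error messages are illustrative rather than the ones the library returns.

```rust
use std::path::Path;

/// Hypothetical pre-flight check mirroring two documented error conditions.
/// Returns a `String` error to match the error type used by the constructors.
fn preflight(gaf_file: &Path, db_file: &Path) -> Result<(), String> {
    // The input must exist and be a regular file.
    if !gaf_file.is_file() {
        return Err(format!("GAF file {} does not exist", gaf_file.display()));
    }
    // The output must not exist yet.
    if db_file.exists() {
        return Err(format!("Database {} already exists", db_file.display()));
    }
    Ok(())
}
```

A check like this only gives earlier feedback. The file system can change between the check and the construction call, so the result of the constructor still has to be handled.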
pub fn create<P: AsRef<Path>, Q: AsRef<Path>>(
    gaf_file: P,
    index: Option<Arc<GBWT>>,
    db_file: Q,
    graph: GraphReference<'_, '_>,
    params: &GAFBaseParams,
) -> Result<(), String>
Creates a new database from the alignments in file gaf_file and stores the database in file db_file.
A unidirectional or bidirectional GBWT index can be provided.
Path i in the GBWT index corresponds to line i in the GAF file.
If a GBWT index is not provided, a new unidirectional index will be built in the background.
This will use a significant amount of memory, but it should not be the bottleneck of the construction.
If the parameters indicate that node sequences should be stored in the database, a GBZ-compatible graph must be provided.
§Arguments
- gaf_file: GAF file storing the alignments. Can be gzip-compressed.
- index: An optional GBWT index storing the target paths.
- db_file: Output database file.
- graph: A GBZ-compatible graph for building a reference-free database, or GraphReference::None for a reference-based one.
- params: Construction parameters.
§Errors
Returns an error if:
- The GAF file does not exist.
- The database already exists.
- Trying to build a reference-free GAF-base without a graph.
- The graph is not a valid reference for the alignments.
Passes through any I/O, database, and construction errors.
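Because the index parameter is an `Option<Arc<GBWT>>`, a caller that already holds an index in memory can hand over a cheap reference-counted clone and keep using its own handle afterwards. The sketch below illustrates that ownership pattern only; `Index`, `build`, and `share_index` are hypothetical stand-ins, not types or functions from this crate.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a loaded GBWT index.
struct Index {
    paths: usize,
}

/// Hypothetical stand-in for a constructor that takes an optional shared index,
/// mirroring the shape of the `index: Option<Arc<GBWT>>` parameter.
fn build(index: Option<Arc<Index>>) -> Result<usize, String> {
    match index {
        // Use the caller's index as is.
        Some(index) => Ok(index.paths),
        // A real builder would construct a new index here.
        None => Ok(0),
    }
}

/// Passes a clone of the `Arc` to `build` and keeps the original handle.
fn share_index() -> Result<(usize, usize), String> {
    let index = Arc::new(Index { paths: 3 });
    // Cloning the Arc copies a pointer, not the index itself.
    let used = build(Some(Arc::clone(&index)))?;
    // The caller's handle is still valid after the call.
    Ok((used, index.paths))
}
```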