pub struct ReadThreadingGraph {
pub reference_path: Vec<NodeIndex>,
/* private fields */
}Expand description
Note: not final but only intended to be subclassed for testing.
Fields§
§reference_path: Vec<NodeIndex>Implementations§
Source§impl ReadThreadingGraph
impl ReadThreadingGraph
pub const ANONYMOUS_SAMPLE: &'static str = "XXX_UNNAMED_XXX"
pub const MAX_CIGAR_COMPLEXITY: usize = 3usize
pub fn new( kmer_size: usize, _debug_graph_transformations: bool, min_base_quality_to_use_in_assembly: u8, min_pruning_samples: usize, min_matching_bases_to_dangling_end_recovery: i32, avx_mode: AVXMode, ) -> Self
pub fn default_with_kmer_size(kmer_size: usize) -> Self
Sourcepub fn determine_non_unique_kmers<'a>(
seq_for_kmers: &SequenceForKmers<'a>,
kmer_size: usize,
) -> Vec<Kmer>
pub fn determine_non_unique_kmers<'a>( seq_for_kmers: &SequenceForKmers<'a>, kmer_size: usize, ) -> Vec<Kmer>
Get the collection of non-unique kmers from sequence for kmer size kmerSize @param seqForKmers a sequence to get kmers from @param kmerSize the size of the kmers @return a non-null collection of non-unique kmers in sequence
Sourcepub fn get_non_uniques(&mut self) -> &HashSet<Kmer>
pub fn get_non_uniques(&mut self) -> &HashSet<Kmer>
Get the set of non-unique kmers in this graph. For debugging purposes @return a non-null set of kmers
Trait Implementations§
Source§impl AbstractReadThreadingGraph for ReadThreadingGraph
impl AbstractReadThreadingGraph for ReadThreadingGraph
Source§fn is_threading_start(&self, kmer: &Kmer) -> bool
fn is_threading_start(&self, kmer: &Kmer) -> bool
Checks whether a kmer can be the threading start based on the current threading start location policy.
@param kmer the query kmer. @return {@code true} if we can start thread the sequence at this kmer, {@code false} otherwise. @see #setThreadingStartOnlyAtExistingVertex(boolean)
Source§fn is_low_quality_graph(&self) -> bool
fn is_low_quality_graph(&self) -> bool
Does the graph not have enough complexity? We define low complexity as a situation where the number of non-unique kmers is more than 20% of the total number of kmers.
@return true if the graph has low complexity, false otherwise
Source§fn preprocess_reads<'a>(
&mut self,
pending: &LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>,
)
fn preprocess_reads<'a>( &mut self, pending: &LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>, )
Since we want to duplicate non-unique kmers in the graph code we must determine what those kmers are
Source§fn add_sequence<'a>(
&self,
pending: &mut LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>,
seq_name: String,
sample_index: usize,
sequence: &'a [u8],
start: usize,
stop: usize,
count: usize,
is_ref: bool,
)
fn add_sequence<'a>( &self, pending: &mut LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>, seq_name: String, sample_index: usize, sequence: &'a [u8], start: usize, stop: usize, count: usize, is_ref: bool, )
Add bases in sequence to this graph
@param seqName a useful seqName for this read, for debugging purposes @param sequence non-null sequence of bases @param start the first base offset in sequence that we should use for constructing the graph using this sequence, inclusive @param stop the last base offset in sequence that we should use for constructing the graph using this sequence, exclusive @param count the representative count of this sequence (to use as the weight) @param isRef is this the reference sequence.
Source§fn add_read<'a>(
&mut self,
read: &'a BirdToolRead,
_sample_names: &[String],
count: &mut usize,
pending: &mut LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>,
)
fn add_read<'a>( &mut self, read: &'a BirdToolRead, _sample_names: &[String], count: &mut usize, pending: &mut LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>, )
Add a read to the sequence graph. Finds maximal consecutive runs of bases with sufficient quality and applies {@see addSequence} to these subreads if they are longer than the kmer size.
@param read a non-null read
Source§fn base_is_usable_for_assembly(&self, base: u8, qual: u8) -> bool
fn base_is_usable_for_assembly(&self, base: u8, qual: u8) -> bool
Determines whether a base can safely be used for assembly. Currently disallows Ns and/or those with low quality
@param base the base under consideration @param qual the quality of that base @return true if the base can be used for assembly, false otherwise
Source§fn build_graph_if_necessary<'a>(
&mut self,
pending: &mut LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>,
)
fn build_graph_if_necessary<'a>( &mut self, pending: &mut LinkedHashMap<usize, Vec<SequenceForKmers<'a>>>, )
Build the read threaded assembly graph if it hasn’t already been constructed from the sequences that have been added to the graph.
Source§fn thread_sequence(&mut self, seq_for_kmers: &SequenceForKmers<'_>)
fn thread_sequence(&mut self, seq_for_kmers: &SequenceForKmers<'_>)
Thread sequence seqForKmers through the current graph, updating the graph as appropriate
@param seqForKmers a non-null sequence
Source§fn find_start(&self, seq_for_kmers: &SequenceForKmers<'_>) -> Option<usize>
fn find_start(&self, seq_for_kmers: &SequenceForKmers<'_>) -> Option<usize>
Find vertex and its position in seqForKmers where we should start assembling seqForKmers
@param seqForKmers the sequence we want to thread into the graph @return the position of the starting vertex in seqForKmer, or -1 if it cannot find one
Source§fn get_or_create_kmer_vertex(
&mut self,
sequence: &[u8],
start: usize,
) -> NodeIndex
fn get_or_create_kmer_vertex( &mut self, sequence: &[u8], start: usize, ) -> NodeIndex
Get the vertex for the kmer in sequence starting at start
@param sequence the sequence @param start the position of the kmer start @return a non-null vertex
Source§fn get_kmer_vertex(
&self,
kmer: &Kmer,
allow_ref_source: bool,
) -> Option<&NodeIndex>
fn get_kmer_vertex( &self, kmer: &Kmer, allow_ref_source: bool, ) -> Option<&NodeIndex>
Get the unique vertex for kmer, or null if not possible.
@param allowRefSource if true, we will allow kmer to match the reference source vertex @return a vertex for kmer, or null (either because it doesn’t exist or is non-unique for graphs that have such a distinction)
Source§fn create_vertex(&mut self, sequence: &[u8], kmer: Kmer) -> NodeIndex
fn create_vertex(&mut self, sequence: &[u8], kmer: Kmer) -> NodeIndex
Create a new vertex for kmer. Add it to the kmerToVertexMap map if appropriate.
@param kmer the kmer we want to create a vertex for @return the non-null created vertex
Source§fn extend_chain_by_one(
&mut self,
prev_vertex: NodeIndex,
sequence: &[u8],
kmer_start: usize,
count: usize,
is_ref: bool,
) -> NodeIndex
fn extend_chain_by_one( &mut self, prev_vertex: NodeIndex, sequence: &[u8], kmer_start: usize, count: usize, is_ref: bool, ) -> NodeIndex
Workhorse routine of the assembler. Given a sequence whose last vertex is anchored in the graph, extend the graph one bp according to the bases in sequence.
@param prevVertex a non-null vertex where sequence was last anchored in the graph @param sequence the sequence we’re threading through the graph @param kmerStart the start of the current kmer in graph we’d like to add @param count the number of observations of this kmer in graph (can be > 1 for GGA) @param isRef is this the reference sequence? @return a non-null vertex connecting prevVertex to in the graph based on sequence
Source§fn recover_dangling_tails(
&mut self,
prune_factor: usize,
min_dangling_branch_length: i32,
recover_all: bool,
dangling_end_sw_parameters: &Parameters,
)
fn recover_dangling_tails( &mut self, prune_factor: usize, min_dangling_branch_length: i32, recover_all: bool, dangling_end_sw_parameters: &Parameters, )
Try to recover dangling tails
@param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param minDanglingBranchLength the minimum length of a dangling branch for us to try to merge it @param recoverAll recover even branches with forks @param aligner
Source§fn recover_dangling_heads(
&mut self,
prune_factor: usize,
min_dangling_branch_length: i32,
recover_all: bool,
dangling_head_sw_parameters: &Parameters,
)
fn recover_dangling_heads( &mut self, prune_factor: usize, min_dangling_branch_length: i32, recover_all: bool, dangling_head_sw_parameters: &Parameters, )
Try to recover dangling heads
@param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param minDanglingBranchLength the minimum length of a dangling branch for us to try to merge it @param recoverAll recover even branches with forks @param aligner
Source§fn recover_dangling_tail(
&mut self,
vertex: NodeIndex,
prune_factor: usize,
min_dangling_branch_length: i32,
recover_all: bool,
dangling_tail_sw_parameters: &Parameters,
) -> usize
fn recover_dangling_tail( &mut self, vertex: NodeIndex, prune_factor: usize, min_dangling_branch_length: i32, recover_all: bool, dangling_tail_sw_parameters: &Parameters, ) -> usize
Attempt to attach vertex with out-degree == 0 to the graph
@param vertex the vertex to recover @param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param minDanglingBranchLength the minimum length of a dangling branch for us to try to merge it @param aligner @return 1 if we successfully recovered the vertex and 0 otherwise
Source§fn recover_dangling_head(
&mut self,
vertex: NodeIndex,
prune_factor: usize,
min_dangling_branch_length: i32,
recover_all: bool,
dangling_head_sw_parameters: &Parameters,
) -> usize
fn recover_dangling_head( &mut self, vertex: NodeIndex, prune_factor: usize, min_dangling_branch_length: i32, recover_all: bool, dangling_head_sw_parameters: &Parameters, ) -> usize
Attempt to attach vertex with in-degree == 0, or a vertex on its path, to the graph
@param vertex the vertex to recover @param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param minDanglingBranchLength the minimum length of a dangling branch for us to try to merge it @param recoverAll recover even branches with forks @param aligner @return 1 if we successfully recovered a vertex and 0 otherwise
Source§fn merge_dangling_tail(
&mut self,
dangling_tail_merge_result: DanglingChainMergeHelper,
) -> usize
fn merge_dangling_tail( &mut self, dangling_tail_merge_result: DanglingChainMergeHelper, ) -> usize
Actually merge the dangling tail if possible
@param danglingTailMergeResult the result from generating a Cigar for the dangling tail against the reference @return 1 if merge was successful, 0 otherwise
Source§fn merge_dangling_head_legacy(
&mut self,
dangling_head_merge_result: DanglingChainMergeHelper,
) -> usize
fn merge_dangling_head_legacy( &mut self, dangling_head_merge_result: DanglingChainMergeHelper, ) -> usize
Actually merge the dangling head if possible, this is the old codepath that does not handle indels
@param danglingHeadMergeResult the result from generating a Cigar for the dangling head against the reference @return 1 if merge was successful, 0 otherwise
Source§fn best_prefix_match_legacy(
&self,
path1: &[u8],
path2: &[u8],
max_index: usize,
) -> Option<usize>
fn best_prefix_match_legacy( &self, path1: &[u8], path2: &[u8], max_index: usize, ) -> Option<usize>
Finds the index of the best extent of the prefix match between the provided paths, for dangling head merging. Assumes that path1.length >= maxIndex and path2.length >= maxIndex.
@param path1 the first path @param path2 the second path @param maxIndex the maximum index to traverse (not inclusive) @return the index of the ideal prefix match or -1 if it cannot find one, must be less than maxIndex
Source§fn get_max_mismatches_legacy(&self, length_of_dangling_branch: usize) -> usize
fn get_max_mismatches_legacy(&self, length_of_dangling_branch: usize) -> usize
NOTE: this method is only used for dangling heads and not tails.
Determine the maximum number of mismatches permitted on the branch. Unless it’s preset (e.g. by unit tests) it should be the length of the branch divided by the kmer size.
@param lengthOfDanglingBranch the length of the branch itself @return positive integer
Source§fn merge_dangling_head(
&mut self,
dangling_head_merge_result: DanglingChainMergeHelper,
) -> usize
fn merge_dangling_head( &mut self, dangling_head_merge_result: DanglingChainMergeHelper, ) -> usize
Actually merge the dangling head if possible
@param danglingHeadMergeResult the result from generating a Cigar for the dangling head against the reference @return 1 if merge was successful, 0 otherwise
Source§fn best_prefix_match(
&self,
cigar_elements: &Vec<Cigar>,
path1: &[u8],
path2: &[u8],
) -> (i64, i64)
fn best_prefix_match( &self, cigar_elements: &Vec<Cigar>, path1: &[u8], path2: &[u8], ) -> (i64, i64)
Finds the index of the best extent of the prefix match between the provided paths, for dangling head merging. Requires that at a minimum there are at least #getMinMatchingBases() matches between the reference and the read at the end in order to emit an alignment offset.
@param cigarElements cigar elements corresponding to the alignment between path1 and path2 @param path1 the first path @param path2 the second path @return an integer pair object where the key is the offset into path1 and the value is offset into path2 (both -1 if no path is found)
Source§fn get_min_matching_bases(&self) -> i32
fn get_min_matching_bases(&self) -> i32
The minimum number of matches to be considered allowable for recovering dangling ends
Source§fn generate_cigar_against_downwards_reference_path(
&self,
vertex: NodeIndex,
prune_factor: usize,
min_dangling_branch_length: i32,
recover_all: bool,
dangling_tail_sw_parameters: &Parameters,
) -> Option<DanglingChainMergeHelper>
fn generate_cigar_against_downwards_reference_path( &self, vertex: NodeIndex, prune_factor: usize, min_dangling_branch_length: i32, recover_all: bool, dangling_tail_sw_parameters: &Parameters, ) -> Option<DanglingChainMergeHelper>
Generates the CIGAR string from the Smith-Waterman alignment of the dangling path (where the provided vertex is the sink) and the reference path.
@param aligner @param vertex the sink of the dangling chain @param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param recoverAll recover even branches with forks @return a SmithWaterman object which can be null if no proper alignment could be generated
Source§fn generate_cigar_against_upwards_reference_path(
&self,
vertex: NodeIndex,
prune_factor: usize,
min_dangling_branch_length: i32,
recover_all: bool,
dangling_head_sw_parameters: &Parameters,
) -> Option<DanglingChainMergeHelper>
fn generate_cigar_against_upwards_reference_path( &self, vertex: NodeIndex, prune_factor: usize, min_dangling_branch_length: i32, recover_all: bool, dangling_head_sw_parameters: &Parameters, ) -> Option<DanglingChainMergeHelper>
Generates the CIGAR string from the Smith-Waterman alignment of the dangling path (where the provided vertex is the source) and the reference path.
@param aligner @param vertex the source of the dangling head @param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param recoverAll recover even branches with forks @return a SmithWaterman object which can be null if no proper alignment could be generated
Source§fn find_path_downwards_to_highest_common_desendant_of_reference(
&self,
vertex: NodeIndex,
prune_factor: usize,
give_up_at_branch: bool,
) -> Option<Vec<NodeIndex>>
fn find_path_downwards_to_highest_common_desendant_of_reference( &self, vertex: NodeIndex, prune_factor: usize, give_up_at_branch: bool, ) -> Option<Vec<NodeIndex>>
Finds the path downwards in the graph from this vertex to the reference sequence, including the highest common descendant vertex. However note that the path is reversed so that this vertex ends up at the end of the path. Also note that nodes are excluded if their pruning weight is less than the pruning factor.
@param vertex the original vertex @param pruneFactor the prune factor to use in ignoring chain pieces @param giveUpAtBranch stop trying to find a path if a vertex with multiple incoming or outgoing edge is found @return the path if it can be determined or null if this vertex either doesn’t merge onto the reference path or has a descendant with multiple outgoing edges before hitting the reference path
Source§fn get_reference_path(
&self,
start: NodeIndex,
direction: TraversalDirection,
blacklisted_edge: Option<EdgeIndex>,
) -> Vec<NodeIndex>
fn get_reference_path( &self, start: NodeIndex, direction: TraversalDirection, blacklisted_edge: Option<EdgeIndex>, ) -> Vec<NodeIndex>
Finds the path in the graph from this vertex to the reference sink, including this vertex
@param start the reference vertex to start from @param direction describes which direction to move in the graph (i.e. down to the reference sink or up to the source) @param blacklistedEdge edge to ignore in the traversal down; useful to exclude the non-reference dangling paths @return the path (non-null, non-empty)
Source§fn find_path_upwards_to_lowest_common_ancestor(
&self,
vertex: NodeIndex,
prune_factor: usize,
give_up_at_branch: bool,
) -> Option<Vec<NodeIndex>>
fn find_path_upwards_to_lowest_common_ancestor( &self, vertex: NodeIndex, prune_factor: usize, give_up_at_branch: bool, ) -> Option<Vec<NodeIndex>>
Finds the path upwards in the graph from this vertex to the first diverging node, including that (lowest common ancestor) vertex. Note that nodes are excluded if their pruning weight is less than the pruning factor.
@param vertex the original vertex @param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param giveUpAtBranch stop trying to find a path if a vertex with multiple incoming or outgoing edge is found @return the path if it can be determined or null if this vertex either doesn’t merge onto another path or has an ancestor with multiple incoming edges before hitting the reference path
Source§fn find_path(
&self,
vertex: NodeIndex,
prune_factor: usize,
done: &dyn Fn(NodeIndex) -> bool,
return_path: &dyn Fn(NodeIndex) -> bool,
next_edge: &dyn Fn(NodeIndex) -> EdgeIndex,
next_node: &dyn Fn(EdgeIndex) -> NodeIndex,
) -> Option<Vec<NodeIndex>>
fn find_path( &self, vertex: NodeIndex, prune_factor: usize, done: &dyn Fn(NodeIndex) -> bool, return_path: &dyn Fn(NodeIndex) -> bool, next_edge: &dyn Fn(NodeIndex) -> EdgeIndex, next_node: &dyn Fn(EdgeIndex) -> NodeIndex, ) -> Option<Vec<NodeIndex>>
Finds a path starting from a given vertex and satisfying various predicates
@param vertex the original vertex @param pruneFactor the prune factor to use in ignoring chain pieces if edge multiplicity is < pruneFactor @param done test for whether a vertex is at the end of the path @param returnPath test for whether to return a found path based on its terminal vertex @param nextEdge function on vertices returning the next edge in the path @param nextNode function of edges returning the next vertex in the path @return a path, if one satisfying all predicates is found, {@code null} otherwise
Source§fn get_bases_for_path(
&self,
path: &Vec<NodeIndex>,
expand_source: bool,
) -> String
fn get_bases_for_path( &self, path: &Vec<NodeIndex>, expand_source: bool, ) -> String
The base sequence for the given path.
@param path the list of vertexes that make up the path @param expandSource if true and if we encounter a source node, then expand (and reverse) the character sequence for that node @return non-null sequence of bases corresponding to the given path
fn get_base_graph(&self) -> &BaseGraph<MultiDeBruijnVertex, MultiSampleEdge>
fn get_base_graph_mut( &mut self, ) -> &mut BaseGraph<MultiDeBruijnVertex, MultiSampleEdge>
fn find_kmer(&self, kmer: &Kmer) -> Option<&NodeIndex>
fn set_increase_counts_through_branches( &mut self, increase_counts_through_branches: bool, )
fn set_min_matching_bases_to_dangling_end_recovery( &mut self, min_matching_bases: i32, )
fn has_cycles(&self) -> bool
fn get_next_kmer_vertex_for_chain_extension( &self, kmer: &Kmer, is_ref: bool, _prev_vertex: NodeIndex, ) -> Option<&NodeIndex>
fn should_remove_reads_after_graph_construction(&self) -> bool
Source§fn set_threading_start_only_at_existing_vertex(&mut self, value: bool)
fn set_threading_start_only_at_existing_vertex(&mut self, value: bool)
fn print_graph(&self, file_name: String, _prune_factor: usize)
fn track_kmer(&mut self, kmer: Kmer, new_vertex: NodeIndex)
fn increase_counts_in_matched_kmers( &mut self, seqs_for_kmers: &SequenceForKmers<'_>, vertex: NodeIndex, original_kmer: &[u8], offset: Option<usize>, )
fn extend_dangling_path_against_reference( &mut self, dangling_head_merge_result: &mut DanglingChainMergeHelper, num_nodes_to_extend: usize, ) -> bool
fn has_incident_ref_edge(&self, v: NodeIndex) -> bool
fn get_heaviest_edge(&self, v: NodeIndex, direction: Direction) -> EdgeIndex
fn remove_paths_not_connected_to_ref(&mut self)
fn to_sequence_graph(&self) -> SeqGraph<BaseEdgeStruct>
fn get_kmer_size(&self) -> usize
fn get_reference_source_vertex(&self) -> Option<NodeIndex>
fn get_reference_sink_vertex(&self) -> Option<NodeIndex>
fn post_process_for_haplotype_finding<L: Locatable>( &mut self, _debug_graph_output_path: Option<&String>, _ref_haplotype: &L, )
Source§fn cigar_is_okay_to_merge(
cigar: &CigarString,
require_first_element_m: bool,
require_last_element_m: bool,
) -> bool
fn cigar_is_okay_to_merge( cigar: &CigarString, require_first_element_m: bool, require_last_element_m: bool, ) -> bool
Source§impl Clone for ReadThreadingGraph
impl Clone for ReadThreadingGraph
Source§fn clone(&self) -> ReadThreadingGraph
fn clone(&self) -> ReadThreadingGraph
1.0.0 · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read moreAuto Trait Implementations§
impl Freeze for ReadThreadingGraph
impl RefUnwindSafe for ReadThreadingGraph
impl Send for ReadThreadingGraph
impl Sync for ReadThreadingGraph
impl Unpin for ReadThreadingGraph
impl UnwindSafe for ReadThreadingGraph
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T> CloneToUninit for Twhere
T: Clone,
impl<T> CloneToUninit for Twhere
T: Clone,
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> Pointable for T
impl<T> Pointable for T
Source§impl<SS, SP> SupersetOf<SS> for SPwhere
SS: SubsetOf<SP>,
impl<SS, SP> SupersetOf<SS> for SPwhere
SS: SubsetOf<SP>,
Source§fn to_subset(&self) -> Option<SS>
fn to_subset(&self) -> Option<SS>
self from the equivalent element of its
superset. Read moreSource§fn is_in_subset(&self) -> bool
fn is_in_subset(&self) -> bool
self is actually part of its subset T (and can be converted to it).Source§fn to_subset_unchecked(&self) -> SS
fn to_subset_unchecked(&self) -> SS
self.to_subset but without any property checks. Always succeeds.Source§fn from_subset(element: &SS) -> SP
fn from_subset(element: &SS) -> SP
self to the equivalent element of its superset.