Struct Filter

pub struct Filter { /* private fields */ }

Filters reads in FASTQ files based on kraken2 classification results.

Extracts reads classified to one or more taxon IDs from FASTQ files, using the kraken2 report (taxonomy tree) and per-read classification output. Supports both single-end and paired-end reads, and writes bgzf-compressed output.

§Required inputs

The command needs three pieces of data that must all come from the same kraken2 run:

--kraken-report (-r): The kraken2 report file containing the taxonomy tree and per-taxon read counts. This is used to resolve taxon IDs, expand descendants, and estimate the expected number of matching reads.
--kraken-output (-k): The per-read classification output from kraken2 (generated with --output). Each line maps a read name to a taxon ID.
--input (-i): One FASTQ file for single-end data, or two for paired-end. Gzip and bgzf compressed inputs are detected and handled automatically.
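
How the command detects compressed input is not described here. One common approach, sketched below as an assumption, is to inspect the first bytes of the file: gzip streams start with the magic bytes 0x1f 0x8b, and a BGZF block is a gzip member whose extra field carries a "BC" subfield. The names `Compression` and `sniff_compression` are hypothetical and are not part of k2tools.

```rust
/// Compression formats this sketch distinguishes (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Compression {
    Plain,
    Gzip,
    Bgzf,
}

/// Classify a file from its leading bytes.
/// Assumes `head` holds the start of the file (16 bytes is enough).
fn sniff_compression(head: &[u8]) -> Compression {
    // Every gzip member starts with the two magic bytes 0x1f 0x8b.
    if head.len() < 2 || head[0] != 0x1f || head[1] != 0x8b {
        return Compression::Plain;
    }
    // BGZF sets the FEXTRA flag (bit 2 of byte 3) and stores a "BC"
    // subfield at the start of the extra field (bytes 12 and 13).
    let has_extra = head.len() >= 14 && (head[3] & 0x04) != 0;
    if has_extra && head[12] == b'B' && head[13] == b'C' {
        Compression::Bgzf
    } else {
        Compression::Gzip
    }
}
```

This simplified check only looks for the "BC" subfield in the first position of the extra field, which is where BGZF writers conventionally place it.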

The kraken output and FASTQ file(s) must contain the same reads in the same order. The command verifies that read names agree and returns an error if the names differ or the files contain different numbers of records.
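
The agreement check described above can be pictured with the following minimal sketch. It is not the crate's implementation: the helper names are hypothetical, the kraken2 lines are assumed to use the default tab-separated layout (status, read name, taxon ID, ...), and stripping a trailing /1 or /2 from FASTQ names is an assumption about how paired-end names are normalized.

```rust
/// Read name from a FASTQ header such as "@read1/1 extra" (hypothetical helper).
fn fastq_read_name(header: &str) -> Option<&str> {
    let name = header.strip_prefix('@')?.split_whitespace().next()?;
    // Assumption: mate suffixes are dropped before comparing names.
    Some(
        name.strip_suffix("/1")
            .or_else(|| name.strip_suffix("/2"))
            .unwrap_or(name),
    )
}

/// (read name, taxon ID) from one line of kraken2 per-read output.
fn kraken_name_and_taxon(line: &str) -> Option<(&str, u64)> {
    let mut fields = line.split('\t');
    let _status = fields.next()?; // "C" (classified) or "U" (unclassified)
    let name = fields.next()?;
    let taxon = fields.next()?.trim().parse::<u64>().ok()?;
    Some((name, taxon))
}

/// Verify that both inputs hold the same reads in the same order.
fn check_agreement(kraken_lines: &[&str], fastq_headers: &[&str]) -> Result<(), String> {
    if kraken_lines.len() != fastq_headers.len() {
        return Err(format!(
            "record count mismatch: {} kraken lines vs {} FASTQ records",
            kraken_lines.len(),
            fastq_headers.len()
        ));
    }
    for (i, (kraken_line, header)) in kraken_lines.iter().zip(fastq_headers.iter()).enumerate() {
        let (kraken_name, _taxon) = kraken_name_and_taxon(kraken_line)
            .ok_or_else(|| format!("malformed kraken line at record {}", i + 1))?;
        let fastq_name = fastq_read_name(header)
            .ok_or_else(|| format!("malformed FASTQ header at record {}", i + 1))?;
        if kraken_name != fastq_name {
            return Err(format!(
                "read name mismatch at record {}: '{}' vs '{}'",
                i + 1,
                kraken_name,
                fastq_name
            ));
        }
    }
    Ok(())
}
```

A real implementation would stream both files rather than hold every record in memory; slices are used here only to keep the sketch short.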

§Taxon selection

At least one of --taxon-ids or --include-unclassified must be specified.

--taxon-ids (-t): One or more NCBI taxon IDs to extract. By default, only reads classified directly to these exact taxon IDs are included.
--include-descendants (-d): Expand each taxon ID to include all of its descendants in the taxonomy tree. For example, specifying a genus-level taxon ID with -d will also extract reads classified to any species or strain within that genus.
--include-unclassified (-u): Include reads that kraken2 could not classify (taxon ID 0). Can be combined with --taxon-ids to extract both classified and unclassified reads in a single pass.
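
The selection rules above can be sketched as building a set of accepted taxon IDs and then keeping every read whose taxon ID is in that set. The function `select_taxa` and its child-to-parent map are hypothetical stand-ins for whatever the command derives from the kraken2 report; this is an illustration under those assumptions, not the crate's code.

```rust
use std::collections::{HashMap, HashSet};

/// Build the set of taxon IDs whose reads should be kept.
/// `parents` maps each taxon ID to its parent (assumed to come from the report).
fn select_taxa(
    parents: &HashMap<u64, u64>,
    requested: &[u64],
    include_descendants: bool,
    include_unclassified: bool,
) -> Result<HashSet<u64>, String> {
    // At least one of --taxon-ids or --include-unclassified is required.
    if requested.is_empty() && !include_unclassified {
        return Err("specify --taxon-ids and/or --include-unclassified".to_string());
    }
    let mut selected: HashSet<u64> = requested.iter().copied().collect();
    if include_descendants {
        // Invert the child -> parent map so the tree can be walked downwards.
        let mut children: HashMap<u64, Vec<u64>> = HashMap::new();
        for (&child, &parent) in parents {
            children.entry(parent).or_default().push(child);
        }
        // Depth-first walk from each requested taxon.
        let mut stack: Vec<u64> = requested.to_vec();
        while let Some(taxon) = stack.pop() {
            for &child in children.get(&taxon).into_iter().flatten() {
                // `insert` returns false for already-seen IDs, which also
                // guards against a root that lists itself as its parent.
                if selected.insert(child) {
                    stack.push(child);
                }
            }
        }
    }
    if include_unclassified {
        selected.insert(0); // kraken2 reports unclassified reads as taxon 0
    }
    Ok(selected)
}
```

With a toy tree of 562 (E. coli) under 561 (Escherichia) under 543 (Enterobacteriaceae), requesting 543 alone selects only 543, while adding descendants also selects 561 and 562.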

§Output

--output (-o): Output FASTQ file path(s). Must provide the same number of output files as input files (one for single-end, two for paired-end). Outputs are always bgzf-compressed regardless of file extension.
--threads: Number of threads used for bgzf compression (default: 4).
--compression-level: Bgzf compression level from 0 (fastest) to 9 (smallest) (default: 5).

§Examples

Extract reads classified directly to E. coli (taxon 562); without -d, reads assigned to its descendant taxa are not included:

k2tools filter -r report.txt -k output.txt -i reads.fq.gz -o ecoli.fq.gz -t 562

Extract all Enterobacteriaceae (taxon 543) including every species and strain beneath it in the taxonomy:

k2tools filter -r report.txt -k output.txt \
    -i reads.fq.gz -o entero.fq.gz -t 543 -d

Extract unclassified reads from a paired-end run:

k2tools filter -r report.txt -k output.txt \
    -i r1.fq.gz r2.fq.gz -o unclass_r1.fq.gz unclass_r2.fq.gz -u

Extract human reads plus unclassified in a single pass:

k2tools filter -r report.txt -k output.txt \
    -i reads.fq.gz -o host_and_unclass.fq.gz -t 9606 -d -u

Trait Implementations§

impl Args for Filter

fn group_id() -> Option<Id>

fn augment_args<'b>(__clap_app: Command) -> Command

fn augment_args_for_update<'b>(__clap_app: Command) -> Command

impl Command for Filter

fn execute(&self) -> Result<()>

impl FromArgMatches for Filter

fn from_arg_matches(__clap_arg_matches: &ArgMatches) -> Result<Self, Error>

fn from_arg_matches_mut( __clap_arg_matches: &mut ArgMatches, ) -> Result<Self, Error>

fn update_from_arg_matches( &mut self, __clap_arg_matches: &ArgMatches, ) -> Result<(), Error>

fn update_from_arg_matches_mut( &mut self, __clap_arg_matches: &mut ArgMatches, ) -> Result<(), Error>

Auto Trait Implementations§

impl Freeze for Filter

impl RefUnwindSafe for Filter

impl Send for Filter

impl Sync for Filter

impl Unpin for Filter

impl UnsafeUnpin for Filter

impl UnwindSafe for Filter

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
