Description
An interesting application of ProPhyle would be filtering reads based on k-mer hits. To make this more efficient and user-friendly, we should add a dedicated command for it.
Specification
Usage: prophex filter [options] -k INT <index_prefix> <in1.fq> [in2.fq]
Options:
-k INT length of k-mer
-m FLOAT keep only reads with proportion of matching k-mers >= FLOAT (0.0,1.0] [0.3]
-n INT keep only reads with number of matching k-mers >= INT (alternative to -m)
-o prefix for fastq for passing reads
-f prefix for fastq for filtered reads
-u use k-LCP for querying
-b print sequences and base qualities
-l STR log file name to output statistics
-t INT number of threads [1]
-h print help message
- If -n is used, -m is ignored.
- If [in2.fq] is provided, ProPhex will create pref.1.fq and pref.2.fq when -o or -f is used (pref.fq otherwise). The thresholds are applied to the merged read (with the N...N separator subtracted from the counts).
- Output is in a Kraken-like format. The first column encodes whether the read passes: C (passes) or U (filtered out).
- When k-mer blocks are formed, X can be used for unclassified (similarly to A = ambiguous).
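The threshold rules above could be sketched roughly as follows. This is only an illustration of the proposed semantics, not ProPhex code: the function names are hypothetical, and the paired-end total assumes that "subtracting the separator" means counting only k-mers that lie entirely within one of the two mates.

```python
from typing import Optional


def kmer_count(read_len: int, k: int) -> int:
    """Number of k-mers in a read of the given length (0 if the read is shorter than k)."""
    return max(read_len - k + 1, 0)


def merged_kmer_total(len1: int, len2: int, k: int) -> int:
    """Assumed k-mer total for a merged pair: k-mers overlapping the N...N separator are excluded."""
    return kmer_count(len1, k) + kmer_count(len2, k)


def passes_filter(matched: int, total: int,
                  min_prop: float = 0.3,
                  min_count: Optional[int] = None) -> bool:
    """Decide whether a read passes.

    min_prop plays the role of -m, min_count the role of -n.
    If min_count is given, min_prop is ignored.
    """
    if min_count is not None:
        return matched >= min_count
    if total == 0:
        # Read shorter than k: no k-mers to match, so it cannot pass a proportion threshold.
        return False
    return matched / total >= min_prop
```

For example, with k = 13 a single 100 bp read has 88 k-mers, so under the default -m 0.3 it would need at least 27 matching k-mers to pass.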
Example
prophex filter -k 13 -u -m 0.2 -o passed -f filtered index_prefix in1.fq > output.txt
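Once such an output file exists, a quick summary of passing and filtered reads could be computed as below. This is a minimal sketch that assumes the output is tab-separated with the C/U flag in the first column, as in Kraken; the helper name and the sample lines are made up for illustration.

```python
from collections import Counter
from typing import Iterable


def summarize(lines: Iterable[str]) -> Counter:
    """Count reads by the status flag in the first column (C = passes, U = filtered out)."""
    counts: Counter = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        status = line.split("\t", 1)[0]
        if status not in ("C", "U"):
            raise ValueError(f"unexpected status flag: {status!r}")
        counts[status] += 1
    return counts


# Hypothetical sample lines; real output would come from output.txt.
sample = ["C\tread1\n", "U\tread2\n", "C\tread3\n"]
summary = summarize(sample)
print(f"passed: {summary['C']}, filtered out: {summary['U']}")
```

To run it on real output, the sample list can be replaced with an open file handle, e.g. `with open("output.txt") as fh: summarize(fh)`.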