At least one of the disclosed embodiments describes a computer system that enables efficient strain typing by comparing strain k-mer profiles to generate a strain typing relationship mapping. The system may include one or more processors, and one or more hardware storage devices with stored computer...
The amount of data produced by modern sequencing instruments that needs to be stored is huge. Therefore it is not surprising that a lot of work has been done in the field of specialized data compression of FASTQ files. The existing algorithms are, however, still imperfect and the best tools...
k-mer based assembly evaluation. Contribute to marbl/merqury development by creating an account on GitHub.
(min 3) -k,--kmersize: size of Kmer to use (REQUIRED) -r,--ref: file path to the desired reference file to create VCF (REQUIRED) -m,--min: overwrites the minimum k-mer count to call variant (OPTIONAL, Do not provide a min unless you are sure what you want) -h,--help: ...
We have presented the development, validation, and application of VirFinder, a machine learning based tool that usesk-mer frequencies to accurately identify viral contigs. To our knowledge, VirFinder represents the first virus identification tool that solely uses a nucleotidek-mer frequency based approa...
BioMed CentralBiology DirectWang, D., Xu, J. & Yu, J. KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation. Biol direct 10(1), 1-5 (2015).
RESEARCH ARTICLELarge-scale k-mer-based analysis of theinformational properties of genomes,comparative genomics and taxonomyYuval Bussi ID 1,2,3 , Ruti Kapon ID 1 , Ziv Reich 1 *1 Departmentof Biomolecular Sciences,Weizmann Institute of Science, Rehovot, Israel, 2 Department ofComputer Science and...
For mSourceTracker and FEAST the input matrix corresponds to a taxonomy-based clustering table, whereas decOM takes as input a binary k-mer matrix across metagenomic data sets. Consider the set X={m(1),m(2),m(3),...m(N)}, where X contains all the column vectors of the aforementioned...
Figure 3: K-mer-Based Motif Analysis in Insect Species across Anopheles, Drosophila, and Glossina Genera and Its Application to Species Classification
iMOKA (interactive multi-objective k-mer analysis) is a software that enables comprehensive analysis of sequencing data from large cohorts to generate robust classification models or explore specific genetic elements associated with disease etiology. iMO