Enhancing navigation in biomedical databases by community voting and database-driven text classification. BMC Bioinformatics 2009;10:317. T. Duchrow, T. Shtatland, D. Guettler, M. Pivovarov, S. Kramer, and R. We
The enormous volume of big data poses unique challenges. For example, in a binary classification problem, the number of instances in the positive class (the class of interest) is minuscule compared to the number of instances in the negative class. This brings up issues such as how to handle the ...
1.5.4 Databases for structural classification of proteins. Murzin et al. (1995) constructed the SCOP database, which provides a detailed and comprehensive description of the structural and evolutionary relationships of the proteins of known structures. For each protein, the classification has...
Metabolomics using nontargeted tandem mass spectrometry can detect thousands of molecules in a biological sample. However, structural molecule annotation is limited to structures present in libraries or databases, restricting analysis and interpretation of experimental data. Here we describe CANOPUS (class as...
Annotation of KOGs included critical assessment of the annotations available through GenBank, other public databases and the primary literature and additional, in-depth sequence analysis aimed at detection of previously unnoticed homologous relationships. The annotated functions of KOGs were classified into ...
Drug-target interaction prediction: databases, web servers and computational models. Briefings in Bioinformatics, https://doi.org/10.1093/bib/bbv066 (2015). Shi, J. Y., Yiu, S. M., Li, Y. M., Leung, H. C. M. & Chin, F. Y. L. Predicting drug-target interaction for new drugs ...
computationally expensive, these algorithms have been shown to be accurate even for short sequences in the current pyrosequencing read length range (80-400 bp). However, the accuracy drops dramatically when phylogenetically close sequences are missing from the search databases. Running CARMA on a ...
BMC Bioinformatics volume 23, Article number: 452 (2022). Abstract. Background: In modern sequencing experiments, quickly and accurately identifying the sources of the reads is a crucial need. In metagenomics, where each read ...
Here we present a collection of curated and easily accessible sequence classification datasets in the field of genomics. The proposed collection is based on a combination of novel datasets constructed from the mining of publicly available databases and existing datasets obtained from published articles. ...
Several databases store and classify plant lncRNAs [3,38,39,41]. Among these, we wish to highlight the CANTATAdb v2.0 database, which contains 4080 lncRNA genes [41]. The annotations in CANTATAdb are based on ten A. thaliana transcriptomes and a robust annotation methodology, including identifying lncRNA...