In this paper, we present iDNAProt-ES, a DNA-binding protein prediction method that uses both sequence-based evolutionary features and structure-based features of proteins to identify their DNA-binding functionality. We used recursive feature elimination to extract an optimal set of features and train ...
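As a rough illustration of the recursive feature elimination step mentioned in this snippet, the sketch below repeatedly fits a model and drops the feature with the smallest absolute weight. The snippet does not say which estimator or feature set the authors used, so the logistic regression, the function names `fit_logistic` and `recursive_feature_elimination`, and the toy data are all assumptions made for the example.

```python
import math

def fit_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression by batch gradient descent; return the weights.

    The estimator is an assumption for this sketch; the bias is fitted but not returned.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted probability minus label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w

def recursive_feature_elimination(X, y, n_keep):
    """Refit on the surviving features and drop the weakest one until n_keep remain."""
    active = list(range(len(X[0])))  # original column indices still in play
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = fit_logistic(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[weakest]
    return active

# Toy data (hypothetical): column 0 separates the classes, columns 1 and 2 do not.
X = [
    [1.0, 0.5, -0.5], [1.0, -0.5, 0.5], [1.0, 0.5, 0.5], [1.0, -0.5, -0.5],
    [-1.0, 0.5, -0.5], [-1.0, -0.5, 0.5], [-1.0, 0.5, 0.5], [-1.0, -0.5, -0.5],
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
selected = recursive_feature_elimination(X, y, n_keep=1)
```

On this toy data only the informative column survives. In practice the number of features to keep would normally be chosen by cross-validation rather than fixed in advance.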
Protein-DNA Interactions Prediction of DNA Binding Residues and DNA Binding Proteins The binding of DNA to a protein is highly specific: DNA binds to a particular region of the protein, usually termed the binding site, defined by the group of residues that are essential for the biological ...
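Convolutional neural network architectures for predicting DNA–protein binding One of the greatest advantages of CNNs for genomics research is that they can detect whether a given motif (a cluster of similar secondary-structure elements in a protein molecule that has a specific function or forms part of an independent structural domain) is present within a specified sequence window. This detection ability is very useful for motif identification, which in turn helps with the classification of binding sites. Abstract: We ...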
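The motif-detection idea in this snippet can be sketched as a single convolutional filter slid along a one-hot encoded sequence, followed by global max pooling. This is a minimal illustration and not the architecture from the paper: the fixed `TATA_KERNEL` filter, the threshold, and the function names are hypothetical, and a trained CNN would learn its filter weights from data.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows, one row per base."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def conv1d_scan(seq, kernel):
    """Slide a (width x 4) kernel along the sequence; return one score per offset."""
    x = one_hot(seq)
    width = len(kernel)
    return [
        sum(kernel[i][c] * x[pos + i][c] for i in range(width) for c in range(4))
        for pos in range(len(x) - width + 1)
    ]

def motif_in_window(seq, kernel, threshold):
    """Global max pooling plus a threshold: is the motif anywhere in the window?"""
    scores = conv1d_scan(seq, kernel)
    return bool(scores) and max(scores) >= threshold

# Hypothetical hand-written filter that scores one point per base matching "TATA".
TATA_KERNEL = one_hot("TATA")

scores = conv1d_scan("GGTATAGG", TATA_KERNEL)
found = motif_in_window("GGTATAGG", TATA_KERNEL, threshold=4.0)
```

Because max pooling discards the position of the best match, the output only says whether the motif occurs in the window, which matches the "is the motif present in this window" question described above.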
In addition, these methods are not accurate enough in predicting DNA-protein binding sites from the DNA sequence. In this study, we employ the bidirectional long short-term memory (BLSTM) and CNN to capture long-term dependencies between the sequence motifs in DNA, which is called ...
HMMBinder uses monogram and bigram features extracted from the HMM profiles of the protein sequences. To the best of our knowledge, this is the first application of HMM-profile-based features to the DNA-binding protein prediction problem. We applied Support Vector Machines (SVM) as a ...
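To make the monogram and bigram features concrete, the sketch below computes them from a profile matrix with one row per residue. The snippet does not give the exact formulas, so this follows one common definition (column means for monograms, products of columns at consecutive positions for bigrams, both scaled by the sequence length), which is an assumption here; the function name and the toy two-column profile are hypothetical.

```python
def monogram_bigram_features(profile):
    """Return monogram features followed by flattened bigram features.

    profile: list of L rows, one per residue, each with d column values
    (d = 20 for a standard amino-acid profile, giving 20 + 400 features).
    """
    length, d = len(profile), len(profile[0])
    # Monogram: mean of each profile column over all positions (d values).
    monogram = [sum(row[a] for row in profile) / length for a in range(d)]
    # Bigram: for each ordered column pair (a, b), sum the product of column a
    # at position i and column b at position i + 1, scaled by 1 / L (d * d values).
    bigram = [
        sum(profile[i][a] * profile[i + 1][b] for i in range(length - 1)) / length
        for a in range(d)
        for b in range(d)
    ]
    return monogram + bigram

# Toy profile with 3 positions and 2 columns, so 2 + 4 = 6 features.
toy_profile = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
features = monogram_bigram_features(toy_profile)
```

The resulting vector has a fixed length regardless of the protein's length, which is what lets it be fed to a classifier such as an SVM.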
Because the zinc fingers of TFIIIA are known to bind to DNA, it is probable that in the case of SWI5 these finger motifs also play an important, but not necessarily exclusive, role in the sequence-specific binding of the protein to DNA. To test this prediction we have expressed the 89-...
Finally, the prediction step can be implemented using a deep learning algorithm that uses the extracted features to classify the protein. The main contributions of this work can be summarized as follows: The rest of this article is organized as follows: The second ...
Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence—the structure prediction component of the ‘protein folding problem’ 8 —has been an important open research problem for more than 50 years 9 . Despite recent progress 10 – 14 , ...
To achieve highly accurate protein 3D structure prediction, the original ESM2 is constructed on ∼65 million protein sequences, which is almost 382 times the size of the training set of this study. To fit such a large amount of data, the largest ESM2 model has a parameter count of 15 billion. Altho...
Abstract: Increased knowledge of DNA-binding proteins would enhance our understanding of protein functions in cellular biological processes. To handle the explosive growth of protein sequenc... Keywords: DNA-binding protein prediction; Random forest; Local evolutionary information; Machine learning-based...