CRFalign: A Sequence-Structure Alignment of Proteins Based on a Combination of HMM-HMM Comparison and Conditional Random Fields

Cited 1 time in Web of Science Cited 1 time in Scopus
Authors

Lee, Sung Jong; Joo, Kee Hyoung; Sim, Sang Jin; Lee, Ju Yong; Lee, In Ho; Lee, Joo Young

Issue Date
2022-06
Publisher
Multidisciplinary Digital Publishing Institute (MDPI)
Citation
Molecules, Vol.27 No.12, p. 3711
Abstract
Sequence-structure alignment of protein sequences is an important task in the template-based modeling of 3D protein structures. Building a reliable sequence-structure alignment is challenging, especially for remote homologue target proteins. We built a sequence-structure alignment method called CRFalign, which improves upon a base alignment model based on HMM-HMM comparison by employing pairwise conditional random fields in combination with nonlinear scoring functions of structural and sequence features. The nonlinear scoring part is implemented by a set of gradient boosted regression trees. In addition to sequence profile features, various position-dependent structural features are employed, including secondary structures and solvent accessibilities. Training is performed on reference alignments at the superfamily or twilight-zone level chosen from the SABmark benchmark set. We found that the CRFalign method produces a relative improvement in average alignment accuracy on validation sets of the SABmark benchmark. We also tested CRFalign on 51 sequence-structure pairs involving 15 free-modeling (FM) target domains of CASP14, where CRFalign leads to an improvement in average modeling accuracy on these hard targets (TM-CRFalign ≈ 42.94%) compared with HHalign (TM-HHalign ≈ 39.05%) and MRFalign (TM-MRFalign ≈ 36.93%). CRFalign was incorporated into our template search framework, called CRFpred, and was tested on a random set of 300 target proteins consisting of Easy, Medium and Hard sets, where it showed reasonable template search performance.
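As a rough illustration of the alignment step only, the sketch below runs a standard global dynamic-programming alignment over a matrix of position-pair scores. It is not the CRFalign implementation: in the abstract the scores come from gradient boosted regression trees inside a pairwise conditional random field, whereas here the `score` matrix, the function name `align`, and the linear `gap` penalty are all hypothetical stand-ins chosen for brevity.

```python
from typing import List, Tuple


def align(score: List[List[float]], gap: float = -1.0) -> Tuple[float, List[Tuple[int, int]]]:
    """Global alignment of a target (rows) to a template (columns).

    score[i][j] is an assumed, precomputed score for matching target
    position i with template position j; gap is an assumed linear penalty.
    Returns the best total score and the list of matched (i, j) pairs.
    """
    n = len(score)
    m = len(score[0]) if n else 0

    # dp[i][j]: best score for the first i target and first j template positions.
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score[i - 1][j - 1],  # match i-1 with j-1
                dp[i - 1][j] + gap,                      # target position unaligned
                dp[i][j - 1] + gap,                      # template position unaligned
            )

    # Trace back from the corner to recover the matched pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if dp[i][j] == dp[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif dp[i][j] == dp[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs


if __name__ == "__main__":
    # Toy scores for a 2-residue target against a 3-residue template.
    toy = [[2.0, -1.0, -1.0],
           [-1.0, 2.0, -1.0]]
    print(align(toy))
```

In a setting like the one the abstract describes, the toy matrix would be replaced by learned, position-dependent scores built from sequence profiles, secondary structure and solvent accessibility; the dynamic-programming skeleton is the only part this sketch is meant to convey.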
ISSN
1420-3049
URI
https://hdl.handle.net/10371/201505
DOI
https://doi.org/10.3390/molecules27123711
Related Researcher

  • Graduate School of Convergence Science & Technology
  • Dept. of Molecular and Biopharmaceutical Sciences
Research Area: AI models for drug discovery, Free energy calculation, Molecular dynamics

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.

Share