
Graph pyramids for protein function prediction

DC Field | Value | Language
dc.contributor.author | Sandhan, Tushar | -
dc.contributor.author | Yoo, Youngjun | -
dc.contributor.author | Choi, Jin Young | -
dc.contributor.author | Kim, Sun | -
dc.date.accessioned | 2017-02-10T02:09:05Z | -
dc.date.available | 2017-03-16T17:00:45Z | -
dc.date.issued | 2015-05-29 | -
dc.identifier.citation | BMC Medical Genomics, 8(Suppl 2):S12 | ko_KR
dc.identifier.uri | http://hdl.handle.net/10371/100678 | -
dc.description | This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. | ko_KR
dc.description.abstract | Abstract

Background
Uncovering the hidden organizational characteristics and regularities among biological sequences is key to a detailed understanding of an underlying biological phenomenon. Pattern recognition from nucleic acid sequences is therefore an important task for protein function prediction. Because proteins from the same family exhibit similar characteristics, homology-based approaches predict protein function via protein classification. However, conventional classification approaches rely mostly on global features, considering only strong protein similarity matches, which leads to a significant loss of prediction accuracy.


Methods
Here we construct the Protein-Protein Similarity (PPS) network, which captures the subtle properties of protein families. The proposed method considers local as well as global features by examining the interactions among 'weakly interacting proteins' in the PPS network and by using hierarchical graph analysis via the graph pyramid. Different underlying properties of the protein families are uncovered by applying the proposed graph-based features at various pyramid levels.


Results
Experimental results on benchmark data sets show that the proposed hierarchical voting algorithm using the graph pyramid improves both computational efficiency and protein classification accuracy. Quantitatively, among 14,086 test sequences, the proposed method misclassified only 21.1 sequences on average, whereas the baseline global feature matching method based on BLAST scores misclassified 362.9 sequences. With each correctly classified test sequence, the fast incremental learning ability of the proposed method further enhances the training model. It thus achieved more than 96% protein classification accuracy using only 20% of the training data per class.
ko_KR
dc.language.iso | en | ko_KR
dc.publisher | BioMed Central | ko_KR
dc.title | Graph pyramids for protein function prediction | ko_KR
dc.type | Article | ko_KR
dc.contributor.AlternativeAuthor | 유영준 | -
dc.contributor.AlternativeAuthor | 최진영 | -
dc.contributor.AlternativeAuthor | 김선 | -
dc.identifier.doi | 10.1186/1755-8794-8-S2-S12 | -
dc.language.rfc3066 | en | -
dc.rights.holder | Sandhan et al.; licensee BioMed Central Ltd. | -
dc.date.updated | 2017-01-06T10:26:18Z | -
Appears in Collections:
College of Engineering/Engineering Practice School (공과대학/대학원) > Dept. of Computer Science and Engineering (컴퓨터공학부) > Journal Papers (저널논문_컴퓨터공학부)

