NASCUP: Nucleic Acid Sequence Classification by Universal Probability
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kwon, Sunyoung | - |
dc.contributor.author | Kim, Gyuwan | - |
dc.contributor.author | Lee, Byunghan | - |
dc.contributor.author | Chun, Jongsik | - |
dc.contributor.author | Yoon, Sung Roh | - |
dc.contributor.author | Kim, Young-Han | - |
dc.date.accessioned | 2022-08-22T09:09:55Z | - |
dc.date.available | 2022-08-22T09:09:55Z | - |
dc.date.created | 2022-07-08 | - |
dc.date.issued | 2021-11 | - |
dc.identifier.citation | IEEE Access, Vol.9, pp.162779-162791 | - |
dc.identifier.issn | 2169-3536 | - |
dc.identifier.uri | https://hdl.handle.net/10371/184333 | - |
dc.description.abstract | Nucleic acid sequence classification is a fundamental task in the field of bioinformatics. Due to the increasing amount of unlabeled nucleotide sequences, fast and accurate classification of them on a large scale has become crucial. In this work, we developed NASCUP, a new classification method that captures statistical structures of nucleotide sequences by compact context-tree models and universal probability from information theory. A comprehensive experimental study involving nine public databases for functional non-coding RNA, microbial taxonomy and coding/non-coding RNA classification demonstrates the advantages of NASCUP over widely-used alternatives in efficiency, accuracy, and scalability across all datasets considered. NASCUP achieved BLAST-like classification accuracy consistently for several large-scale databases in orders-of-magnitude reduced runtime, and was applied to other bioinformatics tasks such as outlier detection and synthetic sequence generation. | - |
dc.language | English | - |
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | - |
dc.title | NASCUP: Nucleic Acid Sequence Classification by Universal Probability | - |
dc.type | Article | - |
dc.citation.journaltitle | IEEE Access | - |
dc.identifier.wosid | 000730449500001 | - |
dc.identifier.scopusid | 2-s2.0-85119427713 | - |
dc.citation.endpage | 162791 | - |
dc.citation.startpage | 162779 | - |
dc.citation.volume | 9 | - |
dc.description.isOpenAccess | Y | - |
dc.contributor.affiliatedAuthor | Chun, Jongsik | - |
dc.contributor.affiliatedAuthor | Yoon, Sung Roh | - |
dc.type.docType | Article | - |
dc.description.journalClass | 1 | - |
dc.subject.keywordPlus | TREE WEIGHTING METHOD | - |
dc.subject.keywordPlus | RNA GENE DATABASE | - |
dc.subject.keywordPlus | PHYLOGENETIC CLASSIFICATION | - |
dc.subject.keywordPlus | PREDICTION | - |
dc.subject.keywordPlus | PROTEIN | - |
dc.subject.keywordPlus | SEARCH | - |
dc.subject.keywordAuthor | Context modeling | - |
dc.subject.keywordAuthor | Markov processes | - |
dc.subject.keywordAuthor | Hidden Markov models | - |
dc.subject.keywordAuthor | Data models | - |
dc.subject.keywordAuthor | Maximum likelihood estimation | - |
dc.subject.keywordAuthor | Probability | - |
dc.subject.keywordAuthor | Databases | - |
dc.subject.keywordAuthor | Bioinformatics | - |
dc.subject.keywordAuthor | context-tree models | - |
dc.subject.keywordAuthor | information theory | - |
dc.subject.keywordAuthor | sequence classification | - |
dc.subject.keywordAuthor | universal probability | - |
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.