NASCUP: Nucleic Acid Sequence Classification by Universal Probability

Kwon, Sunyoung; Kim, Gyuwan; Lee, Byunghan; Chun, Jongsik; Yoon, Sung Roh; Kim, Young-Han

DC Field	Value	Language
dc.contributor.author	Kwon, Sunyoung	-
dc.contributor.author	Kim, Gyuwan	-
dc.contributor.author	Lee, Byunghan	-
dc.contributor.author	Chun, Jongsik	-
dc.contributor.author	Yoon, Sung Roh	-
dc.contributor.author	Kim, Young-Han	-
dc.date.accessioned	2022-08-22T09:09:55Z	-
dc.date.available	2022-08-22T09:09:55Z	-
dc.date.created	2022-07-08	-
dc.date.issued	2021-11	-
dc.identifier.citation	IEEE Access, Vol.9, pp.162779-162791	-
dc.identifier.issn	2169-3536	-
dc.identifier.uri	https://hdl.handle.net/10371/184333	-
dc.description.abstract	Nucleic acid sequence classification is a fundamental task in the field of bioinformatics. Due to the increasing amount of unlabeled nucleotide sequences, fast and accurate classification of them on a large scale has become crucial. In this work, we developed NASCUP, a new classification method that captures statistical structures of nucleotide sequences by compact context-tree models and universal probability from information theory. A comprehensive experimental study involving nine public databases for functional non-coding RNA, microbial taxonomy and coding/non-coding RNA classification demonstrates the advantages of NASCUP over widely-used alternatives in efficiency, accuracy, and scalability across all datasets considered. NASCUP achieved BLAST-like classification accuracy consistently for several large-scale databases in orders-of-magnitude reduced runtime, and was applied to other bioinformatics tasks such as outlier detection and synthetic sequence generation.	-
dc.language	English	-
dc.publisher	Institute of Electrical and Electronics Engineers Inc.	-
dc.title	NASCUP: Nucleic Acid Sequence Classification by Universal Probability	-
dc.type	Article	-
dc.citation.journaltitle	IEEE Access	-
dc.identifier.wosid	000730449500001	-
dc.identifier.scopusid	2-s2.0-85119427713	-
dc.citation.endpage	162791	-
dc.citation.startpage	162779	-
dc.citation.volume	9	-
dc.description.isOpenAccess	Y	-
dc.contributor.affiliatedAuthor	Chun, Jongsik	-
dc.contributor.affiliatedAuthor	Yoon, Sung Roh	-
dc.type.docType	Article	-
dc.description.journalClass	1	-
dc.subject.keywordPlus	TREE WEIGHTING METHOD	-
dc.subject.keywordPlus	RNA GENE DATABASE	-
dc.subject.keywordPlus	PHYLOGENETIC CLASSIFICATION	-
dc.subject.keywordPlus	PREDICTION	-
dc.subject.keywordPlus	PROTEIN	-
dc.subject.keywordPlus	SEARCH	-
dc.subject.keywordAuthor	Context modeling	-
dc.subject.keywordAuthor	Markov processes	-
dc.subject.keywordAuthor	Hidden Markov models	-
dc.subject.keywordAuthor	Data models	-
dc.subject.keywordAuthor	Maximum likelihood estimation	-
dc.subject.keywordAuthor	Probability	-
dc.subject.keywordAuthor	Databases	-
dc.subject.keywordAuthor	Bioinformatics	-
dc.subject.keywordAuthor	context-tree models	-
dc.subject.keywordAuthor	information theory	-
dc.subject.keywordAuthor	sequence classification	-
dc.subject.keywordAuthor	universal probability	-

Appears in Collections:

College of Natural Sciences
- Dept. of Biological Sciences
  - Journal Papers

College of Engineering/Engineering Practice School
- Dept. of Electrical and Computer Engineering
  - Journal Papers

Files in This Item: There are no files associated with this item.
