Denoising and Interaction Learning of Biological Data
Korean title: 생체 자료 오류 정정 및 관계 학습 (Error Correction and Relationship Learning of Biological Data)
- College of Engineering, Department of Electrical and Computer Engineering
- Issue Date
- Seoul National University Graduate School
- machine learning; deep learning; end-to-end learning; parallelization; sequence error; sequence interaction; time series; miRNA target
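- Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, February 2018. 윤성로 (Sungroh Yoon).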
- Since the completion of the Human Genome Project, an enormous amount of biological data has been accumulated in an effort to understand human biological mechanisms. However, errors introduced during sequencing, together with the still-unrevealed inherent features of biological data needed to infer their interactions, create a need for large-scale data-driven applications. In this regard, this dissertation exploits recent advances in machine learning and artificial intelligence that have proven successful in time-series sequence learning, including natural language processing and neural machine translation, to improve the reliability and computational performance of investigating
This dissertation discusses three issues in sequence analysis and proposes methodologies to overcome them. First, to alleviate the error-prone nature of sequence reads from next-generation sequencing (NGS), we present an information-theoretic approach for correcting sequence errors from various sequencers. Next, we present a generalized sequence denoiser accelerated on multiple graphics processing units (GPUs) to address the computational challenges of denoising high-throughput sequences. Finally, we describe an end-to-end machine learning framework for robust sequence (e.g., miRNA) target prediction that boosts sensitivity without laborious manual feature extraction.
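As a rough illustration of what "end-to-end" learning without manual feature extraction can mean in practice, the sketch below turns raw RNA strings directly into a numeric input matrix that a model could consume. It is not the dissertation's method: the one-hot scheme, the names `RNA_ALPHABET`, `one_hot`, and `encode_pair`, and the example sequences are all assumptions made for illustration only.

```python
# Illustrative sketch only; the names and the encoding scheme are hypothetical,
# not taken from the dissertation.
RNA_ALPHABET = "ACGU"


def one_hot(seq, alphabet=RNA_ALPHABET):
    """Encode a sequence as a list of one-hot rows.

    Symbols outside the alphabet (e.g. an ambiguous 'N') become all-zero rows.
    """
    index = {base: i for i, base in enumerate(alphabet)}
    rows = []
    for base in seq.upper():
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def encode_pair(mirna, site):
    """Stack the encodings of a miRNA and a candidate target site.

    The result is a (len(mirna) + len(site)) x 4 matrix built from raw
    sequence alone, with no hand-crafted features such as seed-match type.
    """
    return one_hot(mirna) + one_hot(site)


if __name__ == "__main__":
    # Made-up toy sequences, used only to show the shape of the output.
    matrix = encode_pair("UGAGG", "CCUCA")
    print(len(matrix), "rows x", len(matrix[0]), "columns")
```

A learned model would take such a matrix as input and infer discriminative patterns itself, which is the contrast with pipelines that first compute hand-designed features from each sequence pair.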
In summary, this dissertation proposes a set of machine-learning-based methodologies for handling biological sequences, with the aim of improving the reliability of downstream analysis.