Detailed Information
Optimization and Machine Learning Algorithms for Condition-Specific Biological Network Construction and Analyses : 최적화 및 기계학습 알고리즘을 통한 조건 특이적 생물 네트워크의 구성과 분석
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 김선 | - |
dc.contributor.author | 이성민 | - |
dc.date.accessioned | 2018-11-12T00:58:53Z | - |
dc.date.available | 2018-11-12T00:58:53Z | - |
dc.date.issued | 2018-08 | - |
dc.identifier.other | 000000152132 | - |
dc.identifier.uri | https://hdl.handle.net/10371/143215 | - |
dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School : College of Engineering, Department of Computer Science and Engineering, 2018. 8. 김선. | - |
dc.description.abstract | Network bioinformatics has been successfully used to help reveal the complex mechanisms of cells in research and industrial domains across diverse fields such as medical science, the pharmaceutical industry, biological science, and agriculture. In network bioinformatics, a myriad of studies have relied on static associations between interacting nodes to construct networks or to discover novel knowledge from them. However, biological associations are condition-specific, which means that they can change dynamically depending on the status of the cell. Thus, the need for algorithms that address these technical challenges has increased. In this context, I developed stochastic optimization (SO) and machine learning (ML) algorithms that use sequencing data capable of capturing the status of cells. The main focus of my work is on ensemble SO and ML approaches for network bioinformatics. The first work is a miRNA-mRNA regulation inference algorithm called PlantMirnaT, which constructs a condition-specific miRNA-mRNA bipartite network. This is a challenging problem because it must consider expression data of two different types, miRNA and mRNA, and because the target relationship between miRNA and mRNA is not clear, especially when microarray data are used. Fortunately, owing to low sequencing costs, small RNA and RNA sequencing are now routinely performed, which may make it possible to infer regulatory relationships more accurately. To fully leverage the power of sequencing data, I proposed a parameterized miRNA-mRNA expression model based on a novel idea named split-ratio, and used a genetic algorithm and the quasi-Newton method to determine optimal model parameters. The second work is to discover functional subnetworks of the miRNA-mRNA regulation network that operate in a specific biological condition. These subnetworks are named functional miRNA-mRNA regulatory modules (MRMs). Mining functional MRMs has exponential time complexity, so a heuristic algorithm is needed to discover an optimal MRM set. My algorithm operates in two steps: 1) grouping and ordering the miRNAs and mRNAs to build per-sample matrices representing miRNA-mRNA regulations, and 2) determining maximum-sized modules from the structured miRNA-mRNA matrices. The third work applies deep learning methods to network bioinformatics. Deep learning has shown great potential for addressing various learning problems. However, deep learning technologies conventionally use grid-like structured data, so their application to the classification of human disease subtypes is yet to be explored. Recently, graph-based deep learning techniques have emerged, which creates an opportunity to leverage analyses in network biology. I propose a hybrid model that integrates two key components: 1) a graph convolution neural network (graph CNN) and 2) a relation network (RN). I use the graph CNN to learn expression patterns of cooperative gene communities, and the RN to learn associations between the learned patterns. The proposed model is applied to a synthetic dataset and to the PAM50 breast cancer subtype classification task, the standard breast cancer subtype classification of clinical utility. In experiments on both subtype classification and patient survival analysis, my algorithm achieves better results than previous methods from both quantitative and qualitative perspectives. | - |
dc.description.tableofcontents | Abstract
Chapter 1 Introduction; 1.1 Biological Background; 1.1.1 Protein Coding Genes; 1.1.2 MicroRNA and Regulation of Gene Expression; 1.2 Issues and Solutions; 1.3 Outline of the Dissertation
Chapter 2 PlantMirnaT: miRNA and mRNA Integrated Analysis Fully Utilizing Characteristics of Plant Sequencing Data; 2.1 Introduction; 2.1.1 Related Works; 2.1.2 Motivation and the Proposed Approach; 2.2 Methods; 2.2.1 Description of Overall Learning Algorithm; 2.2.2 Finding Putative miRNA-mRNA Target Pairs using Plant-specific Target Sequence Match Characteristics; 2.2.3 Filtering Out Target Relationships Using Pearson's Correlation Coefficient and Site Accessibility; 2.2.4 A Global miRNA-mRNA Regulation Model with Split-Ratio Calculation; 2.2.5 Genetic Algorithm Based Method to Find Globally Adequate Split Ratio; 2.2.6 Quasi-Newton Method to Finely Tune the Split Ratio; 2.3 Results and Discussion; 2.3.1 Data; 2.3.2 Results from the Integrated Analysis of mRNAs and miRNAs; 2.3.3 miRNA1425, miRNA398, and miRNA408 Characterize Drought Resistant Phenotypes in Vandana; 2.3.4 Two Glycolysis Related Enzyme Coding Genes Seem to Be Related with Drought Tolerance in Vandana; 2.3.5 Comparison with Existing Methods; 2.3.6 Highly Expressed miRNAs with Small Fold-Change Values Can Have Strong Regulation Effect; 2.4 Conclusion
Chapter 3 Iterative Segmented Least Square Method for Functional microRNA-mRNA Module Discovery in Breast Cancer; 3.1 Introduction; 3.2 Motivation; 3.3 Methods; 3.3.1 Materials and Data; 3.3.2 Overview of the Method; 3.3.3 Step 1: Construction of Structured miRNA-mRNA Matrices; 3.3.4 Step 2: Matrix Segmentation Algorithm; 3.4 Results and Discussion; 3.4.1 miRNA-mRNA Modules Can Explain Survival and Subtype of Breast Cancer; 3.4.2 Biological Role of Top Modules in the CART; 3.4.3 Comparison and Validation of Module Structure; 3.5 Conclusion
Chapter 4 Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification; 4.1 Introduction; 4.2 Related Work; 4.2.1 Deep Learning on Graphs; 4.2.2 Relation Reasoning; 4.2.3 Methods; 4.2.4 Localized Pattern Representation by Graph Convolution Neural Network; 4.2.5 Learning Relation Between Graph Entities Using Relation Network; 4.2.6 Merging Graph Convolution Layer and Relation Network; 4.3 Results and Discussion; 4.3.1 Synthetic Experiment; 4.3.2 Dataset; 4.3.3 Comparison of Classification Performance; 4.3.4 Consistency of tSNE Visualization and PAM50 Subtype Prognosis; 4.3.5 Survival Analysis; 4.4 Conclusion
Chapter 5 Conclusion
References
Abstract (in Korean) | - |
dc.language.iso | en | - |
dc.publisher | Seoul National University Graduate School | - |
dc.subject.ddc | 621.39 | - |
dc.title | Optimization and Machine Learning Algorithms for Condition-Specific Biological Network Construction and Analyses | - |
dc.title.alternative | 최적화 및 기계학습 알고리즘을 통한 조건 특이적 생물 네트워크의 구성과 분석 | - |
dc.type | Thesis | - |
dc.description.degree | Doctor | - |
dc.contributor.affiliation | College of Engineering, Department of Computer Science and Engineering | - |
dc.date.awarded | 2018-08 | - |
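
The abstract in this record describes a two-stage parameter search: a genetic algorithm finds a globally adequate setting, and a quasi-Newton method then fine-tunes it. The sketch below illustrates that general pattern only. It does not reproduce the thesis's split-ratio model, which this record does not specify: the `loss` function, its two parameters, the `[0, 1]` search range, and all function names and settings are hypothetical stand-ins chosen for illustration, and BFGS is used as one common quasi-Newton variant.

```python
import math
import random

def loss(x):
    """Hypothetical multimodal loss; its global minimum is at (0.3, 0.3)."""
    return sum((v - 0.3) ** 2 + 0.02 * (1 - math.cos(12 * math.pi * (v - 0.3)))
               for v in x)

def grad(x):
    """Analytic gradient of the toy loss above."""
    return [2 * (v - 0.3) + 0.24 * math.pi * math.sin(12 * math.pi * (v - 0.3))
            for v in x]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def genetic_search(f, dim=2, pop_size=40, generations=60, seed=0):
    """Stage 1: coarse global search over [0, 1]^dim (assumed search range)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=f)
        nxt = pop[:2]  # elitism: carry over the two best candidates
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=f)  # tournament selection
            b = min(rng.sample(pop, 3), key=f)
            w = rng.random()  # blend crossover weight
            child = [w * u + (1 - w) * v for u, v in zip(a, b)]
            # Gaussian mutation, clipped back into [0, 1]
            child = [min(1.0, max(0.0, c + rng.gauss(0, 0.05))) for c in child]
            nxt.append(child)
        pop = nxt
    return min(pop, key=f)

def bfgs(f, g, x0, iters=100, tol=1e-8):
    """Stage 2: quasi-Newton (BFGS) refinement with a backtracking line search."""
    n = len(x0)
    x, gx = list(x0), g(x0)
    H = [[float(i == j) for j in range(n)] for i in range(n)]  # inverse-Hessian estimate
    for _ in range(iters):
        if math.sqrt(dot(gx, gx)) < tol:
            break
        p = [-dot(H[i], gx) for i in range(n)]  # quasi-Newton search direction
        slope = dot(p, gx)
        if slope >= 0:  # not a descent direction: reset to steepest descent
            H = [[float(i == j) for j in range(n)] for i in range(n)]
            p = [-v for v in gx]
            slope = -dot(gx, gx)
        t, fx = 1.0, f(x)
        # Halve the step until the Armijo sufficient-decrease condition holds.
        while t > 1e-10 and f([a + t * b for a, b in zip(x, p)]) > fx + 1e-4 * t * slope:
            t *= 0.5
        if t <= 1e-10:  # no acceptable step found; stop at the current point
            break
        x_new = [a + t * b for a, b in zip(x, p)]
        g_new = g(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, gx)]
        sy = dot(s, y)
        if sy > 1e-14:  # update H only when the curvature condition holds
            rho = 1.0 / sy
            Hy = [dot(H[i], y) for i in range(n)]
            c = rho * rho * dot(y, Hy) + rho
            H = [[H[i][j] - rho * (s[i] * Hy[j] + Hy[i] * s[j]) + c * s[i] * s[j]
                  for j in range(n)] for i in range(n)]
        x, gx = x_new, g_new
    return x

coarse = genetic_search(loss)
refined = bfgs(loss, grad, coarse)
print("GA estimate:    ", coarse, loss(coarse))
print("BFGS refinement:", refined, loss(refined))
```

The division of labor is the point of the sketch: the population-based stage is tolerant of the many local minima in the toy loss but converges slowly near an optimum, while the gradient-based stage converges quickly but only within the basin it starts in.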
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.