Optimization and Machine Learning Algorithms for Condition-Specific Biological Network Construction and Analyses : 최적화 및 기계학습 알고리즘을 통한 조건 특이적 생물 네트워크의 구성과 분석

DC Field | Value | Language
dc.contributor.advisor | 김선 | -
dc.contributor.author | 이성민 | -
dc.date.accessioned | 2018-11-12T00:58:53Z | -
dc.date.available | 2018-11-12T00:58:53Z | -
dc.date.issued | 2018-08 | -
dc.identifier.other | 000000152132 | -
dc.identifier.uri | https://hdl.handle.net/10371/143215 | -
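dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School : College of Engineering, Department of Computer Science and Engineering, 2018. 8. 김선. | -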
dc.description.abstract | Network bioinformatics has been used successfully to help reveal the complex mechanisms of cells in research and industrial domains as diverse as medical science, the pharmaceutical industry, biological science, and agriculture. In network bioinformatics, a myriad of studies have relied on static associations between interacting nodes to construct networks or to discover novel knowledge from them. However, biological associations are condition-specific: they can change dynamically depending on the status of the cell. Thus, the need for algorithms that address these technical challenges has increased. In this context, I developed stochastic optimization (SO) and machine learning (ML) algorithms that use sequencing data capable of capturing the status of cells. The main focus of my work is on ensemble SO and ML approaches in network bioinformatics.

The first work is a miRNA-mRNA regulation inference algorithm called PlantMirnaT, which constructs a condition-specific miRNA-mRNA bipartite network. This is a challenging problem because it must consider expression data of two different types, miRNA and mRNA, and because the target relationship between miRNA and mRNA is not clear, especially when microarray data are used. Fortunately, thanks to low sequencing costs, small RNA and RNA sequencing are now performed routinely, which may make it possible to infer regulatory relationships more accurately. To fully leverage the power of sequencing data, I proposed a parameterized miRNA-mRNA expression model based on a novel idea named split-ratio, and used a genetic algorithm and the quasi-Newton method to determine optimal model parameters.
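The two-stage parameter search mentioned above (a genetic algorithm followed by a quasi-Newton method) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-parameter toy loss `objective`, its range [0, 1], and the helper names `genetic_search`, `quasi_newton_refine`, and `fit` are hypothetical and are not the split-ratio model or the PlantMirnaT implementation. The sketch only shows the general pattern of a coarse stochastic global search whose best candidate is then refined by a curvature-aware local step.

```python
import math
import random

def objective(r):
    # Hypothetical multimodal toy loss over one parameter r; a stand-in only,
    # NOT the thesis's miRNA-mRNA expression model.
    return (r - 0.3) ** 2 + 0.02 * math.sin(40.0 * r)

def gradient(r):
    # Analytic derivative of the toy loss above.
    return 2.0 * (r - 0.3) + 0.8 * math.cos(40.0 * r)

def genetic_search(rng, pop_size=40, generations=30, sigma=0.05):
    """Coarse global search on [0, 1]: tournament selection,
    blend crossover, Gaussian mutation, and elitism."""
    pop = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(pop, key=objective)]      # elitism: keep the best
        while len(children) < pop_size:
            a = min(rng.sample(pop, 3), key=objective)
            b = min(rng.sample(pop, 3), key=objective)
            w = rng.random()
            child = w * a + (1.0 - w) * b + rng.gauss(0.0, sigma)
            children.append(min(1.0, max(0.0, child)))  # clip to [0, 1]
        pop = children
    return min(pop, key=objective)

def quasi_newton_refine(r, tol=1e-6, max_iter=200):
    """1-D quasi-Newton (secant) refinement with Armijo backtracking."""
    g = gradient(r)
    h = 1.0                                       # initial curvature guess
    for _ in range(max_iter):
        if abs(g) < tol:
            break
        direction = -g / h
        step = 1.0
        # Shrink the step until the loss decreases sufficiently.
        while (step > 1e-12 and objective(r + step * direction)
               > objective(r) + 1e-4 * step * g * direction):
            step *= 0.5
        if step <= 1e-12:
            break                                 # line search failed; stop here
        r_new = r + step * direction
        g_new = gradient(r_new)
        curvature = (g_new - g) / (r_new - r)     # secant estimate of f''
        if curvature > 1e-3:
            h = curvature                         # keep h positive
        r, g = r_new, g_new
    return r

def fit(seed=0):
    rng = random.Random(seed)
    r_ga = genetic_search(rng)                    # stage 1: global, coarse
    return r_ga, quasi_newton_refine(r_ga)        # stage 2: local, fine

r_ga, r_final = fit()
print(f"GA estimate {r_ga:.4f} -> refined {r_final:.6f}, "
      f"loss {objective(r_final):.6f}")
```

The design choice sketched here is that the stochastic stage only needs to land in a good basin of a multimodal loss, while the secant-based stage supplies the precision that random mutation reaches slowly.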

The second work discovers functional subnetworks of the miRNA-mRNA regulation network that operate under specific biological conditions. These subnetworks are named functional miRNA-mRNA regulatory modules (MRMs). Mining functional MRMs has exponential time complexity, so a heuristic algorithm is needed to search for the optimal MRM set. My algorithm operates in two steps: 1) grouping and ordering the miRNAs and mRNAs to build per-sample matrices representing miRNA-mRNA regulations, and 2) determining maximum-sized modules from the structured miRNA-mRNA matrices.

The third work applies deep learning methods to network bioinformatics. Deep learning has shown great potential for addressing a variety of learning problems. However, deep learning technologies conventionally use grid-like structured data, so their application to the classification of human disease subtypes is yet to be explored. Recently, graph-based deep learning techniques have emerged, which creates an opportunity to leverage analyses in network biology. I propose a hybrid model that integrates two key components: 1) a graph convolution neural network (graph CNN) and 2) a relation network (RN). I use the graph CNN as a component to learn expression patterns of cooperative gene communities, and the RN as a component to learn associations between the learned patterns. The proposed model is applied to a synthetic dataset and to the PAM50 breast cancer subtype classification task, the standard breast cancer subtype classification of clinical utility. In experiments on both subtype classification and patient survival analysis, my algorithm achieves better results than previous methods from both quantitative and qualitative perspectives.
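The hybrid idea described above, a graph convolution step followed by a relation network, can be sketched as a forward pass in plain Python. This is an illustration under stated assumptions, not the model from the thesis: the graph convolution uses the common normalized-adjacency form ReLU(D^-1/2 (A + I) D^-1/2 X W) as a stand-in for the localized filtering used in the thesis, each node's hidden vector is treated as one RN "object", and the toy graph `ADJ`, features `X`, and all weights (`W_GC`, `W_G`, `W_F`) are hypothetical fixed values rather than learned parameters. The relation network follows the generic form f(sum over pairs of g(o_i, o_j)).

```python
import math

def matmul(a, b):
    # Plain-Python matrix product for small lists of rows.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def normalized_adjacency(adj):
    # D^-1/2 (A + I) D^-1/2: add self-loops, then symmetric normalization.
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def graph_conv(adj, x, w):
    # One graph convolution layer with ReLU: mixes each node with its neighbors.
    h = matmul(matmul(normalized_adjacency(adj), x), w)
    return [[max(0.0, v) for v in row] for row in h]

def relation_network(objs, w_g, w_f):
    # g: one linear layer + ReLU on each ordered pair (o_i, o_j), summed;
    # f: one linear layer + softmax on the pooled relation vector.
    pooled = [0.0] * len(w_g[0])
    for o_i in objs:
        for o_j in objs:
            pair = [o_i + o_j]                    # concatenate the two objects
            g = [max(0.0, v) for v in matmul(pair, w_g)[0]]
            pooled = [p + v for p, v in zip(pooled, g)]
    logits = matmul([pooled], w_f)[0]
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy inputs: a 4-gene path graph with one expression value each.
ADJ = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
X = [[0.5], [1.2], [0.1], [2.0]]
W_GC = [[0.8, 0.3]]                               # 1 feature -> 2 hidden units
W_G = [[0.2, -0.1, 0.4], [0.3, 0.5, -0.2],
       [-0.4, 0.1, 0.3], [0.1, 0.2, 0.2]]         # pair (4 values) -> 3 units
W_F = [[0.5, -0.5], [0.1, 0.3], [-0.2, 0.4]]      # 3 units -> 2 classes

def predict(adj, x):
    return relation_network(graph_conv(adj, x, W_GC), W_G, W_F)

print(predict(ADJ, X))                            # two class probabilities
```

Because the relation network sums over all ordered pairs, the output of this sketch does not depend on the order in which genes are listed, which is one reason pairwise pooling suits graph-structured inputs.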
-
dc.description.tableofcontents | Abstract
Chapter 1 Introduction
1.1 Biological Background
1.1.1 Protein Coding Genes
1.1.2 MicroRNA and Regulation of Gene Expression
1.2 Issues and Solutions
1.3 Outline of the Dissertation
Chapter 2 PlantMirnaT: miRNA and mRNA Integrated Analysis Fully Utilizing Characteristics of Plant Sequencing Data
2.1 Introduction
2.1.1 Related Works
2.1.2 Motivation and the Proposed Approach
2.2 Methods
2.2.1 Description of Overall Learning Algorithm
2.2.2 Finding Putative miRNA-mRNA Target Pairs using Plant-specific Target Sequence Match Characteristics
2.2.3 Filtering Out Target Relationships Using Pearson's Correlation Coefficient and Site Accessibility
2.2.4 A Global miRNA-mRNA Regulation Model with Split-Ratio Calculation
2.2.5 Genetic Algorithm Based Method to Find Globally Adequate Split Ratio
2.2.6 Quasi-Newton Method to Finely Tune the Split Ratio
2.3 Results and Discussion
2.3.1 Data
2.3.2 Results from the Integrated Analysis of mRNAs and miRNAs
2.3.3 miRNA1425, miRNA398, and miRNA408 Characterize Drought Resistant Phenotypes in Vandana
2.3.4 Two Glycolysis Related Enzyme Coding Genes Seem to be Related with Drought Tolerance in Vandana
2.3.5 Comparison with Existing Methods
2.3.6 Highly Expressed miRNAs with Small Fold-Change Values Can Have Strong Regulation Effect
2.4 Conclusion
Chapter 3 Iterative Segmented Least Square Method for Functional microRNA-mRNA Module Discovery in Breast Cancer
3.1 Introduction
3.2 Motivation
3.3 Methods
3.3.1 Materials and Data
3.3.2 Overview of the Method
3.3.3 Step 1: Construction of Structured miRNA-mRNA Matrices
3.3.4 Step 2: Matrix Segmentation Algorithm
3.4 Results and Discussion
3.4.1 miRNA-mRNA Modules Can Explain Survival and Subtype of Breast Cancer
3.4.2 Biological Role of Top Modules in the CART
3.4.3 Comparison and Validation of Module Structure
3.5 Conclusion
Chapter 4 Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification
4.1 Introduction
4.2 Related Work
4.2.1 Deep Learning on Graphs
4.2.2 Relation Reasoning
4.2.3 Methods
4.2.4 Localized Pattern Representation by Graph Convolution Neural Network
4.2.5 Learning Relation Between Graph Entities Using Relation Network
4.2.6 Merging Graph Convolution Layer and Relation Network
4.3 Results and Discussion
4.3.1 Synthetic Experiment
4.3.2 Dataset
4.3.3 Comparison of Classification Performance
4.3.4 Consistency of t-SNE Visualization and PAM50 Subtype Prognosis
4.3.5 Survival Analysis
4.4 Conclusion
Chapter 5 Conclusion
References
Abstract (in Korean)
-
dc.language.iso | en | -
dc.publisher | Seoul National University Graduate School | -
dc.subject.ddc | 621.39 | -
dc.title | Optimization and Machine Learning Algorithms for Condition-Specific Biological Network Construction and Analyses | -
dc.title.alternative | 최적화 및 기계학습 알고리즘을 통한 조건 특이적 생물 네트워크의 구성과 분석 | -
dc.type | Thesis | -
dc.description.degree | Doctor | -
dc.contributor.affiliation | College of Engineering, Department of Computer Science and Engineering | -
dc.date.awarded | 2018-08 | -

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
