Graph- and kernel-based integrative analyses of multi-layers of heterogeneous genomic data
이종 다계층 유전체 정보의 그래프 기반 통합 및 커널 기반 통합 분석 연구
- College of Medicine, Department of Medicine
- Seoul National University Graduate School
- Integrative analysis; Multi-layers of genomic data; Clinical outcome prediction; Glioblastoma multiforme; Serous cystadenocarcinoma
- Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Medicine, Molecular Genomic Medicine major, 2013. 2. 김주한.
- Introduction: Cancer is a complex disease in which cellular processes can be dysregulated through multiple mechanisms. No single level of genomic data can therefore fully explain tumor behavior, because many genomic variations occur within and between the levels of a biological system, such as copy number variants, DNA methylation, alternative splicing, miRNA regulation, and post-translational modification. Heterogeneous types of data generated at multiple molecular levels, from genome to phenome, are now increasingly available (e.g., TCGA, The Cancer Genome Atlas).
Methods: Given multiple levels of data, information passed from one level to another may provide clues that help uncover unknown biological knowledge. Integrating different levels of data can thus aid in extracting new knowledge by drawing an integrative conclusion from the many pieces of information collected from diverse types of genomic data. Further work is expected to focus on how to use inter-relations, that is, the relations between different levels: from the genome to the epigenome, transcriptome, and proteome, and extending to the phenome. In this study, prototype research schemes for the integrative analysis of multi-layered, heterogeneous genomic data are introduced and discussed.
Results: These schemes were exemplified with pilot experimental results on predicting cancer clinical outcomes from TCGA data. For glioblastoma multiforme, integrating multiple layers of genomic data gave a better area under the receiver operating characteristic curve (AUC) for all clinical outcomes, ranging from 0.876 for survival to 0.832 for recurrence. The integration approach likewise achieved better AUCs for all clinical outcomes in ovarian cancer, ranging from 0.787 to 0.893. In addition, the accuracy of the prediction model increased when inter-relationships were included, because the model then combines the genomic data from gene expression with genomic knowledge from the inter-relation between miRNA and gene expression.
Conclusions: I found that the chance of successfully predicting clinical outcomes in cancer increased when the prediction was based on the integration of multiple layers of genomic data and genomic knowledge. This study is expected to improve understanding of the molecular pathogenesis and underlying biology of both cancer types.
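
The results above are reported as AUC values. As a reading aid, the following is a minimal sketch of how an AUC can be computed from predicted scores with the pairwise rank (Mann-Whitney) identity: the AUC is the fraction of positive/negative sample pairs in which the positive sample receives the higher score. The function name `roc_auc`, the variable names, and the toy labels and scores are all hypothetical illustrations and are not taken from the thesis or from TCGA.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise rank (Mann-Whitney) identity.

    labels: sequence of 0/1 outcomes (1 = event, e.g. recurrence)
    scores: sequence of predicted scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption): six patients, two sets of model scores.
toy_labels = [1, 0, 1, 1, 0, 0]
single_layer_scores = [0.6, 0.5, 0.4, 0.7, 0.3, 0.65]
integrated_scores = [0.8, 0.4, 0.45, 0.9, 0.2, 0.5]

print("single layer AUC:", roc_auc(toy_labels, single_layer_scores))
print("integrated AUC:  ", roc_auc(toy_labels, integrated_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so comparing the two printed values mirrors, on toy numbers only, the kind of single-layer versus integrated comparison the abstract describes.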