Predicting disease predisposition patterns of the personal genome based on disease hierarchy : 질병 계층 기반 개인 유전체의 질병 위험도 예측

DC Field | Value | Language
dc.contributor.advisor | 김주한 | -
dc.contributor.author | 나영지 | -
dc.date.accessioned | 2017-07-14T06:01:33Z | -
dc.date.available | 2017-07-14T06:01:33Z | -
dc.date.issued | 2013-02 | -
dc.identifier.other | 000000010252 | -
dc.identifier.uri | https://hdl.handle.net/10371/125372 | -
dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School : Interdisciplinary Program in Bioinformatics, 2013. 2. 김주한. | -
dc.description.abstract | The advent of next-generation sequencing (NGS) technologies has had a huge impact on functional genomics. NGS technologies generate millions of short sequence reads per run, making it possible to sequence entire human genomes in a matter of weeks. They have already been used to sequence the constitutional genomes of several individuals, and ambitious efforts such as the 1000 Genomes Project and the Personal Genome Project hope to add thousands more. The first five cancer genomes revealed thousands of novel somatic mutations and implicated new genes in tumor development and progression. Knowledge of the genetic variants that underlie disease susceptibility, treatment response, and other phenotypes will continue to improve as these studies expand the catalog of DNA sequence variation in humans.
As the cost of sequencing continues to fall, the problems of data analysis and storage become more pressing. Those issues, however, are minor compared with the challenge facing the clinical community, which seeks to mine the genome for clinically actionable information. Present analytical methods are insufficient to make genetic data accessible in a clinical context, and the clinical usefulness of these data for individual patients has not been formally assessed. Here, I focus on evaluating an individual's predisposition to specific phenotypic traits given his or her genetic background.
In this dissertation, I present a computational method for associating variants in personal genome sequencing data with predispositions to disease. The method ranks all variants in the personal genome as potential disease risks and reports the MeSH terms that are significantly associated with highly ranked genes. To identify genetic variants associated with diseases, I obtained high-throughput sequencing data for several cancer types (acute myeloid leukemia, bladder cancer, breast cancer, colon cancer, glioblastoma multiforme, kidney cancer, lung adenocarcinoma, lung squamous cell carcinoma, malignant melanoma, ovarian serous cystadenocarcinoma, and prostate cancer) and non-cancer types (Crohn's disease, focal segmental glomerulosclerosis, and retinitis pigmentosa). Using the disease-gene associations in OMIM, I reconstructed the relations between diseases and genes within the MeSH tree structure so that the hierarchical organization of human diseases is taken into account.
The results showed that, among healthy people, the distribution of mutual information across the MeSH disease categories differs by population. This suggests that population information should be considered when interpreting a personal genome. In addition, MeSH disease terms are ranked higher in patients than in healthy people. Disease-enrichment analysis showed that the Cancer, Neurological, Endocrine, and Immunological categories were over-represented in patients as well as in healthy people, which points to a possible systemic response pattern to disease: neuro-endocrine-immune circuitry. In conclusion, although this study could not provide an accurate assessment of disease risk, it offers a data analysis scheme for personal genome sequencing data. The scheme can be extended to other gene-centered knowledge, such as drug-gene and environmental factor-gene associations. | -
dc.description.tableofcontents | Abstract i
Contents iv
List of Figures vi
List of Tables ix

1. Introduction 1
1.1. Backgrounds 1
1.2. Bioinformatic approach for interpreting personal genomes 3
1.2.1. Genetic variation resources 3
1.2.2. Algorithms for the prediction of variant effects 11
1.3. Issues in assessment of the risk of disease 16
1.3.1. Type of data for genomic risk profiling 16
1.3.2. Measures to predict disease risks 18
1.4. Objectives 19

2. Materials and Methods 21
2.1. Overview of methodology 21
2.2. Data set 27
2.2.1. Personal genome sequencing data 27
2.2.2. Database for predicted functional impact of non-synonymous variants 36
2.2.3. Disease-gene association database 38
2.2.4. MeSH disease tree structure 40
2.3. Measuring similarity between the personal genome and diseases 42
2.3.1. Construction of personal genome vectors 42
2.3.2. Generation of disease vectors using disease-gene associations 43
2.3.3. Measuring similarity between the personal genome and diseases 52
2.3.4. Ranking diseases based on MeSH tree structure 55
2.3.5. Disease enrichment analysis 58

3. Results 60
3.1. Reconstruction of MeSH tree by mapping OMIM disease annotation 60
3.2. Disease predisposition patterns of healthy humans in the 1000 Genomes Project 75
3.3. Disease predisposition patterns in the disease group 92
3.4. Disease rank patterns according to the tree extension 109
3.5. Disease enrichment analysis using the Diseasome 112

4. Conclusion 116
4.1. Summary 116
4.2. Future work 117

Appendix 119
Bibliography 124
Abstract in Korean 131
Acknowledgements 134 | -
dc.format | application/pdf | -
dc.format.extent | 5324899 bytes | -
dc.format.medium | application/pdf | -
dc.language.iso | en | -
dc.publisher | Seoul National University Graduate School | -
dc.subject | next-generation sequencing | -
dc.subject | MeSH tree structure | -
dc.subject | disease risk | -
dc.subject | personal genome | -
dc.subject.ddc | 574 | -
dc.title | Predicting disease predisposition patterns of the personal genome based on disease hierarchy | -
dc.title.alternative | 질병 계층 기반 개인 유전체의 질병 위험도 예측 | -
dc.type | Thesis | -
dc.contributor.AlternativeAuthor | Young-Ji, Na | -
dc.description.degree | Doctor | -
dc.citation.pages | ix, 134 | -
dc.contributor.affiliation | College of Natural Sciences, Interdisciplinary Program in Bioinformatics | -
dc.date.awarded | 2013-02 | -
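
Illustrative sketch (not part of the repository record): the abstract describes mapping OMIM disease-gene associations onto the MeSH tree and reporting MeSH terms associated with highly ranked genes. The Python below is a minimal sketch of that general idea under stated assumptions, not the thesis's implementation. It assumes MeSH tree numbers are dot-separated strings whose prefixes name ancestor terms, and it swaps in a plain hypergeometric over-representation test, because this record does not specify the scoring the thesis uses (its table of contents refers to similarity between genome and disease vectors). The function names, gene names, and associations are all hypothetical.

```python
from collections import defaultdict
from math import comb


def propagate_to_ancestors(term_genes):
    """Credit each gene set to its MeSH tree number and to every ancestor."""
    tree_genes = defaultdict(set)
    for tree_number, genes in term_genes.items():
        parts = tree_number.split(".")
        # "C04.557.337" contributes to "C04", "C04.557" and "C04.557.337".
        for depth in range(1, len(parts) + 1):
            tree_genes[".".join(parts[:depth])].update(genes)
    return dict(tree_genes)


def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, K of which are annotated."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total


def rank_terms(top_genes, tree_genes, background):
    """Return (tree_number, overlap, p_value) tuples, most enriched first."""
    background = set(background)
    top = set(top_genes) & background
    results = []
    for tree_number, genes in tree_genes.items():
        annotated = genes & background
        overlap = len(annotated & top)
        p = hypergeom_pvalue(overlap, len(annotated), len(top), len(background))
        results.append((tree_number, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Toy data only: gene names and associations are placeholders, not thesis data.
toy_term_genes = {"C04.557.337": {"GENE_A", "GENE_B"}, "C04.588": {"GENE_C"}}
toy_tree = propagate_to_ancestors(toy_term_genes)
toy_background = {f"GENE_{c}" for c in "ABCDEFGHIJ"}
toy_ranking = rank_terms({"GENE_A", "GENE_B"}, toy_tree, toy_background)
```

Propagating genes to ancestor tree numbers lets a broad category accumulate evidence from its more specific descendants, which is one simple way to take a disease hierarchy into account when ranking terms.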
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.