Interpretation of personal genome sequencing data in terms of disease ranks based on mutual information

Na, Young-Ji; Sohn, Kyung-Ah; Kim, Ju Han

doi:10.1186/1755-8794-8-S2-S4

Cited 2 times in Web of Science; cited 5 times in Scopus

Authors: Na, Young-Ji; Sohn, Kyung-Ah; Kim, Ju Han

Issue Date: 2015-05-29

Publisher: BioMed Central

Citation: BMC Medical Genomics, 8(Suppl 2):S4

Description: This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract:

Background
Rapid advances in genome sequencing technologies have led to the discovery of an unprecedented number of genome variations in humans. However, methods for interpreting personal genome sequencing data in terms of diseases remain very limited.

Methods
In this paper we present the first computational analysis scheme for interpreting personal genome data by simultaneously considering the functional impact of damaging variants and curated disease-gene association data. This method is based on mutual information as a measure of the relative closeness between the personal genome and diseases. We hypothesize that a higher mutual information score implies that the personal genome is more susceptible to a particular disease than other diseases.
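
As a rough illustration of the idea in this paragraph, the sketch below computes mutual information between two binary gene indicators and ranks diseases by that score. It is a minimal sketch, not the authors' implementation: the abstract does not give the exact formulation, so the binary encoding (a gene is damaged or not, associated with a disease or not) and the names `mutual_information`, `rank_diseases`, `damaged`, `disease_genes`, and `all_genes` are assumptions made for this example.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    joint = Counter(zip(x, y))   # counts of each (x_i, y_i) pair
    count_x = Counter(x)         # marginal counts of x
    count_y = Counter(y)         # marginal counts of y
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[a] * count_y[b]))
    return mi


def rank_diseases(damaged, disease_genes, all_genes):
    """Return disease names sorted by decreasing mutual information.

    damaged:       set of genes assumed to carry damaging variants (hypothetical input)
    disease_genes: dict mapping disease name -> set of associated genes (hypothetical input)
    all_genes:     list of all genes considered (the shared background)
    """
    x = [1 if g in damaged else 0 for g in all_genes]
    scores = {}
    for disease, genes in disease_genes.items():
        y = [1 if g in genes else 0 for g in all_genes]
        scores[disease] = mutual_information(x, y)
    return sorted(scores, key=scores.get, reverse=True)
```

Note that plain mutual information is symmetric and also rises when two indicators are strongly anti-correlated, so a practical scheme would need to account for the direction of the overlap; this sketch does not.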

Results
The method was applied to the sequencing data of 50 acute myeloid leukemia (AML) patients in The Cancer Genome Atlas. The utility of associations between a disease and the personal genome was explored using data of healthy (control) people obtained from the 1000 Genomes Project. The ranks of the disease terms in the AML patient group were compared with those in the healthy control group using "Leukemia, Myeloid, Acute" (C04.557.337.539.550) as the corresponding MeSH disease term.
The mutual information rank of the disease term was substantially higher in the AML patient group than in the healthy control group, which demonstrates that the proposed methodology can be successfully applied to infer associations between the personal genome and diseases.

Conclusions
Overall, the area under the receiver operating characteristic curve was significantly larger for the AML patient data than for the healthy controls. This methodology could contribute to consequential discoveries and explanations when mining personal genome sequencing data in terms of diseases, and it is versatile with respect to genomic-based knowledge such as drug-gene and environmental-factor-gene interactions.
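
For readers unfamiliar with the metric, the following sketch shows the generic definition of the area under the ROC curve as the probability that a randomly chosen positive item scores higher than a randomly chosen negative one, with ties counted as one half. The abstract does not say how the ROC curves were constructed, so the function name `auc` and the choice of what counts as a positive or negative item are assumptions for illustration only.

```python
def auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the fraction of (positive, negative) pairs in which the positive
    item has the higher score; tied pairs contribute 0.5.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))
```

A value of 0.5 corresponds to scores that do not separate the two groups, and 1.0 to perfect separation.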

Language: English

URI: https://hdl.handle.net/10371/100681

DOI: https://doi.org/10.1186/1755-8794-8-S2-S4

Files in This Item:

12920_2015_Article_554.pdf 1.29 MB

Appears in Collections:

College of Medicine/School of Medicine
- Dept. of Medicine
  - Journal Papers
