Prioritizing biological pathways by recognizing context in time-series gene expression data (Korean title: A method for prioritizing biologically important pathways through topic analysis of time-series gene expression data)

DC Field: Value
dc.contributor.advisor: 김선
dc.contributor.author: 이주상
dc.date.accessioned: 2018-05-29T03:33:04Z
dc.date.available: 2018-05-29T03:33:04Z
dc.date.issued: 2018-02
dc.identifier.other: 000000149786
dc.identifier.uri: https://hdl.handle.net/10371/141561
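dc.description: Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2018. 2. 김선.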
dc.description.abstract: The primary goal of pathway analysis using transcriptome data is to find significantly perturbed pathways. However, pathway analysis does not always succeed in identifying the pathways that are truly relevant to the context under study. A major reason for this difficulty is that a single gene can be involved in multiple pathways. In the KEGG pathway database, there are 146 genes that are each involved in more than 20 pathways. Thus, activation of even a single gene can result in activation of many pathways. This complex relationship often makes pathway analysis very difficult. While more powerful pathway analysis methods are needed, a readily available alternative is to incorporate literature information. In this study, I propose a novel approach for prioritizing pathways by combining results from pathway analysis tools with literature information. The basic idea is as follows. Whenever there are enough articles providing evidence on which pathways are relevant to the context, the pathways can be considered related to the context; this is termed relevance in this paper. However, if few or no articles have been reported, the researcher should rely on the results from the pathway analysis tools; this is termed significance in this paper. I realized this concept as an algorithm by introducing a Context Score and an Impact Score and then combining the two into a single score. In experiments with two data sets, my method ranked truly relevant pathways significantly higher than existing pathway analysis tools did. The framework was implemented as ContextTRAP by utilizing two existing tools, TRAP and BEST. ContextTRAP will be a useful tool for pathway-based analysis of gene expression data, since the user can specify the context of the biological experiment as a set of keywords. The web version of ContextTRAP is available at http://biohealth.snu.ac.kr/software/contextTRAP.
dc.description.tableofcontents:
I. Introduction
1.1 Background
1.2 Motivation
II. Methods
2.1 Context Score
2.2 Impact Score
2.3 Discovery rate
2.4 Pathway set enrichment analysis
III. Results
3.1 Data processing
3.2 The effect of relevance between keyword and the context of data
3.3 Accuracy of Discovery rate estimation
3.4 How much improvement is achieved in detecting relevant pathways in comparison with the original version of TRAP
3.5 Comparison with other pathway analysis methods
3.6 Biological perspective
IV. Conclusion
V. Discussion
5.1 The limitation of knowledge
Bibliography
Abstract (in Korean)
dc.format: application/pdf
dc.format.extent: 7072633 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: Seoul National University Graduate School
dc.subject: Pathway analysis
dc.subject: Pathway prioritization
dc.subject: Literature information
dc.subject.ddc: 621.39
dc.title: Prioritizing biological pathways by recognizing context in time-series gene expression data
dc.title.alternative: A method for prioritizing biologically important pathways through topic analysis of time-series gene expression data (translated from Korean)
dc.type: Thesis
dc.contributor.AlternativeAuthor: Jusang Lee
dc.description.degree: Master
dc.contributor.affiliation: College of Engineering, Department of Computer Science and Engineering
dc.date.awarded: 2018-02
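
The abstract describes combining a literature-based Context Score with an analysis-based Impact Score, leaning on the literature when enough articles exist and on the pathway analysis tool otherwise. This record does not give the thesis's actual formula, so the following is only a minimal sketch of that idea under stated assumptions: both scores are taken to lie in [0, 1], and the weighting rule, the `half_weight_articles` parameter, and the names `PathwayEvidence`, `combined_score`, and `prioritize` are all hypothetical and not taken from ContextTRAP.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathwayEvidence:
    """Hypothetical container; field names are illustrative only."""
    name: str
    context_score: float  # literature-based relevance, assumed in [0, 1]
    impact_score: float   # pathway-analysis significance, assumed in [0, 1]
    n_articles: int       # articles supporting the pathway for the keywords


def combined_score(p: PathwayEvidence, half_weight_articles: float = 5.0) -> float:
    """Blend the two scores with an ASSUMED saturating weight.

    With no articles the result equals the impact score; as the article
    count grows, the result approaches the context score. This is an
    illustrative rule, not the formula used in the thesis.
    """
    w = p.n_articles / (p.n_articles + half_weight_articles)
    return w * p.context_score + (1.0 - w) * p.impact_score


def prioritize(pathways: List[PathwayEvidence],
               half_weight_articles: float = 5.0) -> List[PathwayEvidence]:
    """Return pathways sorted from highest to lowest combined score."""
    return sorted(
        pathways,
        key=lambda p: combined_score(p, half_weight_articles),
        reverse=True,
    )


if __name__ == "__main__":
    # Made-up values purely to exercise the functions.
    candidates = [
        PathwayEvidence("C", context_score=0.1, impact_score=0.3, n_articles=5),
        PathwayEvidence("B", context_score=0.0, impact_score=0.7, n_articles=0),
        PathwayEvidence("A", context_score=0.9, impact_score=0.2, n_articles=45),
    ]
    for p in prioritize(candidates):
        print(p.name, round(combined_score(p), 3))
```

In this sketch, a pathway with strong literature support can outrank one that only the analysis tool flags, while a pathway with no articles is ranked purely by its Impact Score. How the thesis actually weighs the two scores, and how its Discovery rate enters, would need to be taken from the full text.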

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
