A non-parametric and information theoretic algorithm for identifying differentially expressed genes in multiclass RNA-seq samples : 다중 부류의 RNA-seq 표본에서 다르게 발현된 유전자를 확인하기 위한 비모수, 정보 이론적 알고리즘

DC Field: Value
dc.contributor.advisor: 김선
dc.contributor.author: 안재현
dc.date.accessioned: 2017-07-14T02:54:02Z
dc.date.available: 2017-07-14T02:54:02Z
dc.date.issued: 2014-02
dc.identifier.other: 000000016793
dc.identifier.uri: https://hdl.handle.net/10371/123030
dc.description: Thesis (Master's) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, 2014. 2. 김선.
dc.description.abstract: Gene expression across the whole cell can be measured routinely with microarray technologies or, more recently, with sequencing technologies. Using these technologies, identifying differentially expressed genes (DEGs) among multiple phenotypes is one of the most important tasks in biology. Thus, many methods for detecting DEGs between two groups have been developed. For example, the t-test and relative entropy are used to detect the difference between two probability distributions. When more than two phenotypes are considered, these methods are not applicable, and other methods such as the ANOVA F-test and the Kruskal-Wallis test are used to find DEGs in multiclass data. However, the ANOVA F-test assumes a normal distribution and is not designed to identify DEGs whose expression is distinct in each phenotype. The Kruskal-Wallis test, a non-parametric method, is more robust but sensitive to outliers. This thesis proposes a non-parametric, information-theoretic approach for identifying DEGs. Our method can identify DEGs in multiclass data and is less sensitive to outliers. In extensive experiments with simulated and real data, our method outperformed existing tools. In addition, a web service is implemented for the analysis of multiclass data: http://biohealth.snu.ac.kr/software/degselection
dc.description.tableofcontents:
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Related works
1.2 Motivation
1.2.1 Need for a non-parametric method
1.2.2 Distinguishing several groups at once
1.2.3 Need to be robust for outliers
Chapter 2 Methods
2.1 Overview of a proposed method
2.2 Preprocessing data
2.3 Difference analysis using mutual information
2.4 Estimating P-value
Chapter 3 Results
3.1 Data
3.1.1 Simulated data
3.1.2 Real data
3.2 Classification results
3.2.1 Simulated data
3.2.2 Real data
3.3 Biological interpretation
3.3.1 Rice data
3.3.2 Breast cancer data
Chapter 4 Conclusion
Abstract (in Korean)
Acknowledgements
Chapter 5 Appendix
dc.format: application/pdf
dc.format.extent: 2452724 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: Seoul National University Graduate School
dc.subject: Differentially expressed genes
dc.subject: information theoretic approach
dc.subject: multiclass
dc.subject: RNA-seq
dc.subject.ddc: 621
dc.title: A non-parametric and information theoretic algorithm for identifying differentially expressed genes in multiclass RNA-seq samples
dc.title.alternative: 다중 부류의 RNA-seq 표본에서 다르게 발현된 유전자를 확인하기 위한 비모수, 정보 이론적 알고리즘
dc.type: Thesis
dc.description.degree: Master
dc.citation.pages: 29
dc.contributor.affiliation: College of Engineering, Department of Electrical and Computer Engineering
dc.date.awarded: 2014-02
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
