Identification of genes related to cancer through targeted sequencing data analysis
Discovery of cancer-related genes using targeted sequencing data (Korean title: 표적 시퀀싱 자료를 이용한 암 관련 유전자 발굴)
- College of Natural Sciences, Dept. of Statistics
- Issue Date
- Graduate School, Seoul National University
- NGS data analysis; small sample association study; Fisher's exact test; CMH statistic; IPMN
- Thesis (Master's) -- Graduate School, Seoul National University : Department of Statistics, 2017. 2. 박태성.
- Recent statistical methods for next generation sequencing (NGS) data have been successfully applied to identify rare genetic variants associated with diseases. However, the most commonly used methods, such as burden tests and variance-component tests, rely on large sample sizes. Because sequencing is still costly, many studies generate data from only a small number of samples, and most existing methods are not appropriate for such data.
In this work, we propose a new exact association test for sequencing data that does not require a large-sample approximation. Our method is based on the generalized Cochran-Mantel-Haenszel (CMH) statistic. We applied it to NGS data from patients with intraductal papillary mucinous neoplasm (IPMN), a distinct pancreatic neoplasm that can progress to invasive, hard-to-treat pancreatic cancer. Through this application, we identified susceptibility genes associated with the progression of IPMN to pancreatic cancer.
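To illustrate what an exact test built on a CMH-type statistic can look like, the sketch below covers only the simplest case: K stratified 2x2 tables (for example, variant carrier vs. non-carrier by disease group within each stratum). It is not the procedure proposed in the thesis, which uses the generalized CMH statistic. The function names and the example counts are hypothetical. Conditional on the table margins, the top-left cell of each stratum is hypergeometric, so the null distribution of the summed cell can be enumerated exactly instead of being approximated by a chi-square distribution.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(row1, col1, n):
    """Exact null distribution {a: P(A = a)} of the top-left cell of a
    2x2 table with fixed margins (first row total, first column total, n)."""
    lo = max(0, row1 + col1 - n)
    hi = min(row1, col1)
    total = comb(n, col1)
    return {a: Fraction(comb(row1, a) * comb(n - row1, col1 - a), total)
            for a in range(lo, hi + 1)}


def exact_cmh_pvalue(tables):
    """Two-sided exact p-value for K stratified 2x2 tables.

    `tables` is a list of ((a, b), (c, d)) counts. The statistic is the sum
    of the top-left cells; outcomes are ranked by squared distance from the
    null mean, which gives the same ordering as the 2x2xK CMH statistic
    because the conditional variance is fixed by the margins.
    """
    dist = {0: Fraction(1)}  # distribution of the running sum
    observed = 0
    for (a, b), (c, d) in tables:
        observed += a
        pmf = hypergeom_pmf(a + b, a + c, a + b + c + d)
        convolved = {}
        for s, p in dist.items():
            for x, q in pmf.items():
                convolved[s + x] = convolved.get(s + x, 0) + p * q
        dist = convolved
    mean = sum(s * p for s, p in dist.items())
    deviation = abs(observed - mean)
    p_value = sum(p for s, p in dist.items() if abs(s - mean) >= deviation)
    return float(p_value)


# Hypothetical counts: two strata, each with 3 carriers among 3 cases
# and 0 carriers among 3 controls.
strata = [((3, 0), (0, 3)), ((3, 0), (0, 3))]
print(exact_cmh_pvalue(strata))  # 0.005
```

With a single stratum this reduces to an exact test on one 2x2 table. Exact rational arithmetic (`Fraction`) is used so that ties in the two-sided comparison are not affected by floating-point rounding, which matters when tables are this small.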