Genetic association tests for the heaped data

최해원



Authors: 최해원

Advisor: 박태성

Major: College of Natural Sciences, Department of Statistics

Issue Date: 2015-02

Publisher: Graduate School, Seoul National University

Keywords: Heaped data ; Cigarette per day (CPD) ; Genome-Wide Association Study (GWAS) ; Self-reported survey

Description: Thesis (Master's) -- Graduate School, Seoul National University: Department of Statistics, February 2015. 박태성.

Abstract: In self-reported surveys, subjects tend to report event counts as multiples of certain numbers. For example, in studies of smoking behavior, the number of cigarettes smoked per day is heaped at values such as 0, half a pack, one pack, one and a half packs, two packs, and so forth. Because of recall error, values ending in 0 or 5 occur more frequently than they do in the true distribution. Such data are called heaped data. Analyzing heaped data is challenging because of the reporting bias and the difficulty of specifying an appropriate distribution for them. It is therefore hard to fit a model by standard maximum likelihood estimation when the goal is to study the association between a heaped dependent variable and covariates of interest. In this study, we are interested in identifying genetic variants, such as single nucleotide polymorphisms (SNPs), associated with a heaped phenotype such as cigarettes per day (CPD). We first review previously proposed approaches applicable to CPD data, in which the heaped variable is treated as the dependent variable and the SNP as an ordinal independent variable. We then consider an alternative, calibration modelling approach to the association test for heaped data: a reverse model that treats the SNP as an ordinal dependent variable and the heaped variable as an independent variable. Unlike the standard modelling approach, the calibration approach is robust to the distributional assumption on the heaped data. To handle the ordinal nature of SNPs, we fit a cumulative logit model as the calibration model, from which significant SNPs can be identified. We applied the calibration modelling approach to CPD data from 4,183 male samples in the Korean Association Resource project. Through simulation studies, we investigated the performance of the proposed method and compared it with that of competing approaches.
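The reverse (calibration) model described in the abstract can be illustrated with a small, self-contained sketch. It is not the thesis's implementation, which this record does not show. Everything here is an assumption made for illustration: the simulated data (`simulate`, with made-up allele frequency, effect size, and heaping probability), the three-category genotype coding (0, 1, 2 minor alleles), the crude gradient-ascent fitter, and the use of a likelihood ratio test with one degree of freedom. The model fitted is the cumulative logit form P(G <= j | x) = logistic(a_j - b x), with the heaped CPD value x as the only covariate.

```python
import math
import random


def sigmoid(u):
    # Numerically stable logistic function.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def simulate(n, maf=0.3, effect=3.0, heap_prob=0.6, seed=1):
    """Hypothetical data: genotype (0/1/2 minor alleles) and heaped CPD."""
    rng = random.Random(seed)
    genotypes, cpd = [], []
    for _ in range(n):
        g = (rng.random() < maf) + (rng.random() < maf)
        count = max(0, round(rng.gauss(12 + effect * g, 6)))
        if rng.random() < heap_prob:
            count = 5 * round(count / 5)  # report the nearest multiple of 5
        genotypes.append(g)
        cpd.append(count)
    return genotypes, cpd


def loglik_and_grad(params, genotypes, xs):
    """Log-likelihood and gradient of P(G <= j | x) = sigmoid(a_j - b * x)."""
    a0, a1, b = params
    ll, grad = 0.0, [0.0, 0.0, 0.0]
    for g, x in zip(genotypes, xs):
        s0, s1 = sigmoid(a0 - b * x), sigmoid(a1 - b * x)
        p = (s0, s1 - s0, 1.0 - s1)[g]
        if p <= 0.0:  # invalid thresholds or numerical underflow
            return float("-inf"), [0.0, 0.0, 0.0]
        ll += math.log(p)
        if g == 0:
            grad[0] += 1 - s0
            grad[2] -= x * (1 - s0)
        elif g == 1:
            grad[0] -= s0 * (1 - s0) / p
            grad[1] += s1 * (1 - s1) / p
            grad[2] -= x * (1 - s0 - s1)
        else:
            grad[1] -= s1
            grad[2] += x * s1
    return ll, grad


def fit(genotypes, xs, iters=300):
    """Gradient ascent with an adaptive step, started at the null fit (b = 0)."""
    n = len(xs)
    c0 = sum(g == 0 for g in genotypes) / n
    c1 = sum(g <= 1 for g in genotypes) / n
    # With b = 0 the maximum likelihood thresholds are the empirical logits.
    params = [math.log(c0 / (1 - c0)), math.log(c1 / (1 - c1)), 0.0]
    ll, grad = loglik_and_grad(params, genotypes, xs)
    ll_null, step = ll, 1.0 / n
    for _ in range(iters):
        cand = [p + step * d for p, d in zip(params, grad)]
        ll_new, grad_new = loglik_and_grad(cand, genotypes, xs)
        if ll_new > ll:  # accept only steps that improve the likelihood
            params, ll, grad = cand, ll_new, grad_new
            step *= 1.5
        else:
            step *= 0.5
    return params, ll, ll_null


genotypes, cpd = simulate(400)
mean = sum(cpd) / len(cpd)
sd = math.sqrt(sum((c - mean) ** 2 for c in cpd) / len(cpd))
xs = [(c - mean) / sd for c in cpd]  # standardize CPD for stable fitting

params, ll_full, ll_null = fit(genotypes, xs)
stat = 2.0 * (ll_full - ll_null)          # likelihood ratio statistic
p_value = math.erfc(math.sqrt(stat / 2))  # chi-square(1) upper tail
print(f"slope={params[2]:.3f}  LRT={stat:.2f}  p={p_value:.3g}")
```

Because CPD enters only as a covariate, the sketch never specifies a distribution for the heaped counts, which is the robustness property the abstract attributes to the calibration approach. In practice a dedicated ordinal regression routine would replace the hand-written optimizer.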

Language: Korean

URI: https://hdl.handle.net/10371/131300

Files in This Item:

000000026067.pdf 0.91 MB

Appears in Collections:

College of Natural Sciences (자연과학대학)
- Dept. of Statistics (통계학과)
  - Theses (Master's Degree_통계학과)
