A de novo assembled and completely phased AK1 genome for establishing an Asian reference standard
De novo assembly and haplotype phasing of the AK1 genome for building an Asian standard genome
- College of Medicine, Department of Biomedical Sciences
- Issue Date
- Seoul National University Graduate School
- de novo whole genome assembly; haplotype phasing; single-molecule real-time technology; bacterial artificial chromosomes; human reference genome; precision medicine
- Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Biomedical Sciences, 2017. 2. 서정선.
- Advances in genome assembly and phasing provide an opportunity to investigate the diploid architecture of the human genome and reveal the full range of structural variation (SV) across ethnic groups. Here, we report the de novo assembly and de novo haplotype phasing of the genome of an Asian individual using single-molecule real-time (SMRT) sequencing, optical mapping, and bacterial artificial chromosome (BAC) sequencing. Single-molecule sequencing coupled with optical mapping generated a highly contiguous assembly, with a contig N50 of 17.9 Mb and a scaffold N50 of 44.8 Mb, that resolves 8 chromosomal arms as single scaffolds. High concordance between the assembly and paired-end sequences from 56,794 BAC clones provides strong support for the robustness of the assembly. We identify 18,210 structural variations (SVs) through direct comparison of the assembly with the human reference, revealing thousands of breakpoints that have not been reported before. Many of the novel insertions are reflected in the transcriptome and are shared across the Asian population. We performed haplotype phasing of the assembly using short reads, long reads, and linked reads from whole genome sequencing, together with short reads from 31,719 BAC clones, achieving unprecedented phased block contiguity, with an N50 of 11.5 Mb. Haplotigs assembled from SMRT reads assigned to haplotypes on phased blocks covered 89% of genes. The haplotigs accurately characterized the hypervariable MHC region and revealed the configuration of disease risk alleles in clinically relevant genes, including CYP2D6. This work presents the most contiguous diploid human genome assembly to date, with extensive investigation of novel and Asian-specific SVs and high-quality haplotyping of clinically relevant alleles for personalized medicine.
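The abstract reports contiguity as N50 values (contig, scaffold, and phased block). As a reminder of what that statistic measures, here is a minimal sketch of the usual definition: the length L such that sequences of length L or longer account for at least half of the total assembled bases. The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the thesis.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the sum
    of all lengths.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Made-up contig lengths (arbitrary units), purely for illustration.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70: 80 + 70 = 150, half of the 300 total
```

The same calculation applies to contigs, scaffolds, or phased blocks; only the list of lengths changes.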