Understanding the genome of living organism based on the second generation sequencing
차세대 염기서열 분석방법을 이용한 생물 유전체의 이해 (Understanding the genomes of living organisms using next-generation sequencing methods)
- College of Natural Sciences, Program in Bioinformatics
- Issue Date
- Seoul National University Graduate School
- Next generation sequencing; second generation sequencing; de novo assembly; genome assembly
- Thesis (Master's) -- Seoul National University Graduate School: Program in Bioinformatics, February 2014. 김희발.
- These studies focus on reconstructing the genome sequences of living organisms by de novo assembly of second-generation sequencing data, and on understanding the gene-level features of those organisms. Although next-generation sequencing, and second-generation sequencing in particular, has made genome projects affordable, assembling the short reads it produces remains challenging. Many assembly programs are available, each with its own characteristics, but no single program or pipeline is the best choice in every case. Researchers who want to reconstruct a genome by de novo assembly therefore have to choose the combination of programs and pipeline best suited to their specific data.
In Chapter 2, I put together an efficient combination of programs for the de novo assembly of microbial genomes and produced finished-level genome assemblies of probiotic candidates using short reads from two sequencing technologies. Based on the assembly results, I identified a potential risk associated with their use as probiotic strains.
In Chapter 3, I assembled the minke whale genome from low-coverage resequencing data. I found an efficient assembly pipeline, built from various open-source programs, that performed better than an expensive commercial program. Contig extension and bridging were then carried out to combine the assemblies from different samples.
In Chapter 4, reads that did not align to the reference genome during short-read alignment were assembled to identify the unique sequences and gene content of Korean Native Chicken (KNC) samples. Based on the unaligned-read assembly and gene prediction, KNC-specific genes and sequences were identified for further analysis.
Through these studies, I gained practice in building efficient genome assembly pipelines suited to specific data, and I learned how to characterize living organisms on the basis of their assemblies and gene-level features.