NLR-Finder: An Easy and Efficient Annotation Tool for the NLR Superfamily in Plant Genomes
- College of Agriculture and Life Sciences, Interdisciplinary Program in Agricultural Genomics
- Graduate School, Seoul National University
- annotation ; nucleotide-binding and leucine-rich repeat (NLR) genes ; disease resistance genes ; bioinformatics tool
- Thesis (Master's) -- Graduate School, Seoul National University: Interdisciplinary Program in Agricultural Genomics, 2017. 2. Choi Do-il (최도일).
- Gene annotation is an essential process for identifying gene structures and defining biological functions. It is an important step for subsequent analyses, including gene cloning and the identification of genes underlying agricultural traits. However, current gene annotations misrepresent the whole gene repertoire because gene model construction is biased. The nucleotide-binding and leucine-rich repeat (NLR) superfamily is one of the most poorly annotated gene families in plants. NLR genes tend to cluster in genomes as a result of segmental and tandem duplications, which makes their annotation challenging. The NLR-Finder was developed for unbiased, genome-wide identification of the NLR superfamily in assembled plant genomes. First, the NLR-Finder detects candidate NLR gene regions by extending every identified NB-ARC domain region by 30 kb on both sides. Second, evidence-based NLR genes are predicted by aligning published protein and transcriptome sequences to the candidate gene regions. Third, additional NLR genes are extracted using an ab initio prediction approach. Finally, the final NLR gene models are generated by integrating the evidence-based and ab initio-based NLR genes. Re-annotation with the NLR-Finder was performed on 17 different plant genomes. On average, public annotation tools identified about 310 genes, whereas the NLR-Finder annotated about 497 genes. In Gossypium hirsutum and Vigna radiata, the re-annotation yielded three times as many genes as the publicly available data. The re-annotated genes were successfully validated by comparison with the high-quality annotations of Arabidopsis thaliana, Brachypodium distachyon, and Solanum lycopersicum. This study demonstrates that the NLR-Finder provides an easy-to-use and efficient method for annotating the NLR superfamily in plant genomes.
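The first step described in the abstract (extending each NB-ARC domain hit by 30 kb on both sides to obtain candidate regions) can be illustrated with a minimal Python sketch. This is not the NLR-Finder code: the `Interval` type, the `candidate_regions` function, the example coordinates, and the sequence lengths are all hypothetical. Clipping to sequence boundaries and merging overlapping extended regions are assumptions made here for illustration; the abstract does not say how overlaps are handled.

```python
from dataclasses import dataclass

FLANK_BP = 30_000  # 30 kb flank on each side, as stated in the abstract


@dataclass(frozen=True)
class Interval:
    """Hypothetical record for a region on a sequence (1-based, inclusive)."""
    seqid: str
    start: int
    end: int


def candidate_regions(hits, seq_lengths, flank=FLANK_BP):
    """Extend each NB-ARC hit by `flank` bp on both sides.

    Assumptions (not from the source): extended regions are clipped to
    the sequence boundaries, and overlapping regions on the same
    sequence are merged into one candidate region.
    """
    extended = sorted(
        (
            Interval(
                h.seqid,
                max(1, h.start - flank),
                min(seq_lengths[h.seqid], h.end + flank),
            )
            for h in hits
        ),
        key=lambda iv: (iv.seqid, iv.start),
    )
    merged = []
    for iv in extended:
        last = merged[-1] if merged else None
        if last and last.seqid == iv.seqid and iv.start <= last.end:
            # Overlaps the previous region: grow it instead of adding a new one.
            merged[-1] = Interval(last.seqid, last.start, max(last.end, iv.end))
        else:
            merged.append(iv)
    return merged


# Made-up NB-ARC hit coordinates and sequence lengths, for illustration only.
hits = [
    Interval("chr1", 50_000, 51_000),
    Interval("chr1", 90_000, 91_200),
    Interval("chr1", 400_000, 401_000),
    Interval("chr2", 10_000, 11_000),
]
seq_lengths = {"chr1": 420_000, "chr2": 1_000_000}

regions = candidate_regions(hits, seq_lengths)
for r in regions:
    print(r.seqid, r.start, r.end)
```

In this toy input the first two hits on chr1 lie within 60 kb of each other, so their extended windows overlap and collapse into a single candidate region. That mirrors the clustered arrangement of NLR genes mentioned in the abstract, where neighboring NB-ARC domains would naturally fall into one region for the later evidence-based and ab initio prediction steps.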