Submitted to: Biomed Central (BMC) Genomics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: July 6, 2013
Publication Date: July 11, 2013
Repository URL: http://hdl.handle.net/10113/57295
Citation: Gutierrez-Gonzalez, J.J., Tu, Z., Garvin, D.F. 2013. Analysis and annotation of the hexaploid oat seed transcriptome. Biomed Central (BMC) Genomics. 14:471.
Interpretive Summary: Oats are an important component of the human food supply. They contain compounds such as beta-glucan, tocols, and avenanthramides that improve human health. However, global oat production is lower than that of wheat and barley, and there has consequently been much less investment in developing a comprehensive set of genome resources that breeders and geneticists can use for oat improvement. To fill one of many critical gaps in our knowledge of the oat genome, we performed high-throughput sequencing of expressed genes at four stages of oat seed development. After computational analysis, more than 50,000 oat transcripts were identified in this first oat gene expression atlas, increasing the number of available oat gene transcript sequences three-fold. Genes associated with the synthesis of beta-glucan, tocols, and avenanthramides were identified, and the gene expression atlas allowed us to quantify their expression levels. Additionally, more than 4,000 new potential genetic markers for oat were identified within the expressed genes comprising the atlas. This comprehensive compilation of genes expressed in developing oat seeds will serve as a new tool for oat researchers seeking both to understand the mechanisms that control the accumulation of health-promoting compounds in the oat seed and to increase the seed content of these compounds through breeding and genetic manipulation.
Technical Abstract: Next-generation sequencing provides new opportunities to explore transcriptomes. However, challenges remain for accurate differentiation of homoeoalleles and paralogs, particularly in polyploid organisms with no supporting genome sequence. In this study, RNA-Seq was employed to generate and characterize the first gene expression atlas for hexaploid oat. The software packages Trinity and Oases were used to produce transcript assemblies from nearly 134 million 100-bp paired-end reads from developing oat seeds. Based on the quality parameters employed, the Oases assemblies were superior. The Oases 67-kmer assembly, denoted dnOST (de novo Oat Seed Transcriptome), is over 55 million nucleotides in length, with an average transcript length of 1,043 nucleotides. The 74.8× sequencing depth was adequate to differentiate a large proportion of putative homoeoalleles and paralogs. To assess the robustness of dnOST, we successfully identified gene transcripts associated with the biosynthetic pathways of three compounds with health-promoting properties (avenanthramides, tocols, beta-glucans) and quantified their expression. To our knowledge, this study provides the first direct performance comparison between two major assemblers in a polyploid organism. The workflow we developed provides a useful guide for comparable analyses in other organisms. Since little effort has been directed to oat genomics, the transcript assembly developed here is a major advance. It expands the number of oat ESTs three-fold and constitutes the first comprehensive transcriptome study in oat. This resource will be a useful new tool both for analysis of genes relevant to nutritional enhancement of oat and for improvement of this crop in general.
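The abstract describes the assembly by its total length and average transcript length. As a rough consistency check, 55,000,000 / 1,043 is about 52,700 transcripts, in line with the "more than 50,000" figure in the interpretive summary. The sketch below shows one way such summary statistics could be computed from a FASTA-formatted transcript assembly. It is a minimal illustration, not the authors' workflow: the function names `read_fasta_lengths` and `assembly_stats` are hypothetical, and N50 is included only as a commonly reported assembly metric; the abstract does not say which quality parameters were used.

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> List[int]:
    """Return the length of each sequence in FASTA-formatted lines.

    Hypothetical helper for illustration; sequences may span several lines.
    """
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header closes the previous record (if any) and opens a new one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths: List[int]) -> Dict[str, float]:
    """Summarize an assembly: transcript count, total and mean length, N50."""
    if not lengths:
        raise ValueError("no sequences found")
    total = sum(lengths)
    # N50: length of the transcript at which the running sum over transcripts,
    # sorted longest first, reaches at least half of the total length.
    running = 0
    n50 = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            n50 = length
            break
    return {
        "transcripts": len(lengths),
        "total_nt": total,
        "mean_nt": total / len(lengths),
        "n50_nt": n50,
    }


# Toy input standing in for an assembly file (not real oat sequence data).
example = [
    ">transcript_1",
    "ACGT",
    "ACGT",
    ">transcript_2",
    "ACGT",
    ">transcript_3",
    "AC",
]
stats = assembly_stats(read_fasta_lengths(example))
print(stats)
```

For a real assembly, the same functions could be applied to an open file handle, for example `assembly_stats(read_fasta_lengths(open(path)))`, where `path` points to the assembler's FASTA output.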