
Title: Unveiling the complexity of the maize transcriptome by single-molecule long-read sequencing

Author
WANG, BO - Cold Spring Harbor Laboratory
TSENG, ELIZABETH - Pacific Biosciences Inc
REGULSKI, MICHAEL - Cold Spring Harbor Laboratory
CLARK, TYSON - Pacific Biosciences Inc
HON, TING - Pacific Biosciences Inc
JIAO, YINPING - Cold Spring Harbor Laboratory
LU, ZHENYUAN - Cold Spring Harbor Laboratory
OLSON, ANDREW - Cold Spring Harbor Laboratory
STEIN, JOSHUA - Cold Spring Harbor Laboratory
Ware, Doreen

Submitted to: Nature Communications
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/20/2016
Publication Date: 6/24/2016
Citation: Wang, B., Tseng, E., Regulski, M., Clark, T., Hon, T., Jiao, Y., Lu, Z., Olson, A., Stein, J., Ware, D. 2016. Unveiling the complexity of the maize transcriptome by single-molecule long-read sequencing. Nature Communications. 7:11708 doi: 10.1038/ncomms11708.

Interpretive Summary: Corn (maize) is one of the most important crops in the United States and the world, providing livestock feed and a wide range of food and industrial products. Continued genetic improvement of maize through application of genomics research and breeding is essential for ensuring food and energy security in the future.

Technical Abstract: Zea mays is an important crop species and genetic model for elucidating transcriptional networks in plants. Uncertainties about the complete structure of mRNA transcripts, particularly with respect to alternatively spliced isoforms, limit the progress of research in this system. In this study, we used single-molecule sequencing technology to investigate the maize transcriptome. Intact full-length cDNAs from six tissues of the maize inbred line B73 were barcoded, pooled, size-fractionated (<1 kb, 1-2 kb, 3-5 kb, 4-6 kb, and 5-10 kb), and sequenced on the PacBio RS II platform with P6-C4 chemistry. The resultant 111,151 transcripts captured ~70% of the annotated genes of the maize RefGenV3 genome assembly. A large proportion of transcripts (57%) represented novel, sometimes tissue-specific, isoforms of known genes, and 3% corresponded to novel gene loci. In other cases, the identified transcripts improved existing gene models. To validate transcript structures, we checked for the occurrence of each splice junction within high-depth Illumina reads generated from matched tissues. Averaged across all six tissues, 90% of splice junctions were well supported by short reads in matched tissues. In addition, we identified a large number of novel long non-coding RNAs (lncRNAs) and fusion transcripts, and found that DNA methylation plays important roles in the generation of various isoforms. Our results show that the characterization of the maize B73 transcriptome is far from complete, and that maize gene expression is more complex than previously thought.
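
The splice-junction validation described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes a junction is represented as a (chromosome, strand, intron start, intron end) tuple and that a junction counts as supported when the identical tuple appears among the short-read junctions from the matched tissue; the abstract does not specify what "well supported" means, and a real analysis would likely apply a read-depth threshold. The function names, tissue labels, and coordinates are hypothetical toy values.

```python
from statistics import mean

# A splice junction is modeled as (chromosome, strand, intron_start, intron_end).
# This representation and all values below are illustrative assumptions,
# not data or methods taken from the study.


def junction_support(long_read_junctions, short_read_junctions):
    """Return the fraction of long-read junctions also found in short reads."""
    long_set = set(long_read_junctions)
    if not long_set:
        return 0.0
    supported = long_set & set(short_read_junctions)
    return len(supported) / len(long_set)


def mean_support_across_tissues(long_by_tissue, short_by_tissue):
    """Average the per-tissue support fraction over all tissues."""
    return mean(
        junction_support(long_by_tissue[tissue], short_by_tissue.get(tissue, []))
        for tissue in long_by_tissue
    )


# Toy input: two hypothetical tissues with two long-read junctions each.
long_by_tissue = {
    "tissue_a": [("chr1", "+", 100, 200), ("chr1", "+", 300, 400)],
    "tissue_b": [("chr2", "-", 50, 150), ("chr2", "-", 500, 900)],
}
short_by_tissue = {
    "tissue_a": [("chr1", "+", 100, 200), ("chr1", "+", 300, 400)],
    "tissue_b": [("chr2", "-", 50, 150)],
}

print(mean_support_across_tissues(long_by_tissue, short_by_tissue))
```

Computing the fraction per tissue first and then averaging matches the abstract's phrasing ("averaged across all six tissues"); pooling all junctions before dividing would weight tissues by their junction counts and can give a different number.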