
Research Project: GrainGenes: Enabling Data Access and Sustainability for Small Grains Researchers

Location: Crop Improvement and Genetics Research

Title: Genome-wide discovery of G-quadruplexes in wheat: distribution and putative functional roles

Authors:
Cagirici, H. Busra - Oak Ridge Institute for Science and Education (ORISE)
Sen, Taner

Submitted to: G3, Genes/Genomes/Genetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/13/2020
Publication Date: 6/1/2020
Citation: Cagirici, H., Sen, T.Z. 2020. Genome-wide discovery of G-quadruplexes in wheat: distribution and putative functional roles. G3, Genes/Genomes/Genetics. 10(6). https://doi.org/10.1534/g3.120.401288.
DOI: https://doi.org/10.1534/g3.120.401288

Interpretive Summary: G-quadruplexes (G4s) are four-stranded nucleic acid structures in which closely spaced guanine bases form square planar arrangements. They have been shown experimentally to play a role in gene regulation. Our study shows for the first time the prevalence and possible functional roles of G4s in wheat. In this article, we review previous studies that analyzed G4 distributions in human, Arabidopsis, maize, rice, and sorghum. We observe that in wheat, G4s are enriched around three regions, two on the antisense strand and one on the sense strand, at the following positions: 1) the transcription start site (antisense), 2) the first coding domain sequence (antisense), and 3) the start codon (sense). Functional enrichment analysis revealed that the gene models containing G4 motifs within these peaks were associated with specific gene ontology (GO) terms, such as developmental process, localization, and cellular component organization or biogenesis. Moreover, comparison with other plants showed that monocots share a similar distribution of G4s, whereas Arabidopsis shows a distinct G4 distribution.

Technical Abstract: G-quadruplexes are nucleic acid secondary structures formed by a stack of square planar G-quartets. G-quadruplexes have been implicated in many biological functions, including telomere maintenance, replication, transcription, and translation, in many species, including humans and plants. For wheat, however, although it is one of the world’s most important staple foods, no G-quadruplex studies have been reported to date. Here, we identify putative G4 structures (G4s) in the wheat genome for the first time and, using computational methods, compare their distribution across the genome against five other genomes, one mammalian (human) and four plant (maize, Arabidopsis, rice, and sorghum). We identified close to 1 million G4 motifs, with a density of 76 G4s/Mb over the whole genome and 93 G4s/Mb over genic regions. Remarkably, G4s were enriched around three regions, two on the antisense strand and one on the sense strand, at the following positions: 1) the transcription start site (TSS) (antisense), 2) the first coding domain sequence (CDS) (antisense), and 3) the start codon (sense). Functional enrichment analysis revealed that the gene models containing G4 motifs within these peaks were associated with specific gene ontology (GO) terms, such as developmental process, localization, and cellular component organization or biogenesis. Moreover, comparison with other plants showed that monocots share a similar distribution of G4s, whereas Arabidopsis shows a distinct G4 distribution. Our study shows for the first time the prevalence and possible functional roles of G4s in wheat.
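
As an illustration of how putative G4 motifs and a per-megabase density can be computed, the following Python sketch scans a sequence with a regular expression. The abstract does not state which motif definition or tool was used, so the pattern here is an assumption: the commonly used canonical form of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The function names `find_g4_motifs` and `g4_density_per_mb` are hypothetical and chosen only for this example.

```python
import re

# ASSUMPTION: canonical G4 motif, G{3,} N{1,7} G{3,} N{1,7} G{3,} N{1,7} G{3,}.
# The abstract does not specify the motif definition actually used.
G4_SENSE = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
# A G4 on the opposite strand shows up as C-runs in the given sequence.
G4_ANTISENSE = re.compile(r"C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}")


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping motif hits.

    Coordinates are 0-based, end-exclusive, on the given sequence.
    Strand "+" means G-runs in the given sequence; "-" means C-runs.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", G4_SENSE), ("-", G4_ANTISENSE)):
        for match in pattern.finditer(seq):
            hits.append((match.start(), match.end(), strand))
    return sorted(hits)


def g4_density_per_mb(n_motifs, seq_length_bp):
    """Convert a motif count into motifs per megabase (1 Mb = 1,000,000 bp)."""
    return n_motifs / (seq_length_bp / 1_000_000)


if __name__ == "__main__":
    # Toy sequence with one motif on each strand.
    toy = "AT" + "GGGAGGGAGGGAGGG" + "TTTT" + "CCCTCCCTCCCTCCC" + "AT"
    for start, end, strand in find_g4_motifs(toy):
        print(start, end, strand, toy[start:end])
```

Because `finditer` reports only non-overlapping matches, this sketch can undercount in regions with many adjacent G-runs. Dedicated G4 prediction tools handle overlaps and scoring differently, so counts from this pattern should not be expected to reproduce the figures reported above.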