
Research Project: MaizeGDB: Enabling Access to Basic, Translational, and Applied Research Information

Location: Corn Insects and Crop Genetics Research

2023 Annual Report


Objectives
Objective 1: Accelerate maize trait analysis, germplasm analysis, genetic studies, and breeding through stewardship of maize genomes, genetic data, genotype data, and phenotype data.
Objective 2: Develop an infrastructure to curate, integrate, query, and visualize the genetic, genomic, and phenotypic relationships in maize germplasm.
Objective 3: Identify and curate key datasets for benchmarking genomic discovery tools for the functional annotation of maize genomes, for agronomic trait analyses, for breeding (including genome editing), and for improving database interoperability.
Objective 4: Provide community support services, training and documentation, meeting coordination, support for community elections and surveys, and support for the crop genome database community.
Objective 5: Collaborate with database developers and plant researchers to develop improved methods and mechanisms for open, standardized data and knowledge exchange to enhance database utility and interoperability.


Approach
The Maize Genetics and Genomics Database (MaizeGDB – http://www.maizegdb.org) is the model organism database for maize. MaizeGDB’s overall aim is to provide long-term storage, support, and stability to the maize research community’s data and to provide informatics services for access, integration, visualization, and knowledge discovery. The MaizeGDB website, database, and underlying resources allow plant researchers to understand basic plant biology, make genetic enhancements, facilitate breeding efforts, and translate those findings into products that increase crop quality and production. To accelerate research and breeding progress, generated data must be made freely and easily accessible. Curation of high-quality and high-impact datasets has been the foundation of the MaizeGDB project since its inception over 25 years ago. MaizeGDB serves as a two-way conduit for getting maize research data to and from our stakeholders. The maize research community uses data at MaizeGDB to facilitate its research, and in return, its published data are curated at MaizeGDB. The information and data provided at MaizeGDB and facilitated through outreach have been used directly in research that has had broad commercial, social, and academic impacts. The MaizeGDB team will make accessible high-quality, actively curated, and reliable genetic, genomic, and phenotypic description datasets. At the root of high-quality genome annotation lie well-supported assemblies and annotations. For this reason, we focus our efforts on benefitting researchers by developing a system to ensure long-term stewardship of both a representative reference genome sequence assembly with associated structural and functional annotations and additional reference-quality genomes that help represent the diversity of maize. In addition, we will enable researchers to access data in a customized and flexible manner by deploying tools that enable direct interaction with the MaizeGDB database.
Continued efforts to engage in education, outreach, and organizational needs of the maize research community will involve the creation and deployment of video and one-on-one tutorials, updating maize Cooperators on developments of interest to the community, and supporting the information technology needs of the Maize Genetics Executive Committee and Annual Maize Genetics Conference Steering Committee.


Progress Report
ARS scientists working on the Maize Genetics and Genomics Database (MaizeGDB) in Ames, Iowa, provide tools and resources that make well-curated maize genetics and genomics datasets useful for investigative research and crop improvement. The objectives of the MaizeGDB team are to provide stewardship of key datasets related to maize genetics, genomics, and breeding; to develop robust infrastructure to store, query, integrate, and visualize data; to curate high-quality, high-impact datasets; to interact with the maize research community to identify needs and priorities; and to work with other research communities and databases to coordinate on data standards and interoperability.
In support of Objective 1, MaizeGDB accelerated maize trait analyses, germplasm analyses, genetic studies, and breeding by actively stewarding maize genomes, pan-genomes, genetic data, genotype data, and phenotype data. The project successfully hosted and annotated over 100 reference-quality genome assemblies for maize or closely related species, including five versions of the widely used B73 representative genome. Additionally, MaizeGDB made diverse genotype data available, encompassing thousands of maize lines and individuals, thereby representing the broad diversity of the maize genome. MaizeGDB's pioneering work in hosting a maize pan-genome, accompanied by a suite of pan-genomic tools, highlighted the project's continuous efforts to refine methodologies for curating, hosting, and integrating large sets of genome assemblies. This approach facilitated a deeper understanding of the genetic variation and diversity within the maize genome, enabling breeders and researchers to carry out more comprehensive trait analyses, germplasm analyses, genetic studies, and breeding efforts.
To address Objective 2, MaizeGDB developed a robust infrastructure capable of curating, integrating, querying, and visualizing the genetic, genomic, and phenotypic relationships within maize germplasm. MaizeGDB provides over 50 genome browsers, enabling researchers to explore and compare recently released maize genomes. Furthermore, MaizeGDB curated numerous datasets associated with functional regions in the genome, allowing for visual exploration through user-friendly genome browser tools. The project also enhanced various tools for annotating gene structures, visualizing structural variation, storing gene-based features, and facilitating gene expression visualization and comparison, empowering researchers to conduct comprehensive genetic and genomic analyses.
For Objective 3, MaizeGDB identified and curated key datasets for benchmarking genomic discovery tools, agronomic trait analyses, breeding, and improving database interoperability. We curated an extensive collection of datasets (numbering over 1,000 to date) associated with functional regions in the genome, offering researchers valuable resources for making genomic discoveries. Furthermore, the project curated datasets supporting mutant maize populations, improved gene structures, and other datasets pertaining to functionally significant regions in the genome. Through collaborative tools co-developed with the University of Missouri, MaizeGDB enabled researchers to seamlessly integrate their own data with publicly available data, simplifying the process of data analysis and exploration.
Regarding Objective 4, MaizeGDB provided a range of community support services, including training, documentation, meeting coordination, and support for community elections and surveys. As the community hub for maize research, MaizeGDB facilitated improved communication, collaboration, and knowledge sharing among maize researchers worldwide. MaizeGDB actively supported new initiatives to foster a more diverse and inclusive scientific community, emphasizing equity and creating a welcoming environment for researchers. The project played a pivotal role in coordinating activities, providing technical support, and ensuring the dissemination of valuable information relevant to high-impact research.
For Objective 5, MaizeGDB actively collaborated with database developers and plant researchers to develop improved methods and mechanisms for open, standardized data and knowledge exchange, enhancing database utility and interoperability. By engaging with other databases and research communities, MaizeGDB successfully promoted data standards and interoperability between different platforms. These efforts enhance synergy with other plant research communities, promote cross-disciplinary collaborations, enrich our understanding of plant genomics, and refine MaizeGDB's curation practices. Furthermore, MaizeGDB contributed to the development of open and standardized data exchange mechanisms, ultimately enhancing the overall usability and accessibility of the database.


Accomplishments
1. Tools developed to streamline protein structure determination and improve comparisons across various species. Determining protein structures was previously time-consuming and costly, leading to a bottleneck in the field of structural biology. Without a protein structure, proteins are difficult to compare, which makes it challenging to understand their roles in gene function and impedes efforts to improve traits. ARS scientists in Ames, Iowa, at the Maize Genetics and Genomics Database (MaizeGDB) developed a suite of tools to overcome these obstacles and described them in a leading genetics journal. These tools employ machine learning for rapid 3-D protein structure prediction, streamlining protein comparisons within maize and across diverse species, from crops to humans and yeast. This paves the way for deeper exploration of maize's genetic diversity, accelerating research and enhancing our understanding of gene functions, ultimately driving advancements for crucial traits. These tools offer valuable information and support for maize researchers and breeders, overcoming limitations faced just a few years ago and contributing to enhanced research outcomes.


Review Publications
Woodhouse, M.H., Portwood II, J.L., Sen, S., Hayford, R.K., Gardiner, J.M., Cannon, E.K., Harper, L.C., Andorf, C.M. 2023. Maize protein structure resources at the maize genetics and genomics database. Genetics. 224(1). Article iyad016. https://doi.org/10.1093/genetics/iyad016.
Mural, R.V., Sun, G., Grzybowski, M., Tross, M.C., Jin, H., Smith, C., Newton, L., Andorf, C.M., Woodhouse, M.H., Thompson, A.M., Sigmon, B., Schnable, J.C. 2022. Association mapping across a multitude of traits collected in diverse environments in maize. GigaScience. 11. Article giac080. https://doi.org/10.1093/gigascience/giac080.
Cho, K., Sen, T.Z., Andorf, C.M. 2022. Predicting tissue-specific mRNA and protein abundance in maize: A machine learning approach. Frontiers in Artificial Intelligence. 5. Article 830170. https://doi.org/10.3389/frai.2022.830170.
Cagirici, B.H., Andorf, C.M., Sen, T.Z. 2022. Co-expression pan-network reveals genes involved in complex traits within maize pan-genome. BMC Plant Biology. 22. Article 595. https://doi.org/10.1186/s12870-022-03985-z.