Publication #415259

Research Project: MaizeGDB - Database and Computational Resources for Maize Genetics, Genomics, and Breeding Research

Location: Corn Insects and Crop Genetics Research

Title: A unified VCF dataset from nearly 1,500 diverse maize accessions and resources to explore the genomic landscape of maize

Author
Andorf, Carson
Ross-Ibarra, Jeffrey - University of California, Davis
Seetharam, Arun - Iowa State University
Hufford, Matthew - Iowa State University
Woodhouse, Margaret

Submitted to: bioRxiv
Publication Type: Pre-print Publication
Publication Acceptance Date: 5/5/2024
Publication Date: 5/5/2024
Citation: Andorf, C.M., Ross-Ibarra, J., Seetharam, A., Hufford, M., Woodhouse, M.H. 2024. A unified VCF dataset from nearly 1,500 diverse maize accessions and resources to explore the genomic landscape of maize. bioRxiv. https://doi.org/10.1101/2024.04.30.591904.
DOI: https://doi.org/10.1101/2024.04.30.591904

Interpretive Summary: In the field of maize genetics, understanding the diversity within the DNA of different maize varieties is crucial for crop improvement. However, comparing this diversity has been challenging due to differences in how data is collected and analyzed. This has hindered progress in identifying useful traits for breeding better maize varieties. To address this issue, the Maize Genetics and Genomics Database (MaizeGDB) teamed up with maize researchers to create a standardized method for analyzing genetic variations in maize. They focused on nearly 1,500 different maize varieties, including inbred lines, traditional varieties, and teosintes. By using a consistent approach and the latest B73 reference genome version, they were able to generate reliable data on genetic variations. To make accessing the data easier, a user-friendly web tool was created. This tool allows users to filter, visualize, and download genotype information in two ways. First, users can query the data based on specific genomic regions of interest, enabling targeted exploration of the genome. Second, users can select subsets of maize varieties, focusing on the varieties that meet their specific needs. These resources will allow scientists and breeders to easily access and compare genetic data, speeding up the process of identifying valuable traits for maize breeding.

Technical Abstract: Efforts to capture and analyze maize nucleotide diversity have ranged widely in scope, but differences in the reference genome versions and software algorithms used in these efforts inhibit comparison. To address these continuity issues, the Maize Genetics and Genomics Database (MaizeGDB) has collaborated with researchers in the maize community to offer variant data from a diverse set of 1,498 inbred lines, traditional varieties, and teosinte lines, generated through a standardized variant-calling pipeline against version 5 of the B73 reference genome. The output was filtered for mapping quality, coverage, and linkage disequilibrium, and annotated based on variant effects relative to the B73 RefGen_v5 gene annotations. MaizeGDB has also updated a web tool to filter, visualize, and download genotype sets based on genomic locations and accessions of interest. MaizeGDB plans to host regular updates of these resources as additional resequencing data become available, with the goal of expanding to all publicly available sequence data.
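As a rough illustration of the two query modes described above (by genomic location and by accession subset), the following Python sketch filters plain-text VCF lines. It is not the MaizeGDB web tool's implementation: the function name `filter_vcf`, the toy records, and the genotype values are all hypothetical and serve only to show the idea on a VCF-style table.

```python
from typing import Iterable, Iterator, List, Sequence


def filter_vcf(
    lines: Iterable[str],
    chrom: str,
    start: int,
    end: int,
    accessions: Sequence[str],
) -> Iterator[str]:
    """Yield VCF lines limited to one region (1-based, inclusive) and a sample subset.

    Hypothetical helper for illustration only; it assumes uncompressed,
    tab-delimited VCF text with a #CHROM header line.
    """
    keep_cols: List[int] = []
    header_seen = False
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##"):
            yield line  # meta-information lines pass through unchanged
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Columns 0-8 are fixed (CHROM through FORMAT); samples start at index 9.
            samples = fields[9:]
            missing = [a for a in accessions if a not in samples]
            if missing:
                raise ValueError(f"accessions not found in header: {missing}")
            keep_cols = [9 + samples.index(a) for a in accessions]
            header_seen = True
            yield "\t".join(fields[:9] + [fields[i] for i in keep_cols])
            continue
        if not header_seen:
            raise ValueError("data line encountered before the #CHROM header")
        pos = int(fields[1])
        if fields[0] == chrom and start <= pos <= end:
            yield "\t".join(fields[:9] + [fields[i] for i in keep_cols])


# Toy input: made-up positions and genotypes, not taken from the MaizeGDB dataset.
TOY_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tB73\tMo17\tW22",
    "chr1\t150\t.\tA\tG\t50\tPASS\t.\tGT\t0/0\t1/1\t0/1",
    "chr1\t900\t.\tC\tT\t50\tPASS\t.\tGT\t0/0\t0/0\t1/1",
    "chr2\t150\t.\tG\tA\t50\tPASS\t.\tGT\t0/0\t1/1\t1/1",
]

if __name__ == "__main__":
    # Region query (chr1:100-200) combined with an accession subset (Mo17, W22).
    for record in filter_vcf(TOY_VCF, "chr1", 100, 200, ["Mo17", "W22"]):
        print(record)
```

For a dataset of this size, an indexed, compressed VCF queried with a dedicated tool would likely be more practical than line-by-line parsing; the sketch only makes the region and accession filters concrete.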