Research Project: Championing Improvement of Sorghum and Other Agriculturally Important Species through Data Stewardship and Functional Dissection of Complex Traits

Location: Plant, Soil and Nutrition Research

Title: Advancing agricultural genomics: Integrating rsIDs for standardized genetic variation and enhanced breeding strategies

Author
WEI, SHARON - Cold Spring Harbor Laboratory
TELLO-RUIZ, MARCELA - Cold Spring Harbor Laboratory
KUMAR, VIVEK - Cold Spring Harbor Laboratory
OLSON, ANDREW - Cold Spring Harbor Laboratory
CHOGULE, KAPEEL - Cold Spring Harbor Laboratory
KIM, SUYUN - Cold Spring Harbor Laboratory
Ware, Doreen

Submitted to: Cold Spring Harbor Meeting
Publication Type: Abstract Only
Publication Acceptance Date: 11/13/2024
Publication Date: N/A
Citation: N/A

Interpretive Summary:

Technical Abstract: Genetic variation is fundamental to agricultural innovation, enabling the development of crops that are resilient, nutritious, and adaptable to changing environments. Understanding and using this variation is crucial for current breeding programs and future advances. However, the absence of standardized identifiers for genetic variants across plant, animal, and insect accessions hampers data interoperability, functional annotation, and the comparison of genetic information across individuals within the same species, which in turn hinders meaningful insights. In the biomedical field, Reference SNP cluster IDs (rsIDs) have become the standard for cataloging genetic variants, facilitating data integration and cross-referencing across studies. Agriculture stands to benefit greatly from adopting a similar system. With the NIH no longer hosting non-human variation data, the European Variation Archive (EVA) has stepped in, assigning millions of rsIDs to agriculturally significant crops such as corn, rice, soybean, and grapevine. Genetic variants can therefore be identified by rsID, independent of genome assembly, which streamlines data aggregation, improves phenotype prediction, and enhances trait-based marker discovery. We have integrated rsIDs into the USDA SorghumBase and Gramene Pan Genome resources for Sorghum (41M), Rice (27M), Maize (78M), and Grape (0.3M). Our initial focus is on reducing the computational challenges associated with the growing number of accessions. We anticipate that many crop species will soon have up to 100 reference assemblies and over 10,000 individuals sequenced at low coverage, making it impractical to call genetic variants on each genome. Mapping rsIDs from a reference genome to pan-genomes offers a more efficient solution. Using EVA's variation mapping pipeline, we achieved 98% mapping accuracy across different assembly versions of the same accession and around 87% across Sorghum pan-genomes.
We leverage rsIDs to facilitate comparisons between accessions within species, assign variant effect predictions—such as loss of protein function—and anchor population studies. This initiative not only advances the use of genetic variation in plants but also sets the stage for wider adoption of rsIDs in agriculture. Collaborating with the broader community, we aim to establish rsIDs as a standard in agricultural genomics, enhancing database interoperability and the power of genetic markers in breeding programs. Funding: USDA ARS (8062-21000-051-00D).
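As an illustration of why assembly-independent identifiers simplify comparison, the following minimal Python sketch joins variant records from two assemblies by rsID and reports the fraction of reference rsIDs that were carried over. It is not EVA's variation mapping pipeline and does not reproduce the accuracy figures reported above: the Variant record, the function names, the rsIDs, and the coordinates are all hypothetical, and the carry-over fraction is only a simple proxy for how many identifiers survive remapping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal variant record keyed by an rsID."""
    rsid: str
    chrom: str
    pos: int
    ref: str
    alt: str


def remap_rate(source, target):
    """Fraction of source rsIDs that are also present in the target assembly."""
    if not source:
        return 0.0
    target_ids = {v.rsid for v in target}
    carried_over = sum(1 for v in source if v.rsid in target_ids)
    return carried_over / len(source)


def compare_by_rsid(source, target):
    """Pair records from two assemblies that share an rsID.

    Coordinates may differ between assemblies; the rsID is the join key.
    """
    target_index = {v.rsid: v for v in target}
    return {
        v.rsid: (v, target_index[v.rsid])
        for v in source
        if v.rsid in target_index
    }


# Made-up example data: four variants on a reference assembly, three of
# which were remapped to a second assembly at shifted coordinates.
reference_calls = [
    Variant("rs0001", "1", 10_500, "A", "G"),
    Variant("rs0002", "1", 20_300, "C", "T"),
    Variant("rs0003", "2", 5_100, "G", "A"),
    Variant("rs0004", "3", 77_000, "T", "C"),
]
pangenome_calls = [
    Variant("rs0001", "1", 10_620, "A", "G"),
    Variant("rs0002", "1", 20_480, "C", "T"),
    Variant("rs0004", "3", 76_950, "T", "C"),
]

rate = remap_rate(reference_calls, pangenome_calls)
shared = compare_by_rsid(reference_calls, pangenome_calls)

print(f"carried over: {rate:.0%}")
for rsid, (src, dst) in sorted(shared.items()):
    print(rsid, f"{src.chrom}:{src.pos}", "->", f"{dst.chrom}:{dst.pos}")
```

Because the join key is the rsID rather than a chromosome and position pair, the same comparison works even when coordinates shift between assemblies, which is the property the abstract relies on.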