Location: Plant, Soil and Nutrition Research
Title: Advancing agricultural genomics: Integrating rsIDs for standardized genetic variation and enhanced breeding strategies
Author:
WEI, SHARON - Cold Spring Harbor Laboratory
TELLO-RUIZ, MARCELA - Cold Spring Harbor Laboratory
KUMAR, VIVEK - Cold Spring Harbor Laboratory
OLSON, ANDREW - Cold Spring Harbor Laboratory
CHOGULE, KAPEEL - Cold Spring Harbor Laboratory
KIM, SUYUN - Cold Spring Harbor Laboratory
Ware, Doreen
Submitted to: Cold Spring Harbor Meeting
Publication Type: Abstract Only
Publication Acceptance Date: 11/13/2024
Publication Date: N/A
Citation: N/A
Interpretive Summary:
Technical Abstract: Genetic variation is fundamental to agricultural innovation, enabling the development of crops that are resilient, nutritious, and adaptable to changing environments. Understanding and utilizing this variation are crucial for current breeding programs and future advancements. However, the absence of standardized identifiers for genetic variants across plant, animal, and insect accessions creates challenges for data interoperability, functional annotation, and the comparison of genetic information across individuals within the same species, hindering meaningful insights. In the biomedical field, Reference SNP cluster IDs (rsIDs) have become the standard for cataloging genetic variants, facilitating seamless data integration and cross-referencing across studies. Agriculture stands to benefit greatly from adopting a similar system. With the NIH ceasing to host non-human variation data, the European Variation Archive (EVA) has stepped in, assigning millions of rsIDs to agriculturally significant crops such as corn, rice, soybean, and grapevine. This allows genetic variants to be identified by rsIDs, independent of genome assemblies, streamlining data aggregation, improving phenotype prediction, and enhancing trait-based marker discovery. We have integrated rsIDs into USDA SorghumBase and Gramene Pan Genome resources for Sorghum (41M), Rice (27M), Maize (78M), and Grape (0.3M). Our initial focus is on reducing the computational challenges associated with the growing number of accessions. We anticipate that many crop species will soon have up to 100 reference assemblies and over 10,000 individuals sequenced at low coverage, making it impractical to call genetic variants on each genome. Mapping rsIDs from a reference genome to pan-genomes offers a more efficient solution.
Using EVA's variation mapping pipeline, we achieved 98% mapping accuracy across different assembly versions of the same accession and around 87% across Sorghum pan-genomes. We leverage rsIDs to facilitate comparisons between accessions within species, assign variant effect predictions (such as loss of protein function), and anchor population studies. This initiative not only advances the use of genetic variation in plants but also sets the stage for wider adoption of rsIDs in agriculture. Collaborating with the broader community, we aim to establish rsIDs as a standard in agricultural genomics, enhancing database interoperability and the power of genetic markers in breeding programs. Funding: USDA ARS (8062-21000-051-00D).