Publication : USDA ARS

Publication #411376

Research Project: Championing Improvement of Sorghum and Other Agriculturally Important Species through Data Stewardship and Functional Dissection of Complex Traits

Location: Plant, Soil and Nutrition Research

Title: Standardizing biocuration of genetic variation data to promote FAIRification

Author

	TELLO-RUIZ, MARCELA - Cold Spring Harbor Laboratory
	ALI, KAZIM - University of Karachi
	Ali, Gul Shad
	Bassil, Nahla
	BEIER, SEBASTIAN - IBG-4 Bioinformatics
	Bushakra, Jill
	COBO-SIMON, IRENE - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria
	Ware, Doreen
	WEI, SHARON - Cold Spring Harbor Laboratory
	CEZARD, TIMOTHEE - EMBL-EBI
	DYER, SARAH - EMBL-EBI
	Gutierrez, Osman
	Harrison, Melanie
	HUMANN, JODI - Washington State University
	KUMAR, VIVEK - Cold Spring Harbor Laboratory
	Nelson, Rex
	SALAVATI, MAZDAK - Roslin Institute
	SHEEHAN, MOIRA - Cornell University

Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 1/12/2024
Publication Date: N/A
Citation: N/A

Interpretive Summary:

Technical Abstract: The Standards for Genetic Variation Data Working Group of the AgBioData Consortium brings together a community of biocurators, data providers, bioinformaticians, and computer scientists engaged in agricultural research. Late this year, the Public Genetic Resources Working Group merged with our group. Our working group’s primary tasks have evolved into harmonizing standards for genotypic and phenotypic variation data, and promoting their adoption, across diverse platforms in the plant and animal kingdoms. Additionally, the group aims to promote interoperability and facilitate access to these datasets for researchers and breeders. Thanks to the FAANG (Functional Annotation of ANimal Genomes) project, there has been considerable progress in the adoption and dissemination of metadata standards for animal genetic variants. In plants, the first guidelines for findable, accessible, interoperable, and reusable (FAIR) handling of genetic variants were published in 2022. That work involved direct collaboration with EMBL-EBI, one of the International Nucleotide Sequence Database Collaboration (INSDC) pillars, to support data submission to BioSamples and the European Variation Archive (EVA) global repository. A preliminary checklist for classifying and validating data and metadata was provided, a significant step toward improving data availability. The Standards for Genetic Variation Data Working Group has broadened these guidelines with recommendations to crosslink sample identifiers with agricultural resources, specifically germplasm repositories such as USDA-ARS GRIN (Germplasm Resources Information Network)-Global. The group also suggests including synonyms for common sample names and traceable population panel associations. We surveyed the AgBioData community, namely species-specific and clade-wide databases, germplasm repositories, and independent data producers.
The goal was to gather information on existing and anticipated genetic variation data sets to facilitate the adoption of standards and promote interoperability between resources. In addition, we identified new challenges, such as the lack of reference genome assemblies in an INSDC repository and genetic variation data that are not publicly available in a standard format (e.g., a VCF file), and discussed potential solutions and sustainability workflows. These include adapting and further developing tools used to address similar problems encountered previously with human data. We will showcase how such challenges are being addressed. Progress toward the above objectives, together with training for data generators who submit data to public repositories, is critical to making genetic variation data more FAIR for agroscience. Future plans are to link variation data sets with phenotypic data to support association studies and advance breeding approaches.
