Publication : USDA ARS

Publication #412048

Research Project: Championing Improvement of Sorghum and Other Agriculturally Important Species through Data Stewardship and Functional Dissection of Complex Traits

Location: Plant, Soil and Nutrition Research

Title: Leveraging Biomedicine Resources to Understand Single-Cell Data in Agriculture: The FAANG Experience

Author

	KAPOOR, MUSKAN - Iowa State University
	VENTURA, ENRIQUE SAPENA - EMBL-EBI
	YORDANOVA, GALABINA - EMBL-EBI
	GEORGE, NANCY - EMBL-EBI
	Ware, Doreen
	KUMARI, SUNITA - Cold Spring Harbor Laboratory
	TICKLE, TIMOTHY - University of Missouri
	ELSIK, CHRISTINE - University of Missouri
	WALSH, AMY - University of Missouri
	TUGGLE, CHRISTOPHER - Iowa State University
	BURDETT, TONY - EMBL-EBI
	HARRISON, PETER - EMBL-EBI
	PAPATHEODOROU, IRENE - EMBL-EBI

Submitted to: Meeting Abstract
Publication Type: Abstract Only
Publication Acceptance Date: 1/12/2024
Publication Date: N/A
Citation: N/A

Interpretive Summary:

Technical Abstract: The agricultural genomics community has numerous data submission standards available but little experience describing and storing single-cell (e.g., scRNAseq) data. Other single-cell genomics infrastructure efforts, such as the Human Cell Atlas Data Coordination Platform (HCA DCP), have resources that could benefit our community. For example, the HCA DCP is integrated with Terra, a cloud-native workbench for computational biology developed by Broad, Verily, and Microsoft that houses tools for scGenomics analysis. We describe a pilot-scale project to determine whether the current metadata standards for livestock and crops can be used to ingest scRNAseq datasets in a manner consistent with HCA DCP standards, and whether established resources (e.g., Terra) can be used to analyze the ingested data. Currently, the most comprehensive data ingestion portal for high-throughput sequencing datasets from plants, fungi, protists, and animals/humans is Annotare (located at the EMBL-European Bioinformatics Institute), which ensures that sufficient metadata are collected to enable re-analysis and dissemination via the Single Cell Expression Atlas (SCEA). Annotare supports user-directed annotation and processing of submitted data, which can then be searched via the SCEA and transferred to the Galaxy analysis space. For animal datasets, another EMBL-EBI resource, the FAANG portal, provides access to bulk and scRNAseq data. scRNAseq data and metadata can be submitted to FAANG using a semi-automated process. We have extended this tool for scRNAseq data so that files can be validated and ingested using the HCA DCP metadata and data ingestion service and then transferred to Terra for further analysis. Once incorporated, datasets will augment the DCP resource for the scientific community.
In an extension of these efforts, we developed and tested prototype tools, built on the JBrowse platform, to visualize the output of scRNAseq analyses on genome browsers and to compare it across tissues and cell populations. JBrowse now features distinct tracks showcasing PBMC scRNA-seq alongside two bulk RNA-seq experiments. Notably, the scRNA-seq track shows a bar chart for each gene that illustrates gene expression levels across different cell types. We have also created a Shiny-based web application, Shiny-PIGGI, for the single cell-level transcriptomic study of pig immune tissues and peripheral blood mononuclear cells, which will be an important resource for improved annotation of porcine immune genes and cell types. Shiny-PIGGI (https://shinypiggi.ansci.iastate.edu) is implemented completely in R, runs in any modern web browser, and requires no programming. The tool thus increases accessibility by eliminating the technical training otherwise needed to work with Seurat objects and the related R packages commonly used in scRNAseq analysis. Our main goal was to develop interactive web applications that allow users such as animal scientists and immunologists to visualize and analyze biological datasets. We intend to build upon these existing tools to construct a scientist-friendly data resource and analytical ecosystem that facilitates single cell-level genomic analysis through data ingestion, storage, retrieval, re-use, visualization, and comparative annotation across agricultural species.
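
As an illustration only, the per-gene bar chart described for the scRNA-seq track can be thought of as a summary of expression grouped by cell type. The sketch below is not the authors' implementation: the function name, the record layout, the toy values, and the choice of the mean as the summary statistic are all assumptions made for the example, since the abstract does not state how the bars are computed.

```python
from collections import defaultdict
from statistics import mean


def mean_expression_by_cell_type(cells, gene):
    """Return {cell_type: mean expression of `gene`} over the given cells.

    `cells` is a hypothetical list of per-cell records, each with a
    "cell_type" label and an "expression" dict mapping gene -> value.
    """
    values = defaultdict(list)
    for cell in cells:
        # Assumption: a gene absent from a cell's record counts as zero.
        values[cell["cell_type"]].append(cell["expression"].get(gene, 0.0))
    return {cell_type: mean(vals) for cell_type, vals in values.items()}


# Toy, made-up values; not taken from any real dataset.
cells = [
    {"cell_type": "B cell", "expression": {"CD79A": 4.0, "CD3E": 0.0}},
    {"cell_type": "B cell", "expression": {"CD79A": 2.0}},
    {"cell_type": "T cell", "expression": {"CD3E": 3.0}},
    {"cell_type": "T cell", "expression": {"CD3E": 5.0, "CD79A": 1.0}},
]

# One bar height per cell type for the chosen gene.
bars = mean_expression_by_cell_type(cells, "CD79A")
print(bars)
```

Each entry of `bars` would correspond to one bar in such a chart. A real track would more likely be derived from a processed object (for example a Seurat object) than from plain dictionaries.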
