Location: Genomics and Bioinformatics Research
Project Number: 6066-21310-006-013-S
Project Type: Non-Assistance Cooperative Agreement
Start Date: Sep 23, 2020
End Date: Sep 22, 2025
Objective:
The Institute for Genomics, Biocomputing & Biotechnology (IGBB) at Mississippi State University (MS State) and the ARS’s Genomics & Bioinformatics Research Unit (GBRU) will explore the plant family Malvaceae (mallows) using genomic and related omics techniques, including DNA sequencing, transcriptomics, proteomics, and metabolomics. Malvaceae is a species-rich clade that includes important agronomic species such as cotton, cacao, okra, and jute, as well as numerous ornamentals, including lindens, mallows, hollyhocks, and hibiscuses. The proposed work builds upon previous Malvaceae research by the IGBB and GBRU. Goals of the current project include (but are not limited to) [a] generation of de novo genome sequences for all three species in the genus Kokia, and detailed DNA-based exploration of diversity in this endangered cotton-related clade; [b] genome sequencing of balsa wood (Ochroma pyramidale) and kola nut (Cola acuminata); [c] study of American basswood (Tilia americana), the only linden species native to North America, and comparison of its genome, transcriptome, and proteome with those of the European small-leaved linden (T. cordata) and large-leaved linden (T. platyphyllos); [d] comparative genomic analysis of cacao (Theobroma) species; and [e] study of the microbiomes and pathogens/pests associated with select Malvaceae species. All of the research will draw on the IGBB’s expertise in computational biology and bioinformatics. The research, which will be conducted in collaboration with several ARS groups and academic scientists, will provide genomic information that can be integrated into breeding programs, used in plant improvement, and leveraged to illuminate evolutionary relationships within and among species.
Approach:
Pacific Biosciences (PacBio, long-read) and/or Oxford Nanopore Technologies (ONT, long-read) sequencing technologies will be used for genome sequencing. Illumina (short-read) sequencing will be used for transcriptome sequencing, genome resequencing, genotyping via double-digest restriction-site associated DNA sequencing (ddRADseq), and Hi-C sequencing. RT-qPCR will be used to validate RNA-seq-based differential gene expression results. Proteomics and metabolomics data will be generated using the IGBB’s LTQ Orbitrap Velos mass spectrometer. The IGBB will use its supercomputing capacity and expertise to conduct genome assembly, SNP calling, and comparative genomics research.