Location: Plant Genetics Research
2018 Annual Report
Objectives
Objective 1: Develop and make available new approaches to evaluate gene functions in gene networks and verify these tools by examining previously identified gene networks in soybean.
Objective 2: Discover, characterize, and make available genes for industry-relevant protein and oil traits from new and existing genetic populations created through various methods, such as fast neutrons, conventional crossing, reverse genetics (TILLING), or mining exotic diversity contained in the USDA National Plant Germplasm System.
Approach
We will apply a genome-wide reverse engineering approach to reconstruct a gene regulatory network in soybean using in-house generated and publicly available transcriptome sequencing data. An eQTL mapping analysis will be conducted with seed transcriptome sequencing and genome sequencing data from wild and cultivated soybean genotypes to identify trans-acting eQTL and reveal the relationships between candidate regulatory genes/alleles and their associated genes. The reconstructed gene regulatory network, the regulatory relationships generated from the eQTL analysis, and the co-expression gene network that we previously modeled will be compared to evaluate each regulatory relationship (edge) and generate a consensus soybean seed gene regulatory network. A set of CRISPR/Cas9 genome editing vectors targeting a regulatory gene (hub) will be constructed to alter its regulatory function in “transgenic” soybean and thereby validate its regulatory functions in the network.
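The edge-by-edge comparison of the three networks can be illustrated with a minimal sketch. It assumes each network is reduced to a set of (regulator, target) pairs and that an edge enters the consensus when at least two of the three sources support it. The gene names, the `consensus_edges` helper, and the `min_support` threshold are hypothetical and are not taken from the project. A co-expression network is undirected in practice, so treating its edges as directed pairs is a simplification.

```python
from collections import Counter

def consensus_edges(networks, min_support=2):
    """Return edges supported by at least `min_support` input networks.

    `networks` is an iterable of edge collections; each edge is a
    (regulator, target) tuple. The result maps each retained edge to the
    number of networks that contain it.
    """
    counts = Counter()
    for edges in networks:
        # Count an edge once per network, even if it is listed repeatedly.
        counts.update(set(edges))
    return {edge: n for edge, n in counts.items() if n >= min_support}

# Hypothetical edges from the three sources (illustrative names only).
reverse_engineered = {("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_B", "gene3")}
eqtl_derived = {("TF_A", "gene1"), ("TF_B", "gene3"), ("TF_C", "gene4")}
coexpression = {("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_C", "gene5")}

consensus = consensus_edges([reverse_engineered, eqtl_derived, coexpression])
for (regulator, target), support in sorted(consensus.items()):
    print(f"{regulator} -> {target}: supported by {support} networks")
```

A simple vote count is only one possible way to score edges; a weighted scheme that trusts some evidence types more than others would fit the same structure.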
In addition, a set of big data analysis methodologies and data mining strategies will be developed to integrate the large amount of publicly available and in-house generated QTL mapping data, transcriptome and genome sequencing data, the soybean seed gene regulatory networks predicted above, and seed storage reserve-related metabolic pathways in order to identify putative genes/alleles that cause variation in oil and/or protein content in soybean. We will sequence transcriptomes of soybean seeds carrying different alleles of a putative gene to determine the transcriptome response to the allelic variation, both to validate the gene's regulatory function and to provide insight into its mode of action in regulating oil and/or protein production in seeds.
Progress Report
Searched public databases for downloadable soybean transcriptome and genome sequencing entries. Identified a total of 2801 soybean transcriptome sequencing entries and 2550 whole genome sequencing entries from diverse research experiments. Currently organizing and sorting the data to download biologically relevant sequencing data and remove redundant entries, while also downloading the well-annotated transcriptome and genome sequencing data. So far, assembled a collection of transcriptome sequencing data from 946 biological samples and whole genome sequencing data from 1519 soybean accessions, which were generated by the ARS laboratory and our collaborators or downloaded from public databases.
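The removal of redundant entries could be sketched as follows, assuming each database entry is a record with a unique run identifier. The field names (`run_id`, `source`) and the example identifiers are hypothetical; the report does not describe the actual metadata or the criteria used to judge redundancy.

```python
def deduplicate(entries, key="run_id"):
    """Keep the first entry seen for each identifier, preserving order."""
    seen = set()
    unique = []
    for entry in entries:
        identifier = entry[key]
        if identifier not in seen:
            seen.add(identifier)
            unique.append(entry)
    return unique

# Hypothetical metadata records (illustrative identifiers only).
entries = [
    {"run_id": "RUN001", "source": "public"},
    {"run_id": "RUN002", "source": "in-house"},
    {"run_id": "RUN001", "source": "public"},  # duplicate of the first record
]

unique_entries = deduplicate(entries)
print(len(unique_entries), "unique entries retained")
```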
Accomplishments