
Research Project: Soybean Seed Improvement Through Translational Genomics, Assessments of Elemental Carbon Metabolism, and Lipid Profiles

Location: Plant Genetics Research

2023 Annual Report


Objectives
Objective 1: Develop novel analytical methods to understand the dynamics that underpin lipid metabolism to guide metabolic engineering efforts for lipid production in seeds.

Objective 2: Assess central carbon metabolism in altered plant tissues and develop strategies that can be used to assess plant metabolic changes for improving agriculturally relevant seed composition traits or yield.

Objective 3: Develop and make available new approaches to evaluate gene functions in gene networks and verify these tools by examining previously identified gene networks in soybean.

Objective 4: Discover, characterize, and make available genes for industry-relevant protein and oil traits from new and existing genetic populations created through various methods, such as fast neutrons, conventional crossing, reverse genetics (TILLING), or mining exotic diversity contained in the USDA National Plant Germplasm System.


Approach
Goal 1.1: Quantify major acyl-acyl carrier protein (acyl-ACP) species of fatty acid biosynthesis in soybeans. We will develop biochemical methods with mass spectrometry to rigorously quantify acyl-ACPs. Acyl-ACPs connect central metabolism with lipid metabolism, so their levels will indicate when acyl-ACP synthesis may be limiting lipid production under different circumstances. This will be examined further through isotopic labeling and measurement of labeled acyl-ACPs.

Goal 1.2: Quantify labeling in phospholipid and neutral lipid pools. We will isotopically label seeds and investigate the labeling in phospholipid and neutral lipid intermediates that we hypothesize are most indicative of specific pathway use for lipid production and that can inform future engineering of increased lipid production. The mass spectrometry methods will involve optimization with high-resolution instruments.

Goal 2: Analyze labeling in organic and amino acid pools in developing soybeans. We will build a platform to transiently label seeds with 13C over short durations (minutes to hours) to investigate the allocation of carbon during specific stages of seed development. These stages are important because they contribute to the final seed composition. Methods to rigorously analyze important intermediates, including amino acids and organic acids, will include fragment evaluation with direct-injection mass spectrometry and validation with standards before differences among seeds of different ages are quantified.

Goal 3: Demonstrate that expression QTL (eQTL) genetic mapping is an effective approach to evaluate regulatory functions of genes in a co-expression network. An eQTL mapping analysis will be conducted with seed transcriptome sequencing and genome sequencing data from wild and cultivated soybean genotypes to identify trans-acting eQTL, reveal the relationships between candidate regulatory genes/alleles and their associated genes, and evaluate each regulatory relationship (edge) to generate a consensus soybean seed gene regulatory network. A set of CRISPR/Cas9 genome editing vectors for a regulatory gene (hub) will be constructed to alter its regulatory function in “transgenic” soybean and thereby validate its regulatory functions in the network.

Goal 4: Establish that integration of structural and functional genomic analysis of genetic soybean diversity with QTL studies is an effective approach to discovering seed quality genes and alleles. Big data analysis methodologies and data mining strategies will be developed to integrate QTL mapping data, transcriptome and genome sequencing data, and soybean seed gene regulatory networks with seed storage reserves and metabolic pathways to identify putative genes/alleles that cause the variation in oil and/or protein content in soybean. We will sequence transcriptomes of soybean seeds containing different alleles of a putative gene to validate regulatory function and provide insight into regulation of oil and/or protein production in seeds.
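
The eQTL idea in Goal 3 can be illustrated with a minimal sketch. It assumes the simplest possible association test: an ordinary least-squares regression of a gene's expression on allele dosage (0, 1, or 2 copies of the alternate allele) at one variant. The function names, the 1 Mb cis window, and the input layout are illustrative assumptions and are not taken from the project's pipeline, which would also need covariates, multiple-testing control, and population-structure correction.

```python
from statistics import mean


def eqtl_effect(dosages, expression):
    """Return (slope, r_squared) for expression regressed on allele dosage.

    dosages: allele counts per accession (0, 1, or 2); an assumed encoding.
    expression: expression values for one gene, same accession order.
    """
    if len(dosages) != len(expression) or len(dosages) < 3:
        raise ValueError("need matched vectors with at least 3 samples")
    d_bar, e_bar = mean(dosages), mean(expression)
    sxx = sum((d - d_bar) ** 2 for d in dosages)
    syy = sum((e - e_bar) ** 2 for e in expression)
    sxy = sum((d - d_bar) * (e - e_bar) for d, e in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("dosage or expression has no variance")
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def is_trans(snp_chrom, snp_pos, gene_chrom, gene_pos, cis_window=1_000_000):
    """Label a variant-gene pair as trans-acting.

    A pair is called trans when the variant is on another chromosome or
    farther than cis_window bases from the gene. The 1 Mb default is a
    hypothetical threshold chosen only for illustration.
    """
    if snp_chrom != gene_chrom:
        return True
    return abs(snp_pos - gene_pos) > cis_window
```

In a network setting, each significant trans pair would correspond to a candidate edge from a regulatory gene near the variant to the gene whose expression it explains.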


Progress Report
This is the final report covering the five-year life of this project, which terminated in July 2023. It has been replaced by 5070-21000-045-000D, “Improving Soybean Seed Composition, Plant Productivity, and Resilience to Climate Change Through Biological Network Modification”.

In support of Objective 1, analytical methods were developed to enable studies of fatty acid biosynthesis in plants and applied to soybeans and other oilseeds to better understand the biochemical production of oil in seeds. Acyl-acyl carrier protein is integral to this process as a scaffold protein that enables fatty acid biosynthesis. Methods were tested and refined to sensitively measure the levels of acyl-acyl carrier proteins in soybeans and other oilseeds that have been modified to produce higher oil content. The seeds were investigated over multiple stages of development to establish whether acyl-acyl carrier protein levels, and thus fatty acid biosynthesis, correlated with levels of accumulated seed lipids. Lipids in oilseeds, including soybean, partially turn over during the later stages of seed development, reducing the value of the seed. Examining the production of fatty acids quantitatively can therefore help determine whether lipid production drops off late in development or whether high rates of lipid turnover are instead responsible for changes in lipid level. Methods and results from Objective 1 are important for improving seed composition and can enhance the value of soybean and other oilseeds as raw material sources for biofuels.

Research related to Objective 2 involved measuring, over the course of the project, the metabolites that lead to the production of value-added seed components such as storage protein and oil. Early in the project, effort was focused on developing a method to accurately measure protein levels in seeds. Though methods exist to measure protein, most rely on correlations to standards that become inaccurate when seed composition changes dramatically, and they are not adequate for measuring composition while seeds are developing. A rigorous method for protein measurement, based on quantification of the protein-hydrolyzed amino acids with liquid chromatography mass spectrometry, was developed and used to measure protein over seed development. Later in the project, methods were developed to identify and more sensitively quantify intermediates from multiple biochemical pathways within plant cells. Seeds that were labeled with stable isotopes at different times during development were assayed with these methods to assess the consequences of changes in metabolism for the final seed composition. Protein and lipid levels in seeds are negatively associated at maturity across many genetic lines; however, the production of oil and protein during development is concordant, and this work demonstrated that protein and oil can be increased by focusing on different metabolic events over the duration of seed filling.

In support of Objectives 3 and 4, a suite of bioinformatic pipelines was established and used to search public databases for soybean transcriptome and genome sequencing entries. Downloaded entries were checked for quality and included whole genome sequencing data of 12,000 accessions and 8,000 transcriptome sequencing entries representing 2,800 distinct biological treatments. The data were used to develop a comprehensive annotation of DNA single nucleotide polymorphisms (SNPs) and structural variation (SV) in the 12,000 accessions and of gene expression in 2,800 distinct tissues. By integrating the genomic and transcriptome sequencing data with a set of computational data analyses and mining strategies, scientists established a ‘big data’ driven technology platform. The platform allowed effective genotyping of genome regions, identification of sequence variation within genes in the 12,000 accessions, determination of expression patterns for genes in 2,800 distinct tissues, identification of the DNA sequence variations associated with traits and gene expression through genome-wide association studies, and inference of gene expression and biological process networks. A set of candidate genes was identified that potentially regulate seed quality and yield traits, including seed protein, oil, amino acid, fatty acid, and carbohydrate content and seed weight. Soybean lines with altered activities of these genes were generated through gene editing and over-expression, resulting in the identification of a set of germplasm lines and “transgenic” plants with low levels of trypsin inhibitors, high protein, high oil, or higher protein and oil in combination.
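
The protein measurement described above, based on quantifying amino acids released by hydrolysis, can be sketched as a simple mass balance. The sketch below is an illustration under stated assumptions, not the project's published method. It assumes amino acid amounts are already quantified in micromoles and converts each to the mass it contributed as a peptide-bonded residue (free amino acid mass minus one water). The function name and input layout are hypothetical. The table omits Asn, Gln, Cys, and Trp, which typically need separate handling after acid hydrolysis.

```python
# Average residue masses in g/mol (free amino acid mass minus one water).
RESIDUE_MASS = {
    "Gly": 57.05, "Ala": 71.08, "Ser": 87.08, "Pro": 97.12,
    "Val": 99.13, "Thr": 101.10, "Leu": 113.16, "Ile": 113.16,
    "Asp": 115.09, "Lys": 128.17, "Glu": 129.12, "Met": 131.19,
    "His": 137.14, "Phe": 147.18, "Arg": 156.19, "Tyr": 163.18,
}


def protein_mg_per_g(amino_acid_umol, sample_mass_g):
    """Estimate protein content (mg per g sample) from hydrolysate data.

    amino_acid_umol: mapping of three-letter code -> micromoles measured
        in the hydrolysate of the whole sample (hypothetical layout).
    sample_mass_g: mass of the seed tissue that was hydrolyzed.
    """
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    unknown = set(amino_acid_umol) - set(RESIDUE_MASS)
    if unknown:
        raise KeyError(f"no residue mass for: {sorted(unknown)}")
    # micromoles * g/mol = micrograms of residue
    total_ug = sum(RESIDUE_MASS[aa] * n for aa, n in amino_acid_umol.items())
    return total_ug / 1000.0 / sample_mass_g
```

Because the estimate is built from measured residues instead of a calibration against reference seed, it does not depend on the seed having a typical composition, which is the limitation of correlation-based methods noted above.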


Accomplishments
1. Development of soybean lines with increased lipid content for biofuels by manipulating an important biochemical step, malic enzyme, in central metabolism. Soybean value is established by the levels of protein and oil in the seed. The levels of these reserves are established through cellular metabolic activities of important enzymes. ARS researchers in St. Louis, Missouri, augmented soybean metabolism to increase lipid content through expression of malic enzyme, which scientists have deduced provides precursors for fatty acid production. The malic enzyme was targeted to two subcellular locations, one of which resulted in increased lipid content. This work will increase oil content in soybeans, which is valuable to breeders and industry alike and increases the market for soybeans. The increased oil can be used for biofuels; thus, this work benefits efforts to sustainably produce renewable biofuels that can supplant petroleum-derived fuels.

2. Development of software tools to enable quantification of isotope-labeled lipid metabolism studies to improve soybean oil production. Lipid metabolism is extremely complex, and methods are needed to assess which biochemical pathways influence production of lipids in seeds. Isotopic tracers, combined with tracking metabolites over time using mass spectrometry, allow the elucidation of the metabolic pathways that contribute to lipid production, but the analysis of this kind of data is complex. ARS researchers in St. Louis, Missouri, developed a software tool and pipeline to quantify the isotope enrichment differences in lipid metabolites over time, facilitating studies of lipid metabolic pathways. The tools will be useful for understanding and quantifying lipid metabolism and will enable new strategies to improve lipid production in soybeans for a variety of purposes.

3. A data analysis platform to facilitate soybean genetic selection. With the advance of high-throughput technologies, scientists have generated massive amounts of biological data on soybeans that have the potential to improve soybean production and profitability. However, the data are underutilized because an integrative data analysis platform has been lacking. ARS scientists in St. Louis, Missouri, consolidated, characterized, and analyzed whole genome sequencing data of 12,000 diverse soybean accessions and 8,000 transcriptome sequences of 2,800 distinct biological tissues available in public repositories, establishing a big-data-driven technology platform for the consolidated data. The platform identified sequence variations in any of 56,000 soybean genes across the accessions and revealed the genes' expression patterns in the tissues. The scientists also demonstrated the platform's utility to soybean genetics, breeding, and molecular biology research by using it to discover a set of genes regulating seed quality and yield traits. The consolidated and analyzed datasets are available at SoyBase (https://soybase.org) and Ag Data Commons (https://data.nal.usda.gov/). The platform enhances our ability to translate the huge amount of genomic data into biological discovery and crop improvement.

4. Discovery and interrelationships of key genes controlling seed protein, oil, and yield for developing new strategies to improve soybean seed quality and yield. Seed protein content, oil content, and yield account for the economic value of soybean. As seed protein increases, seed oil and yield generally decrease, which poses a great challenge to improving all three traits simultaneously. ARS scientists in St. Louis, Missouri, discovered that a transcription factor gene (POWR1: Protein, oil, weight regulator 1) and a sucrose transport gene (SWEET39) are the genes responsible for controlling soybean protein content. Both genes play a significant role in controlling seed protein, oil, weight, and their correlation. Mutated alleles were associated with substantially increased seed oil content, weight, and yield and with decreased protein. The discoveries provide insights into the molecular basis of economically important traits and how they relate to each other, and will enable new strategies to improve seed quality and yield in soybean.
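
The isotope enrichment quantification mentioned in Accomplishment 2 can be illustrated with a minimal sketch. It is not the ARS software tool. It assumes the input is a mass isotopologue distribution for one lipid metabolite (signal for M+0, M+1, ... M+n, where n is the number of carbons that can carry label) and computes the average fractional 13C enrichment at each time point. The function names and input layout are hypothetical, and correction for natural isotope abundance, which a real pipeline would need, is deliberately left out.

```python
def fractional_enrichment(isotopologue_signal):
    """Average fraction of labeled carbons in one metabolite.

    isotopologue_signal[i] is the measured intensity of the M+i
    isotopologue, for i = 0..n labelable carbons (assumed layout).
    No natural-abundance correction is applied in this sketch.
    """
    n_carbons = len(isotopologue_signal) - 1
    total = sum(isotopologue_signal)
    if n_carbons < 1 or total <= 0:
        raise ValueError("need at least M+0 and M+1 with positive signal")
    labeled = sum(i * s for i, s in enumerate(isotopologue_signal))
    return labeled / (n_carbons * total)


def enrichment_time_course(signal_by_time):
    """Return [(time, enrichment), ...] sorted by time.

    signal_by_time: mapping of labeling time -> isotopologue intensities.
    """
    return [
        (t, fractional_enrichment(signal))
        for t, signal in sorted(signal_by_time.items())
    ]
```

Comparing such time courses between lipid pools is one simple way to see which pool acquires label first, which is the kind of precursor-product reasoning that labeling studies rely on.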


Review Publications
Koley, S., Chu, K.L., Mukherjee, T., Morley, S.A., Klebanovych, A., Czymmek, K.J., Allen, D.K. 2022. Metabolic synergy in Camelina reproductive tissues for seed development. Science Advances. 8(43). Article eabo7683. https://doi.org/10.1126/sciadv.abo7683.
Rashid, R., Nair, Z.J., Chia, D.M., Chong, K.K., Gassiot, A.C., Morley, S.A., Allen, D.K., Chen, S.L., Chng, S.S., Wenk, M.R., Kline, K.A. 2023. Depleting cationic lipids involved in antimicrobial resistance drives adaptive lipid remodeling in Enterococcus faecalis. mBio. 14(1). https://doi.org/10.1128/mbio.03073-22.
Diers, B., Specht, J., Graef, G., Song, Q., Rainey, K.M., Ramasubramanian, V., Liu, X., Myers, C., Stupar, R., An, Y., Beavis, W. 2023. Genetic architecture of protein and oil content in soybean seed and meal. The Plant Genome. 16(1). Article e20308. https://doi.org/10.1002/tpg2.20308.
Morley, S.A., Ma, F., Alazem, M., Frankfater, C., Yi, H., Burch-Smith, T., Clemente, T.E., Veena, V., Nguyen, H., Allen, D.K. 2023. Expression of malic enzyme reveals subcellular carbon partitioning for storage reserve production in soybeans. New Phytologist. 239(5):1834-1851. https://doi.org/10.1111/nph.18835.