
Research Project: Characterization of Genetic Diversity in Soybean and Common Bean, and Its Application toward Improving Crop Traits and Sustainable Production

Location: Soybean Genomics & Improvement Laboratory

2021 Annual Report


Objectives
Objective 1: Discover QTL and genes controlling biotic and abiotic stress tolerance, and agronomic and quality traits in soybean and common bean, and develop new DNA markers that define haplotype variation across new and previously identified genomic regions. [NP301, C1, PS1A; C3, PS3B] The aim of Objective 1 is to develop community resources for efficient identification of genes/QTL impacting a range of traits and to facilitate marker-assisted selection of alleles in soybean and common bean in collaboration with breeders. These resources include highly polymorphic markers, core germplasm collections, and genotypic datasets of new exotic elite germplasm introduced to the USDA Soybean Germplasm Collection.

Objective 2: Evaluate diverse soybean populations developed from hybridization with wild soybean to discover unique QTL controlling seed protein and oil content, develop molecular markers, and make these available to breeders for improving soybean quality. [NP301, C1, PS1A; C3, PS3B] Because many wild soybean accessions may carry alleles controlling high protein and oil content that differ from those in cultivated soybean, we will explore wild soybean for the improvement of U.S. soybean seed protein and oil content, using the markers developed under Objective 1 and genomic tools previously developed in our laboratory.

Objective 3: Characterize genetic diversity of the Soybean Rhizobium Germplasm Collection using whole genome sequencing, evaluate nitrogen fixation efficiency of the core strains, and use the information to identify rhizobium genes associated with host-specific nodulation and nitrogen fixation in specific soybean genotype/rhizobium symbioses. [NP301, C1, PS1A; C3, PS3B] Genetic diversity of the rhizobia will be evaluated using genomic information, and their influence on nitrogen fixation efficiency in soybean will be analyzed.
The research will result in the identification of efficient strains and genes for enhanced nitrogen fixation in soybean, resulting in better utilization of the diversity of rhizobium strains and soybean ancestors to improve biological nitrogen fixation in commercial soybean cultivars.


Approach
Objective 1: Solexa short genomic DNA sequences from 16 diverse genotypes of different common bean market classes will be aligned to the common bean whole genome sequence (WGS) for SSR marker discovery. After filtering, primer sets will be designed to amplify the SSRs. A subset of 100 primer pairs will be randomly selected and tested for polymorphism using genomic DNA from the 16 diverse common bean genotypes. A total of 12 pairs of diverse genotypes from different market classes of the Andean Diverse Panel of common bean will be sequenced. Called SNPs will be filtered based on a number of factors for suitability in a BeadChip assay. SNPs that are polymorphic within multiple market classes will be added to the Illumina Infinium BARCBean6K_3 BeadChip pool or used as KASP markers to fine-map genes/QTL in targeted genomic regions. Based on the SNP data from the >18,000 cultivated soybean accessions assayed with the SoySNP50K BeadChip, core sets of soybean accessions will be created for each soybean maturity group. The software Core Hunter 3 will be used to select core collections with high allelic richness.

Objective 2: A nested association mapping panel consisting of 150-300 F6 lines from each of 10 crosses of NC-Raleigh x wild soybean accessions from the wild soybean core collection will be developed. The parents and the recombinant inbred lines (RILs) will be grown in the field at two locations in two years. DNA isolated from the RILs and parents will be genotyped with Illumina BARCSoySNP6K BeadChips. Protein content and oil content of the parents and lines will be measured using a DA 7250 NIR Analyzer. The dataset will be used to identify QTL, genes, and haplotypes controlling high seed protein and oil content in wild soybean that can be used for improving cultivated soybean, and to assess the prediction accuracy of genomic selection.

Objective 3: Genomic DNA of 760 soybean Bradyrhizobium strains will be isolated and sequenced using a NextSeq 500 sequencer. The resulting sequences will be aligned to the WGS of B. japonicum strain USDA110 for variant discovery. Redundant or highly similar strains, defined as those sharing 99.9% similarity, will be identified among the soybean rhizobia. One accession from each cluster of strains with 99.9% similarity will be evaluated for nitrogen fixation efficiency using 8 ancestral cultivars that contribute more than 70% of the genetic diversity of the Southern and Northern American elite cultivars. Plants will be measured for chlorophyll content and biomass with or without inoculation of the strains, and scored for vegetative growth relative to plants inoculated with USDA110, a recommended soybean strain. The test in the eight ancestors will be carried out in a greenhouse with replications.
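The Objective 3 step of grouping strains at 99.9% similarity and evaluating one accession per cluster could be sketched roughly as follows. This is an illustration only, not the project's pipeline: it assumes similarity is the fraction of matching calls across equal-length variant strings, uses single-linkage grouping, and picks the alphabetically first strain as the representative. The function names and strain labels are hypothetical.

```python
from itertools import combinations


def identity(a, b):
    """Fraction of positions at which two equal-length genotype strings agree."""
    if len(a) != len(b):
        raise ValueError("genotype strings must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cluster_representatives(genotypes, threshold=0.999):
    """Group strains by single linkage at `threshold`; return one name per cluster.

    `genotypes` maps a strain name to its string of variant calls.
    Assumption: the representative is the alphabetically first strain in a cluster.
    """
    names = sorted(genotypes)
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(names, 2):
        if identity(genotypes[a], genotypes[b]) >= threshold:
            ra, rb = find(a), find(b)
            if ra != rb:
                # Keep the smaller name as root so it ends up as representative.
                parent[max(ra, rb)] = min(ra, rb)

    return sorted({find(n) for n in names})


# Toy data: strain_2 differs from strain_1 at 1 of 2,000 sites (99.95% identical).
base = "A" * 2000
demo = {
    "strain_1": base,
    "strain_2": "C" + base[1:],
    "strain_3": "C" * 100 + base[100:],
}
print(cluster_representatives(demo))
```

Single linkage is only one possible choice here; a stricter rule (for example, requiring every pair in a cluster to meet the threshold) would give smaller clusters and more representatives to test.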


Progress Report
Progress was made to sequence diverse Andean common bean germplasm and analyze the sequences under Objective 1. A total of 52 diverse Andean accessions from different market classes (beige, dark red kidney, small red, yellow, red mottled, brown, purple mottled, cranberry, light red kidney, white, and purple speckled) of the Andean Diverse Panel (Cichy K., et al. 2015. Crop Science 55: 2149-2160) were selected for sequencing. Within each market class, accessions were selected to capture the largest genetic divergence, based on analysis of SNPs genotyped with the BARCBean6K_3 BeadChip. DNA from these common bean accessions was digested with fragmentase, followed by size selection of 300-500 bp fragments and ligation of adapters for indexing. Paired-end sequencing (150 bp) was run on the Illumina NextSeq 500. We obtained an average of 15 gigabases, or 25x genome coverage, per genotype. After aligning the reads to the common bean WGS v2.0 using the Burrows-Wheeler Aligner (BWA), a total of 6 million SNPs and 1 million indels among the 52 accessions were identified using Samtools software. From these, we selected a set of 6,000 SNPs that had a high minor allele frequency (>0.20) among the 52 Andean bean accessions and were evenly spaced within the euchromatic and heterochromatic regions, respectively. These SNPs were added to the existing BARCBean6K_3 beadpool containing 6,000 SNPs, bringing the total number of SNPs in the new beadpool to 12,000. Analyses of the SNP assay in different Andean populations in collaboration with USDA-ARS scientists at East Lansing, Michigan, Prosser, Washington, and Beltsville, Maryland, showed that the SNPs were highly polymorphic in both Andean and Mesoamerican populations.
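As a rough illustration of the SNP selection described above (minor allele frequency above 0.20, with even spacing along the genome), a minimal sketch might look like this. The report does not say how even spacing was achieved; the fixed-width window approach, the window size, the genotype coding, and all names below are assumptions made for the example.

```python
def minor_allele_frequency(calls):
    """MAF from diploid calls coded as alternate-allele counts (0, 1, 2).

    Missing calls are given as None and ignored.
    """
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / (2 * len(observed))
    return min(alt_freq, 1 - alt_freq)


def select_spaced_snps(snps, min_maf=0.20, window=100_000):
    """Keep at most one SNP per window on each chromosome, preferring higher MAF.

    `snps` is an iterable of (chromosome, position, calls) tuples.
    Assumption: fixed-width windows stand in for the "evenly spaced" criterion.
    """
    best = {}
    for chrom, pos, calls in snps:
        maf = minor_allele_frequency(calls)
        if maf <= min_maf:
            continue  # the report's threshold is MAF > 0.20
        key = (chrom, pos // window)
        if key not in best or maf > best[key][2]:
            best[key] = (chrom, pos, maf)
    return sorted(best.values())


# Hypothetical toy input: four SNPs scored on four accessions.
toy = [
    ("Chr01", 1_000, [0, 1, 2, 2]),
    ("Chr01", 5_000, [0, 0, 0, 2]),
    ("Chr01", 150_000, [0, 0, 0, 1]),
    ("Chr02", 42, [1, 1, None, 1]),
]
for chrom, pos, maf in select_spaced_snps(toy):
    print(chrom, pos, maf)
```

Running the selection separately on euchromatic and heterochromatic intervals, possibly with different window sizes, would be one way to mirror the region-specific spacing mentioned in the report.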
The common bean assay BARCBean12K was commercialized by Illumina Inc. and is being used by common bean researchers at research institutes, universities, and seed companies in the U.S., Brazil, and the Netherlands to discover genes controlling different traits in mapping populations, assist common bean breeding selection, and distinguish varieties in both Andean and Mesoamerican bean populations.

Progress was made to field test populations from crosses between cultivated soybean and wild soybean to discover quantitative trait loci (QTL) controlling seed protein and oil content in wild soybean, and to genotype the populations, under Objective 2. Using the single seed descent method, a total of 10 G. max × G. soja families (NC-Raleigh × PI 549032, NC-Raleigh × PI 378684B, NC-Raleigh × PI 378690, NC-Raleigh × PI 378696B, NC-Raleigh × PI 407020, NC-Raleigh × PI 407228, NC-Raleigh × PI 424007, NC-Raleigh × PI 424045, NC-Raleigh × PI 424083A, and NC-Raleigh × PI 562551) with a common G. max parent, NC-Raleigh, were developed. More than 1,000 recombinant inbred lines (RILs) derived from the ten G. max × G. soja crosses were grown in North Carolina with one replication and in Beltsville, Maryland, with two replications. Protein content and oil content for >3,000 plots were obtained in collaboration with researchers at the University of Georgia. In addition, all lines and parents were genotyped with the BARCSoySNP6K assay, and genomic variants from the parents were obtained via whole-genome sequencing analysis. To impute the dataset accurately and to fine-map QTL controlling protein and other traits, the parameters of three commonly used imputation programs (Impute 5.0, Beagle 5.0, and AlphaPlantImpute) were optimized. The effects of population factors, such as the marker density of individual RILs, the extent of linkage disequilibrium, minor allele frequency, and genetic map distance vs. physical distance, on imputation accuracy were also explored.
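The report does not state how imputation accuracy was measured. One common approach, shown here purely as an assumed illustration, is to hide a fraction of known genotype calls, impute them, and score the share recovered correctly. The `fill_with_mode` function is a deliberately naive stand-in for a real imputation program, and every name in this sketch is hypothetical.

```python
import random
from collections import Counter


def mask_genotypes(genotypes, fraction, seed=0):
    """Hide a random fraction of calls; return the masked copy and hidden indices."""
    rng = random.Random(seed)
    n_hidden = round(len(genotypes) * fraction)
    hidden = sorted(rng.sample(range(len(genotypes)), n_hidden))
    masked = list(genotypes)
    for i in hidden:
        masked[i] = None
    return masked, hidden


def fill_with_mode(masked):
    """Naive stand-in imputer: replace each missing call with the most common call."""
    observed = [g for g in masked if g is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if g is None else g for g in masked]


def concordance(truth, imputed, hidden):
    """Share of hidden calls that the imputed data recovered exactly."""
    if not hidden:
        raise ValueError("no masked genotypes to score")
    return sum(truth[i] == imputed[i] for i in hidden) / len(hidden)


# Toy marker scored on ten lines (0 and 2 are the two homozygous classes).
truth = [0, 0, 0, 0, 0, 0, 0, 0, 2, 2]
masked, hidden = mask_genotypes(truth, fraction=0.3, seed=1)
imputed = fill_with_mode(masked)
print(concordance(truth, imputed, hidden))
```

Repeating this masking test across marker densities or minor allele frequency classes is one way the population factors listed above could be compared.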
Preliminary testing of the pipeline and methods on a nested association mapping (NAM) population demonstrated that the imputed dataset significantly narrowed the QTL intervals compared with the un-imputed dataset.

Progress was made in the characterization of genetic diversity of the USDA Rhizobium Germplasm Collection using whole-genome sequencing under Objective 3. About 500 soybean Bradyrhizobium strains have been cultured, and genomic DNA has been extracted from them. The DNA of these isolates will be sequenced once DNA extraction from the remaining isolates is complete.

Progress was made in the discovery of genes or QTL controlling disease resistance, agronomic traits, and seed composition traits in soybean and common bean. Molecular markers and assays developed by USDA-ARS scientists at Beltsville, Maryland, such as SoySNP50K and BARCSoySNP6K for soybean and BARCBean6K_3 and BARCBean12K for common bean, were used to analyze soybean and common bean genetic populations created by collaborators across the U.S. and in other countries. In soybean, the analyses resulted in the mapping of genomic regions or genes controlling numerous traits, including aluminum tolerance, seed protein and oil content on chromosome 20, resistance to seed and seedling rot caused by Pythium species, cytoplasmic male sterility, low seed coat deficiency, symbiotic compatibility between soybean and Bradyrhizobium strains, and seed size. They also led to the development of markers tagging low concentration of Kunitz trypsin inhibitor in soybean seeds and to genomic selection of seed composition. This work was done in collaboration with researchers at Virginia Tech, the University of Missouri, the Danforth Plant Science Center in St. Louis, and universities in China.
In common bean, the analyses led to the development of markers associated with resistance to bean golden yellow mosaic virus, anthracnose, and angular leaf spot, and with post-processing color retention in black bean, and to the development of common bean lines with increased cysteine and methionine concentration. This work was done in collaboration with researchers at USDA-ARS, Prosser, Washington; the University of Michigan; Agriculture and Agri-Food Canada, Morden, Manitoba, Canada; and universities in Brazil.


Accomplishments
1. Creation of a series of efficient tools for soybean genetics and breeding research. Although high-throughput sequencing technology can genotype a large number of plant breeding materials, the information it provides is often more detailed than researchers need, and analyzing the data costs them a large amount of money and time. In many cases, a small set of informative, high-quality markers is enough to screen breeding lines in early generations, perform genomic prediction, and map genes controlling traits. USDA-ARS scientists at Beltsville, Maryland, developed two assays, BARCSoySNP3K and SoySNP1K, containing 3K and 1K DNA markers, respectively. The 3K assay was commercialized by Illumina Inc., and the 1K assay was commercialized by Agriplex Genomics, Cleveland, Ohio. These affordable assays are expected to shorten breeding cycles and accelerate soybean trait mapping and improvement, significantly impacting private and public soybean breeding programs.

2. Identification of the gene controlling a major protein content locus in soybean. Soybean, one of the most important crops globally, was domesticated from wild soybean and has been further improved as a dual-use seed crop to provide highly valuable oil and protein for human consumption and animal feed. Several previous studies reported a major locus controlling protein and oil content on chromosome 15; however, the gene underlying the locus was unknown. In this study, USDA-ARS scientists at Beltsville, Maryland, and St. Louis, Missouri, analyzed 631 soybean whole genome sequences and determined that a sucrose transporter gene controls seed protein and oil content as well as seed weight in soybean. This is the first report of the gene responsible for the major locus controlling protein and oil content on chromosome 15. Comprehensive knowledge of the molecular basis of the traits controlled by this locus is valuable for scientists designing new strategies for soybean seed quality improvement through breeding and biotechnological approaches.

3. Soybean gene restricting nitrogen fixation identified. Nitrogen is the most critical nutrient requirement for crop production. Legume crops like soybean can derive most of the nitrogen required for optimal growth and yield from nitrogen-fixing bacteria known as rhizobia. Although scientists know how rhizobia establish connections to form symbiotic root nodules in plants, they still do not understand why this is possible. USDA-ARS scientists, in collaboration with researchers at Huazhong Agricultural University, China, analyzed soybean genomes and found a gene, GmNNL1, that is responsible for the number of root nodules formed by rhizobia. These findings will be helpful to soybean scientists and breeders at government agencies, universities, and private institutes who want to improve soybean cultivation with less fertilizer application.


Review Publications
Rosso, L., Shang, C., Escamilla, D.M., Gillenwater, J., Song, Q., Zhang, B. 2021. Development of breeder-friendly KASP markers for low concentration of kunitz trypsin inhibitor in soybean seeds. International Journal of Molecular Sciences. 22:2675. https://doi.org/10.3390/ijms22052675.
Zhang, H., Goettel, W., Song, Q., Jiang, H., Hu, Z., Wang, M.L., An, Y. 2020. Selection of GmSWEET39 for oil and protein improvement in soybean. PLoS Genetics. 16(11):e1009114. https://doi.org/10.1371/journal.pgen.1009114.
Zhang, B., Wang, M., Sun, Y., Zhao, P., Liu, C., Qing, K., Hu, X., Zhong, Z., Cheng, J., Wang, H., Peng, Y., Shi, J., Zhuang, L., Du, S., He, M., Wu, H., Liu, M., Chen, S., Wang, H., Chen, X., Fan, W., Tian, K., Wang, Y., Chen, Q., Wang, S., Dong, F., Yang, C., Zhang, M., Song, Q., Li, Y., Wang, X. 2021. Glycine max NNL1 restricts symbiotic compatibility with widely distributed bradyrhizobia via root hair infection. Nature Plants. 7:73-86. https://doi.org/10.1038/s41477-020-00832-7.
Soler-Garzon, A., Oladzadabbasabadi, A., Beaver, J., Beebe, S., Lee, R., Lobaton, J., Macea, E., Mcclean, P., Raatz, B., Rosas, J.C., Song, Q., Miklas, P.N. 2021. NAC candidate gene marker for bgm-1 and interaction with QTL for resistance to Bean golden yellow mosaic virus in common bean. Frontiers in Plant Science. 12. Article 628443. https://doi.org/10.3389/fpls.2021.628443.
Costa, L.C., Nalin, R.S., Dias, M.A., Ferreira, M.E., Song, Q., Pastor Corrales, M.A., Hurtado-Gonzales, O.P., Souza, E.A. 2020. Different loci control resistance to different isolates of the same race of Colletotrichum lindemuthianum in common bean. Theoretical and Applied Genetics. 134:543-556. https://doi.org/10.1007/s00122-020-03713-x.
Clevinger, E., Biyashev, R., Lerch, E., Yu, H., Quigley, C.V., Song, Q., Dorrance, A., Robertson, A., Maroof, S. 2021. Identification of quantitative disease resistance loci towards four Pythium species in soybean. Frontiers in Plant Science. 12:514. https://doi.org/10.3389/fpls.2021.644746.
Beche, E., Gillman, J.D., Song, Q., Nelson, R.L., Beissinger, T., Decker, J., Shannon, G., Scaboo, A.M. 2021. Genomic prediction using training population design in interspecific soybean populations. Molecular Breeding. 41. Article e15. https://doi.org/10.1007/s11032-021-01203-6.
Valliyodan, B., Brown, A.V., Wang, J., Patil, G., Liu, Y., Otyama, P.I., Nelson, R., Vuong, T., Song, Q., Musket, T.A., Wagner, R., Marri, P., Reddy, S., Sessions, A., Wu, X., Grant, D.M., Bayer, P., Roorkiwal, M., Varshney, R.K., Liu, X., Edwards, D., Xu, D., Joshi, T., Cannon, S.B., Nguyen, H.T. 2021. Genetic variation among 481 diverse soybean accessions, inferred from genomic re-sequencing. Scientific Data. 8. Article 50. https://doi.org/10.1038/s41597-021-00834-w.
Ma, G., Song, Q., Li, X., Qi, L. 2020. High-density mapping and candidate gene analysis of Pl18 and Pl20 in sunflower by whole-genome resequencing. International Journal of Molecular Sciences. 21(24):9571. https://doi.org/10.3390/ijms21249571.
Ma, G., Long, Y., Song, Q., Talukder, Z.I., Shamimuzzaman, M., Qi, L. 2021. Map and sequence-based chromosome walking towards cloning of the male fertility restoration gene Rf5 linked to R11 in sunflower. Scientific Reports. https://doi.org/10.1038/s41598-020-80659-6.
Gilio, T., Hurtado-Gonzales, O.P., Goncalves-Vidigal, M.C., Valentini, G., Ferreira Elias, J.C., Song, Q., Pastor Corrales, M.A. 2020. Fine mapping of an anthracnose-resistance locus in Andean common bean cultivar Amendoim Cavalo. PLOS ONE. 15(10):e0239763. https://doi.org/10.1371/journal.pone.0239763.
Liu, J.Y., Zhang, Y.W., Han, X., Zuo, J.F., Zhang, Z.B., Shang, H.H., Song, Q., Zhang, Y.M. 2020. Evolutionary population structure model reveals pleiotropic effects of GmPDAT for seed oil- and size-related traits in soybean. Journal of Experimental Botany. 71(22):6988-7002. https://doi.org/10.1093/jxb/eraa426.
Song, Q., Yan, L., Quigley, C.V., Fickus, E.W., Wei, H., Chen, L., Dong, F., Arya, S., Liu, J., Hyten, D., Pantalone, V., Nelson, R.L. 2020. Soybean BARCSoySNP6K - An assay for soybean genetics and breeding research. Plant Journal. https://doi.org/10.1111/tpj.14960.
Zhu, Q., Escamilla, D.M., Wu, X., Song, Q., Li, S., Rosso, L., Lord, N., Xie, F., Zhang, B. 2020. Identification and validation of major QTLs associated with low seed coat deficiency of natto soybean seeds (Glycine max L.). Theoretical and Applied Genetics. 133(1):3165-3176. https://doi.org/10.1007/s00122-020-03662-5.
Bornowski, N., Song, Q., Kelly, J. 2020. QTL mapping of post-processing color retention in two black bean populations. Theoretical and Applied Genetics. https://doi.org/10.1007/s00122-020-03656-3.
Goncalves-Vidigal, M.C., Gilio, T., Valentini, G., Vaz-Bisneta, M., Vidigal Filho, P.S., Song, Q., Oblessuc, P.R., Melotto, M. 2020. New Andean source of resistance to anthracnose and angular leaf spot: fine-mapping of disease-resistance genes in California Dark Red Kidney common bean cultivar. PLOS ONE. 15(6):e0235215. https://doi.org/10.1371/journal.pone.0235215.