Research Project: Applied Agricultural Genomics and Bioinformatics Research

Location: Genomics and Bioinformatics Research

2023 Annual Report


Objectives
1. Advance and accelerate translational research for ARS and its collaborators that addresses the agricultural needs of primarily the Southeast region, through data generation, data analysis, and data management, with an emphasis on genomic approaches and on crop, animal, insect, and microbiome analyses; support germplasm analysis for breeding and for trait genetic and molecular analyses; and support gene expression analysis and gene discovery.
1.A. A cross section of GBRU operations in genomics and bioinformatics.
1.B. Specific ongoing collaborative projects.
1.C. Data Management.

2. Accelerate ARS bioinformatics community development and capacity building, primarily for the Southeast region, through training workshops, webinars, and direct project participation; develop and evaluate new tools, workflows, and systems that enable ARS and its collaborators to more efficiently manage, analyze, and share diverse streams of biological data and knowledge, including high throughput genotyping and phenotyping, thereby enhancing crop and animal genetic improvement, health, and nutrition.
2.A. Bioinformatics community development and capacity building.
2.B. Development of new tools and procedures.


Approach
The primary function of the Genomics and Bioinformatics Research Unit (GBRU) is to conduct research in bioinformatics and genomics on a wide array of species and topics. Genomic technologies are powerful tools for germplasm improvement using marker-assisted selection (MAS), biotechnology, or synthetic biology, and for analyzing associated biological processes (genetics, physiology, cell and molecular biology, biochemistry, and evolutionary biology). Many ARS scientists, e.g., crop and animal breeders, therefore have a direct need for genomic tools in their research. Others, e.g., soil scientists, can enhance their research dramatically by using genomic tools to analyze the microbiome, if the technologies and appropriate expertise are available. However, not all ARS locations have sufficient resources to support core genomic technologies. The mission of the GBRU is therefore to: (1) coordinate, facilitate, collaborate on, and conduct genomics and bioinformatics research emphasizing the Southeast region; (2) serve as a research and training resource for genomic technologies and bioinformatic analyses in support of ARS scientists and their collaborations; and (3) serve as a technical resource for ARS research programs that have not typically utilized these technologies, and aid in their development of genomic resources. Within the GBRU, this research project will conduct and collaborate on genome sequencing, sequence assembly and analysis, diversity analysis, marker development, haplotyping, physical and genetic map production, and transcription profiling research. Essential product development includes new and improved reference genomes for plants, animals, insects, fish, and microbes that enable genomics-assisted breeding; new physical and genetic maps; improved cultivars, germplasm, or breeding lines; and new information on key agricultural problems such as disease resistance and drought tolerance.


Progress Report
This is the final report for this project. Refer to new project 6066-21310-006-000 for additional information.

The Unit continues to develop new approaches and tools for applied research and breeding program activities. Two new tools were developed: 1) a machine learning and computer vision-based tool that automatically detects two cell shapes in young cotton fibers from microscope images, and 2) an R/Shiny application that streamlines downstream pre-processing of cotton genotyping data and automatically prepares raw data for upload to the CottonGen database. These are valuable tools to be integrated into modern breeding practices.

Breeding Insight OnRamp (BI OnRamp) has continued to support four commodities: citrus, sugarcane, soybean, and cotton. Sugarcane work has become long-term with the addition of specific funds to the in-house project starting in FY2023, which has led to the development of a cross-USDA initiative, the Sugarcane Integrated Breeding System. For all commodities supported by BI OnRamp, support continues to expand to prepare for using database applications for field analyses, archiving historical data, developing trait ontologies, and initiating development of advanced genotyping capabilities.

Pioneering passive genomic surveillance of pathogens for biodefense: When a new disease appears in the USA, containment can hinge on rapidly determining whether it has been spreading undetected. Before detection, pathogens can be incidentally sequenced along with genomes and metagenomes but never reported because those pathogens are not known or are irrelevant to the original study. These pathogens are buried in 14,000 terabytes of Sequence Read Archive (SRA) data held by the National Center for Biotechnology Information. SRA was not previously searchable by sequence. The Unit collaboratively developed indexing and search methods that can search over 700,000 metagenomes for a novel bacterial genome in about a minute.
The search tool also provides critical environmental data about each metagenome, summarized graphically, which enables rapid disease epidemiology.

A new tool has been developed to visualize simple repetitive structure within a genome. The Asian Giant Hornet has unique centromeric regions that were previously hard to characterize and place within the genome’s structure. The new tool has allowed for the development of whole chromosome assemblies and has defined centromeric DNA sequences never visualized in other genomes. The tool and the discovery of these unique centromeres should help in genome assemblies of other species and enhance studies on centromeres.

Twenty-five projects using advanced statistics, visual recognition, and/or advanced computer programming were developed across the Southeast Area with students and university mentors at the University of Texas at Arlington (a Hispanic-serving institution). Under this program, undergraduate and graduate students are exposed to agricultural problems and then use their computational abilities to help ARS scientists address research problems where such computational skillsets are not available in-house. The program represents a unique way to enhance ARS research while also enhancing the education of students.

Five-year summary: In addition to providing genomic services to its clients, the Unit at Stoneville, MS participated in many collaborative projects. Since late 2017, the Unit has co-published ~96 publications. The publications cover a wide range of topics and are representative of the work conducted by the Unit and its collaborators.

SCINet (ARS’s high-performance computing initiative) contributions by the Unit: During the past project plan, Unit members were leaders in the development of ARS’s high-performance computational infrastructure. One was the Chief Scientific Information Officer for the whole of ARS for three years and had responsibility for overall SCINet development and implementation.
Another was the chair of the Scientific Advisory Committee (SAC) for over five years. Another serves as the location SCINet Scientific Advisory contact. The SAC represents the scientific interests within SCINet and facilitates training, education, and software acquisition.

Genomic characterization of U.S. rice germplasm collections and deleterious load: For self-fertilizing crops, the century-spanning U.S. germplasm collection is an invaluable resource for understanding how selection for yield affects different mutations. Long-read sequencing information was used to characterize the entire mutational spectrum between rice varieties. Tracking these mutations through the last century of rice breeding showed that large structural mutations in exons are selected against at a greater rate than any other mutational class. These findings illuminate the nature of deleterious alleles and will guide attempts to predict variety vigor based solely on genomic information.

Development and implementation of Ag100Pest: In the summer of 2018, the Unit was approached to help develop Ag100Pest, a program to produce high-quality reference genomes for leading agricultural pests. Unit members have been critical contributors to the development and implementation of the program. Under Unit guidance, it was shown that a reference-quality genome could be generated from a single insect. The program has refined its techniques over the years, and DNA sequence data have now been generated for over 160 insect species, from very small insects (flour weevil) to large ones (Asian Giant Hornet), with genome sizes ranging from less than 300 Mbp to close to 9 Gbp.

Cultivated allotetraploid genomes sequenced with unprecedented accuracy: Earlier attempts to generate a peanut genome required using its progenitor diploid parents. While this was helpful, it did not provide a completely accurate template to study cultivated peanut.
Working with an international group, ARS researchers in Stoneville, Mississippi, contributed to a very high-quality genome of cultivated peanut. This information has been used by the peanut research community to associate DNA markers with important agricultural traits. Wild diploid peanut genomes are also sources of vital novel genetic variation. The Unit has worked with collaborators to quantify this utility and produced numerous wild peanut genomes for future utilization of synthetic allotetraploids.

Genomic diversifications of five Gossypium allopolyploid species and their impact on cotton improvement: Cotton research collaborations have led to the availability of high-quality reference genomes from five allotetraploid species, including both polyploid Pima and Upland species. This information will help ARS researchers investigate the small number of differences that are likely to be extremely important for the observed phenotypic differences among the species.

Software for agronomic advancement: Quantitative Insights into Microbial Ecology (QIIME) is the most popular software program for microbiome science, but the first version lacked the capacity to perform certain analyses of fungi. ARS researchers in Stoneville, Mississippi, wrote part of the software component that analyzes fungi, making it possible to identify and quantify the microscopic fungi in the environment. In addition, Unit members developed and published simulation software specifically designed for the structured populations used in plant breeding and crop improvement. Numerous additional software resources have been made available through the Unit's GitHub and/or as web applications.

High-quality spinach genomes and sex chromosome elucidation: A high-quality spinach genome that closely represented the chromosome structure of the plant allowed for studying whole genome duplications in the understudied euasterid group of flowering plants.
The spinach genome showed that this plant had undergone many duplications in the past (many millions of years ago), followed by extensive gene rearrangements. This evolutionary history had previously been difficult to analyze and identify without a high-quality reference genome. Additionally, 75 spinach lines were whole genome sequenced, identifying variants across the diversity of spinach. One important finding was that spinach germplasm is maintained as collections, and this needs to be accounted for when designing experiments and performing genetic analyses, especially when looking to identify genes responsible for a trait. Using the high-quality sequence, an additional sequence was developed from a super-male spinach plant, which allowed comparison and elucidation of spinach sex chromosome differentiation.

Preventing Dengue virus in Florida: Metagenomics by the Unit identified Dengue virus being transmitted transovarially in Florida prior to any human infection. To date, the U.S. public health system’s response to outbreaks has been largely reactive, but this research shows that by monitoring mosquito populations it may be possible to identify emerging mosquito-borne diseases in high-risk, high-tourism areas of the United States and enable proactive, targeted vector control before potential outbreaks. This work by the Unit led Miami-Dade Mosquito Control to adopt a prospective molecular monitoring approach and resulted in the detection of, and targeted spraying for, Dengue in Dade County in 2021.


Accomplishments
1. Expanding the breadth of peanut genomes. There are four major types of peanut cultivars grown in the USA, and each requires its own unique breeding program. To assist in breeding efforts, a high-quality genome from each type is highly desirable. Virginia-type cultivars are considered to produce premium peanuts, and under this project a genome for this type was produced. This genome has been integrated into PeanutBase (a public-facing database) and reflects the highest quality genome for the commodity to date. ARS researchers in Stoneville, Mississippi, released the genome along with a new cost-effective genotyping platform, which will enable breeding-program-scale genotyping activities.

2. Improved gene discovery through pangenomes. Disease genes can be very difficult to identify and characterize because they often represent areas of the genome that are duplicated. A new method was developed and utilized by ARS researchers in Stoneville, Mississippi, to map a unique disease resistance gene in melons. Working with ARS researchers in Charleston, South Carolina, two of the most complete melon genomes available in the world were developed and released. Using a newly developed computational approach, these genomes were used simultaneously for more accurate genetic analysis. The usefulness of the approach was exemplified by its ability to identify the causative genes for three different fungal resistance traits in melon within the same study, something that would not have been possible using older sequencing technology and bioinformatic methods.


Review Publications
Yu, S., Schoonmaker, A., Yan, L., Hulse-Kemp, A.M., Fontanier, C.H., Martin, D.L. 2022. Genetic variability and QTL mapping of winter survivability and leaf firing in African bermudagrass. Crop Science. https://doi.org/10.1002/csc2.20849.
Stahlke, A.R., Chen, J., Tembrock, L.R., Sim, S.B., Chudalayandi, S., Geib, S.M., Scheffler, B.E., Perera, O.P., Gilligan, T.M., Childers, A.K., Hackett, K.J., Coates, B.S. 2022. A chromosome-scale genome assembly of a Helicoverpa zea strain resistant to Bacillus thuringiensis Cry1Ac insecticidal protein. Genome Biology and Evolution. 15(3). Article evac131. https://doi.org/10.1093/gbe/evac131.
Vaughn, J.N., Branham, S.E., Abernathy, B.L., Hulse-Kemp, A.M., Rivers, A.R., Levi, A., Wechter, W.P. 2022. Graph-based pangenomics maximizes genotyping density and reveals structural impacts on fungal resistance in melon. Nature Communications. https://doi.org/10.1038/s41467-022-35621-7.
Kloppe, T., Whetten, R.B., Kim, S., Powell, O., Luck, S., Douchkov, D., Whetten, R., Hulse-Kemp, A.M., Balint Kurti, P.J., Cowger, C. 2023. Two pathogen loci determine Blumeria graminis f. sp. tritici virulence to wheat resistance gene Pm1a. New Phytologist. 238:1546-1561. https://doi.org/10.1111/nph.18809.
Waldbieser, G.C., Liu, S., Yuan, Z., Older, C.E., Gao, D., Shi, C., Bosworth, B.G., Li, N., Bao, L., Kirby, M.A., Jin, Y., Wood, M.L., Scheffler, B.E., Simpson, S.A., Youngblood, R.C., Duke, M.V., Ballard, L.L., Phillippy, A., Koren, S., Liu, Z. 2023. Reference genomes of channel catfish and blue catfish reveal multiple pericentric chromosome inversions. BMC Biology. 21:67. https://doi.org/10.1186/s12915-023-01556-8.
Gao, G., Waldbieser, G.C., Ramey, Y.C., Zaho, D., Pietrak, M.R., Stannard, J.A., Buchman, J.T., Scheffler, B.E., Peterson, B.C., Palti, Y., Rexroad III, C.E., Long, R., Burr, G.S., Milligan, M.T. 2023. The generation of the first chromosome-level de-novo genome assembly and the development and validation of a 50K SNP array for the St John River aquaculture strain of North American Atlantic salmon. G3, Genes/Genomes/Genetics. jkad138. https://doi.org/10.1093/g3journal/jkad138.
Davis, J.S., Sim, S.B., Geib, S.M., Scheffler, B.E., Linnen, C.R. 2023. Whole-genome resequencing data support a single introduction of the invasive white pine sawfly, Diprion similis. Journal of Heredity. 114(3):246-258. https://doi.org/10.1093/jhered/esad012.
Stahlke, A.R., Chang, J., Chudalayandi, S., Heu, C.C., Geib, S.M., Scheffler, B.E., Childers, A.K., Fabrick, J.A. 2023. Chromosome-scale genome assembly of the pink bollworm, Pectinophora gossypiella, a global pest of cotton. G3, Genes/Genomes/Genetics. 13(4). Article jkad040. https://doi.org/10.1093/g3journal/jkad040.
Newman, C.S., Andres, R.J., Youngblood, R.C., Campbell, J.D., Simpson, S.A., Cannon, S.B., Scheffler, B.E., Oakley, A.T., Hulse-Kemp, A.M., Dunne, J.C. 2023. Initiation of genomics-assisted breeding in Virginia-type peanuts through the generation of a de novo reference genome and informative markers. Frontiers in Plant Science. 13. Article 1073542. https://doi.org/10.3389/fpls.2022.1073542.
Valles, S.M., Zhao, C., Rivers, A.R., Iwata, R.L., Oi, D.H., Cha, D.H., Collignon, R., Cox, N.A., Morton, G.J., Calcaterra, L.A. 2023. RNA virus discoveries in the electric ant, Wasmannia auropunctata. Virus Genes. 59:276–289. https://doi.org/10.1007/s11262-023-01969-1.
Schoonmaker, A., Hulse-Kemp, A.M., Youngblood, R.C., Rahmat, M., Iqbal, M., Mehboob-Ur-, R., Kochan, K.J., Scheffler, B.E., Scheffler, J.A. 2023. Detecting cotton leaf curl virus resistance quantitative trait loci in Gossypium hirsutum and iCottonQTL a new R/Shiny app to streamline genetic mapping in cotton. Plants. https://doi.org/10.3390/plants12051153.
Huff, M., Babiker, E.M., Hulse-Kemp, A.M., Scheffler, B.E., Youngblood, R.C., Simpson, S.A., Staton, M. 2023. Long-read, chromosome-scale assembly of Vitis rotundifolia cv. Carlos and its unique resistance to Xylella fastidiosa subsp. fastidiosa. BMC Genomics. https://doi.org/10.1186/s12864-023-09514-y.
Chang, J., Marczuk-Rojas, J.P., Waterman, C., Garcia-Llanos, A., Chen, S., Ma, X., Hulse-Kemp, A.M., Van Deynze, A., Van De Peer, Y., Carretero-Paulet, L. 2022. Chromosome-scale assembly of the Moringa oleifera Lam. genome uncovers polyploid history and evolution of secondary metabolism pathways through tandem duplication. The Plant Genome. https://doi.org/10.1002/tpg2.20238.
Reddy, K.R., Bheemanahalli, R., Saha, S., Lokhande, S.B., Read, J.J., Jenkins, J.N., Raska, D.A., De Santiago, L., Hulse-Kemp, A.M., Vaughn, R.N., Stelly, D.M. 2020. High-temperature and drought-resilience traits among interspecific chromosome substitution lines for genetic improvement of Upland cotton. Plants. 9:1747. https://doi.org/10.3390/plants9121747.
Bertioli, D.J., Clevenger, J., Godoy, I., Stalker, T., Wood, S., Santos, J., Ballen-Taborda, C., Abernathy, B., Azevedo, V., Campbell, J.D., Chavarro, C., Chu, Y., Farmer, A.D., Fonceka, D., Gao, D., Grimwood, J., Halpin, N., Korani, W., Michelotto, M.D., Ozias-Akins, P., Vaughn, J.N., Youngblood, R., Moretzsohn, M.C., Wright, G.C., Jackson, S.A., Cannon, S.B., Scheffler, B.E., Leal-Bertioli, S.M. 2021. Legacy genetics of Arachis cardenasii in the peanut crop shows the profound benefits of international seed exchange. Proceedings of the National Academy of Sciences (PNAS). 118(38). Article e2104899118. https://doi.org/10.1073/pnas.2104899118.
Yu, J., Hulse-Kemp, A.M., Babiker, E.M., Staton, M. 2021. High-quality reference genome and annotation aids understanding of berry development for evergreen blueberry (Vaccinium darrowii). Horticulture Research. 8:228. https://doi.org/10.1038/s41438-021-00641-9.
Childers, A.K., Geib, S.M., Sim, S.B., Poelchau, M.F., Coates, B.S., Simmonds, T.J., Scully, E.D., Smith, T.P.L., Childers, C., Corpuz, R.L., Hackett, K.J., Scheffler, B.E. 2021. The USDA-ARS Ag100Pest Initiative: High-quality genome assemblies for agricultural pest insect research. Insects. 12(7):626. https://doi.org/10.3390/insects12070626.
Rivers, A.R., Grodowitz, M.J., Miles, G.P., Allen, M.L., Elliott, B., Weaver, M.A., Bon, M., Rojas, M.G., Morales Ramos, J.A. 2022. Gross morphology of diseased tissues in Nezara viridula (Hemiptera: Pentatomidae) and molecular characterization of an associated microsporidian. Journal of Insect Science. 22(2):4. https://doi.org/10.1093/jisesa/ieac013.
Billings, G.T., Jones, M.A., Rustgi, S., Bridges, W.C., Holland, J.B., Hulse-Kemp, A.M., Campbell, B.T. 2022. Outlook for implementation of genomics-based selection in public cotton breeding programs. Plants. 11(11). https://doi.org/10.3390/plants11111446.
Allen, M., Hulse-Kemp, A.M., Storm, A.R. 2022. Gossypium hirsutum gene of unknown function, Gohir.A02G044702.1, encodes a potential B3 transcription factor of the REM subfamily. microPublication Biology. https://doi.org/10.17912/micropub.biology.000574.
Spalink, D., Stoffel, K., Hill, T.A., Hulse-Kemp, A.M., Walden, G., Van Deynze, A., Bohs, L. 2018. Comparative transcriptomics and genomics patterns of discordance in Capsiceae (Solanaceae). Molecular Phylogenetics and Evolution. 126:293-302. https://doi.org/10.1016/j.ympev.2018.04.030.
Smith, M.W., Herfort, L., Rivers, A.R., Simon, H.M. 2019. Genomic signatures for sedimentary microbial utilization of phytoplankton detritus in a fast-flowing estuary. Frontiers in Microbiology. Volume 10, Article 2475. https://doi.org/10.3389/fmicb.2019.02475.
Arias De Ares, R.S., Cazon, I., Massa, A.N., Scheffler, B.E., Sobolev, V., Lamb, M.C., Duke, M.V., Simpson, S.A., Conforto, C., Paredes, J., Soave, J., Buteler, M., Rago, A.M. 2019. Mitogenome and nuclear-encoded fungicide-target genes of Thecaphora frezii, causal agent of peanut smut. Fungal Genomics and Biology. 9(1):160. https://doi.org/10.35248/2165-8056.19.9.160.
Rivers, A.R., Weber, K.C., Gardner, T.G., Liu, S., Armstrong, S. 2018. ITSxpress: software to rapidly trim internally transcribed spacer sequences with quality scores for marker gene analysis. F1000Research. 7:1418. https://doi.org/10.12688/f1000research.15704.1.
Gao, M., Glenn, A.E., Gu, X., Mitchell, T.R., Satterlee, T., Duke, M.V., Scheffler, B.E., Gold, S.E. 2020. Pyrrocidine, a molecular off switch for fumonisin biosynthesis. PLoS Pathogens. 16(7):e1008595. https://doi.org/10.1371/journal.ppat.1008595.