
Research Project: Management of Genetic Resources and Associated Information in the U.S. Potato Genebank

Location: Vegetable Crops Research

Title: Assessing SNP heterozygosity in potato (Solanum) species— bias due to missing and non-allelic genotypes

Author
Bamberg, John
Del Rio, Alfonso - University of Wisconsin
Louderback, Lisbeth - University of Utah
Pavlik, Bruce - University of Utah

Submitted to: American Journal of Potato Research
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/31/2021
Publication Date: 10/18/2021
Citation: Bamberg, J.B., Del Rio, A., Louderback, L., Pavlik, B. 2021. Assessing SNP heterozygosity in potato (Solanum) species— bias due to missing and non-allelic genotypes. American Journal of Potato Research. 98:374–383. https://doi.org/10.1007/s12230-021-09849-w.
DOI: https://doi.org/10.1007/s12230-021-09849-w

Interpretive Summary: Potato has a wealth of wild and cultivated relatives. The US keeps and distributes a collection of such populations for breeding and research at the US Potato Genebank at Sturgeon Bay, Wisconsin. Understanding patterns of genetic diversity would help genebank staff keep the maximum number of useful genes in the most efficient set of stocks. We examined a popular, relatively new system of DNA analysis as applied to many individuals within families of four potato species. Two factors that could bias conclusions were detected: a) when data are missing from some plants, the remaining plants that have data appear to be less genetically diverse, and b) some data points can be identified as artifacts and eliminated from the analysis because they show a pattern that could not reasonably happen in nature. When these adjustments to the data are made, previously reported claims that wild species populations are less genetically diverse than cultivated species are not supported, which highlights the genetic richness of wild species.

Technical Abstract: Potato has about 100 related wild Solanum species growing naturally in the Americas. The US Potato Genebank aims to keep samples useful for research and breeding to improve the crop, often in the form of botanical seed families. A key component of genebank efficiency is assessing diversity within and among populations, and DNA marker sequence diversity is a powerful proxy for trait diversity. We previously reported on three factors that can cause underestimation of heterozygosity: ascertainment, allele frequency, and ploidy bias. Here we report, using GBS data for four diploid potato species, that the average percentage of apparent heterozygosity increases as the data become more complete: the largest contrast observed was from 2% heterozygotes when only a few individuals were called to 36% when nearly all individuals were called. However, there was evidence that estimates of average heterozygosity based only on loci for which every individual has data can also be biased upward. Implausibly high levels of heterozygosity suggest non-segregating, non-homologous SNPs, which made up 5-9% of all loci with complete data. We propose that the best estimates of average heterozygosity in unselected seedlings should be based on loci with data for all samples, after eliminating those loci that appear to be artificially fixed as heterozygous; this elimination reduces observed heterozygote frequency by 16-26%. On that basis, the wild species examined have heterozygosity similar to that of the cultivated phureja.
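
The two-step filter proposed in the abstract (keep only loci with data for every sample, then drop loci that appear fixed as heterozygous) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The genotype coding (0 and 2 for the two homozygotes, 1 for a heterozygote, None for a missing call), the function names, and the max_het cutoff of 0.95 are all assumptions made for illustration; the abstract does not state the threshold actually used to flag implausible loci.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed coding: 0 and 2 = the two homozygotes, 1 = heterozygote, None = missing call.
Genotype = Optional[int]


def het_fraction(calls: Sequence[Genotype]) -> Optional[float]:
    """Fraction of called individuals that are heterozygous at one locus."""
    called = [g for g in calls if g is not None]
    if not called:
        return None  # no individual was called at this locus
    return sum(1 for g in called if g == 1) / len(called)


def mean_heterozygosity(
    loci: Sequence[Sequence[Genotype]],
    max_het: float = 0.95,  # hypothetical cutoff, not taken from the paper
) -> Tuple[Optional[float], int, int]:
    """Average heterozygosity over complete loci that are not flagged.

    Returns (mean heterozygosity, number of complete loci, number flagged).
    """
    # Step 1: keep only loci where every individual has a call.
    complete: List[Sequence[Genotype]] = [
        locus for locus in loci
        if len(locus) > 0 and all(g is not None for g in locus)
    ]
    # Step 2: drop loci where (nearly) every individual is heterozygous,
    # a pattern a segregating seed family would not be expected to show.
    kept = [locus for locus in complete if het_fraction(locus) < max_het]
    n_flagged = len(complete) - len(kept)
    if not kept:
        return None, len(complete), n_flagged
    mean = sum(het_fraction(locus) for locus in kept) / len(kept)
    return mean, len(complete), n_flagged


# Toy data: four loci (rows) scored on four seedlings (columns).
example_loci = [
    [0, 1, 1, 2],     # complete, half the seedlings heterozygous
    [1, 1, 1, 1],     # complete, every seedling heterozygous: flagged
    [0, None, 0, 0],  # one missing call: excluded in step 1
    [0, 0, 1, 0],     # complete, one heterozygote
]
print(mean_heterozygosity(example_loci))
```

In this toy data set three loci are complete, one of them is flagged, and the average is taken over the remaining two. A real analysis would need a cutoff justified by the family structure and sample size rather than the arbitrary value used here.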