Title: selectSNP – An R package for selecting SNPs optimal for genetic evaluation

Author
WU, XIAO-LIN - Geneseek Inc, A Neogen Company
MCQUISTAN, ADAM - Geneseek Inc, A Neogen Company
BAUCK, STEWART - Geneseek Inc, A Neogen Company
Wiggans, George

Submitted to: Plant and Animal Genome Conference Proceedings
Publication Type: Abstract Only
Publication Acceptance Date: 1/12/2015
Publication Date: 1/12/2015
Citation: Wu, X., McQuistan, A., Bauck, S., Wiggans, G.R. 2015. selectSNP – An R package for selecting SNPs optimal for genetic evaluation. Plant and Animal Genome Conference Proceedings. San Diego, CA, January 10-14, P0266.

Interpretive Summary:

Technical Abstract: The number of SNPs in public repositories has increased enormously. This makes it challenging to design low- and medium-density SNP panels, which requires careful selection among the available SNPs according to many criteria, such as map position, allele frequency, and possible biological function. An exhaustive search for an optimal SNP set under these criteria is computationally infeasible with current computing architectures and performance. The selectSNP package uses a heuristic local optimization algorithm, which we developed, to design optimal SNP panels for genetic evaluation, including genomic prediction. The package has several distinctive features. First, the distribution of SNP locations can be either uniform or non-uniform. In the latter case, SNPs are enriched to a varying extent at each end of the chromosomes, depending on the value of a tuning parameter. Second, the heuristic local optimization algorithm is designed to select sets of SNPs with optimal information content as well as good coverage of the genome. Lastly, it allows pre-inclusion of a list of “obligatory” SNPs, such as those used for parentage testing, breed identification, and various kinds of genetic diagnoses, or candidate loci (genes); the remaining SNPs are then optimized conditional on the inclusion of this “obligatory” list. The impact of this new algorithm on imputation and genomic prediction is shown with results from real data.
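
Illustrative sketch (not part of the abstract, and not the selectSNP API): the three features described above can be pictured with a small greedy selection on a single chromosome. The code is Python rather than R, and every name in it (`SNP`, `target_positions`, `select_panel`, `end_weight`, `dist_penalty`) is hypothetical. It assumes heterozygosity 2p(1-p) as a stand-in for "information content" and a cosine warp as a stand-in for the tuning parameter that enriches chromosome ends; the abstract does not say how the package defines either.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SNP:
    name: str
    pos: int     # base-pair position on the chromosome
    maf: float   # minor allele frequency, between 0 and 0.5


def target_positions(chrom_len, n, end_weight=0.0):
    """Return n target positions along a chromosome of length chrom_len.

    end_weight = 0 gives uniform spacing; values up to 1 pull targets
    towards both chromosome ends (an assumed form of the tuning parameter).
    """
    out = []
    for i in range(n):
        u = (i + 0.5) / n                              # uniform grid in (0, 1)
        warped = (1.0 - math.cos(math.pi * u)) / 2.0   # denser near 0 and 1
        out.append(chrom_len * ((1.0 - end_weight) * u + end_weight * warped))
    return out


def select_panel(snps, chrom_len, n, obligatory=(), end_weight=0.0,
                 dist_penalty=1.0):
    """Greedily pick up to n SNPs, conditional on the obligatory ones."""
    by_name = {s.name: s for s in snps}
    chosen = [by_name[name] for name in obligatory]    # pre-included SNPs
    window = chrom_len / n
    for t in target_positions(chrom_len, n, end_weight):
        if len(chosen) >= n:
            break
        # A target already covered by a chosen SNP needs no new SNP.
        if any(abs(s.pos - t) <= window / 2 for s in chosen):
            continue
        taken = {s.name for s in chosen}
        pool = [s for s in snps if s.name not in taken]
        if not pool:
            break

        def score(s):
            het = 2.0 * s.maf * (1.0 - s.maf)          # information proxy
            return het - dist_penalty * abs(s.pos - t) / chrom_len

        # Local decision: best trade-off of information and closeness to t.
        chosen.append(max(pool, key=score))
    return sorted(chosen, key=lambda s: s.pos)
```

Each target is resolved with a local decision, so the cost grows with the number of targets times the number of candidate SNPs instead of with the number of possible SNP subsets. The result is a heuristic panel, with no guarantee of global optimality.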