Research Project: Alleviating Rate Limiting Factors that Compromise Beef Production Efficiency

Location: Livestock and Range Research Laboratory

Title: Fuzzy logic as a strategy for combining marker statistics to optimize preselection of high-density and sequence genotype data

Author
Ling, Ashley
Hay, El Hamidi
Aggrey, Samuel - University of Georgia
Rekaya, Romdhane - University of Georgia

Submitted to: Genes
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/7/2022
Publication Date: 11/7/2022
Citation: Ling, A.S., Hay, E.A., Aggrey, S., Rekaya, R. 2022. Fuzzy logic as a strategy for combining marker statistics to optimize preselection of high-density and sequence genotype data. Genes. 13(11). Article 2100. https://doi.org/10.3390/genes13112100.
DOI: https://doi.org/10.3390/genes13112100

Interpretive Summary: With the advent of high-throughput genetic analysis technology, a large amount of genomic data is being generated. However, extracting information from genomic data is crucial, and the use of high-throughput technology in selecting animals is still less than optimal. In this simulation study, a new approach for dealing with the high dimensionality of genomic data was explored. The approach combines statistics used in the preselection and prioritization of single nucleotide polymorphism (SNP) markers from panels with large numbers of SNPs (approximately 1.3 million) into a composite “fuzzy” ranking score that attempts to mimic human decision making. The accuracy of genomic predictions for preselected panel sizes of 1-50k that used “fuzzy” scoring ranged from -0.4 to 11.7% higher than existing approaches. This fuzzy scoring approach has the potential to aggregate information from multiple criteria that better reflect SNP-trait associations and biological relevance in a flexible and efficient way to yield higher quality genomic predictions.

Technical Abstract: The high dimensionality of genotype data available for genomic evaluations has motivated the development of strategies to identify subsets of markers capable of increasing the accuracy of predictions compared to the current commercial SNP chips. In this simulation study, an algorithm for combining statistics used in the preselection and prioritization of SNP markers from a high-density panel (1.3 million SNPs) into a composite “fuzzy” ranking score based on a Sugeno-type fuzzy inference system was developed and evaluated for performance in preselection for genomic predictions. FST scores and p-values were evaluated as inputs for the fuzzy inference system. The accuracy of genomic predictions for fuzzy-score-preselected panel sizes of 1-50k ranged from -0.4 to 7.8, 1.0 to 11.7, and 0.9 to 7.7% higher than FST score preselection and -0.3 to 3.8, 0.1 to 1.8, and 0.7 to 1.9% higher than p-value preselection of equivalent-sized panels for genetic architectures of 300, 1k, and 5k causative variants, respectively. Though gains in prediction accuracy using only two inputs to the fuzzy inference system were modest, preselection based on fuzzy scores yielded more accurate predictions than both FST scores and p-values for the majority of evaluated panel sizes under all genetic architectures. Fuzzy inference systems have the potential to aggregate information from multiple criteria that reflect SNP-trait associations and biological relevance in a flexible and efficient way to yield higher quality genomic predictions.
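
Illustrative sketch: The abstract does not give the membership functions, rule base, or input scaling used in the study, so the following Python sketch only illustrates the general idea of a zero-order Sugeno-type system that combines an FST score and a p-value into one ranking score. Every specific choice here is an assumption made for illustration: the linear "low"/"high" membership functions, the -log10(p) scaling with a cap of 10, the minimum operator for rule firing strength, the four rules and their constant outputs, and the names fuzzy_score and preselect.

```python
import math

def high(x):
    """Membership in an assumed 'high' fuzzy set: linear ramp clipped to [0, 1]."""
    return min(1.0, max(0.0, x))

def low(x):
    """Membership in an assumed 'low' fuzzy set: complement of 'high'."""
    return 1.0 - high(x)

def scale_p(p, cap=10.0):
    """Map a p-value to [0, 1] through -log10(p), capped at `cap` (assumed scaling)."""
    return min(-math.log10(max(p, 1e-300)), cap) / cap

# Hypothetical rule base: (FST membership, p-value membership, constant output).
RULES = [
    (high, high, 1.0),  # strong on both statistics
    (high, low, 0.5),   # strong on FST only
    (low, high, 0.5),   # strong on p-value only
    (low, low, 0.0),    # weak on both
]

def fuzzy_score(fst, p):
    """Zero-order Sugeno output: firing-strength-weighted mean of rule constants."""
    evidence = scale_p(p)
    # Firing strength of each rule, using min as the fuzzy AND (an assumption).
    weights = [min(m_fst(fst), m_p(evidence)) for m_fst, m_p, _ in RULES]
    outputs = [out for _, _, out in RULES]
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)

def preselect(snps, k):
    """Return the ids of the k SNPs with the highest fuzzy score.

    `snps` is a list of (snp_id, fst, p_value) tuples.
    """
    ranked = sorted(snps, key=lambda s: fuzzy_score(s[1], s[2]), reverse=True)
    return [snp_id for snp_id, _, _ in ranked[:k]]
```

With this kind of scoring, a SNP that is strong on only one statistic still receives partial credit, which is one way a composite score can rank markers differently from either statistic used alone. Additional inputs would be handled by adding membership functions and rules.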