Research Project: Integrated Research Approaches for Improving Production Efficiency in Rainbow Trout

Location: Cool and Cold Water Aquaculture Research

Title: Genotype imputation from low-coverage whole-genome sequencing data in rainbow trout

Author
Liu, Sixin
Martin, Kyle - Troutlodge, Inc
Long, Roseanna
Leeds, Timothy - Tim
Vallejo, Roger
Wiens, Gregory - Greg
Palti, Yniv

Submitted to: Annual International Plant & Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 11/13/2023
Publication Date: 1/13/2024
Citation: Liu, S., Martin, K.E., Long, R., Leeds, T.D., Vallejo, R.L., Wiens, G.D., Palti, Y. 2024. Genotype imputation from low-coverage whole-genome sequencing data in rainbow trout. Annual International Plant & Animal Genome Conference. PO0433.

Interpretive Summary:

Technical Abstract: With the rapid and significant cost reduction of next-generation sequencing, low-coverage whole-genome sequencing followed by genotype imputation is becoming a cost-effective alternative genotyping method. The objectives of this study were 1) to construct a haplotype reference panel for genotype imputation from low-coverage whole-genome sequencing data in rainbow trout; and 2) to evaluate the concordance rates between imputed genotypes and SNP-array genotypes. To establish the haplotype reference panel, high-coverage whole-genome sequences (average sequence depth of 12x) were obtained for a total of 410 fish representing four spawning date groups: February, May, August and November. The sequence reads were mapped to the Arlee reference genome, and SNPs were called with the GATK software. After data filtering, 20,434,612 biallelic SNPs were retained. Based on principal component analysis, the 410 fish clustered into four groups consistent with their spawning dates. The reference panel was phased with SHAPEIT5 and was then used to impute genotypes from low-coverage sequence data with GLIMPSE2. A total of 90 fish from the Troutlodge November breeding population were sequenced at an average sequence depth of 1.3x, and these fish were also genotyped with the Axiom 57k SNP array. The concordance rate between array-based genotypes and imputed genotypes was 99.1%. To evaluate the imputation accuracy at lower read coverage, we down-sampled the sequence coverage to 0.5x, 0.2x and 0.1x, and the concordance rates between array-based genotypes and imputed genotypes were 98.7%, 97.8% and 96.7%, respectively. To further evaluate the accuracy of SNP genotype imputation, 109 fish from the breeding program at the National Center for Cool and Cold Water Aquaculture were sequenced and genotyped with the 57k SNP array. After down-sampling the sequence coverage to 0.5x, the concordance rate between array-based genotypes and imputed genotypes was 97.8%. In conclusion, the reference haplotype panel reported in this study can be used to accurately impute genotypes from low-coverage sequencing data in rainbow trout breeding populations.
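
The abstract does not say how the concordance rate was calculated or how the reads were down-sampled, so the following Python sketch is only one plausible reading, not the authors' procedure. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) with None for a missing call, that concordance is the fraction of jointly called genotypes that agree, and that down-sampling keeps a random fraction of reads equal to target depth divided by current depth. The function names concordance_rate and downsample_fraction are hypothetical.

```python
from typing import Optional, Sequence


def concordance_rate(
    array_calls: Sequence[Optional[int]],
    imputed_calls: Sequence[Optional[int]],
) -> float:
    """Fraction of jointly called genotypes that are identical.

    Genotypes are assumed to be alternate-allele counts (0, 1, 2);
    None marks a missing call and is excluded from the comparison.
    """
    if len(array_calls) != len(imputed_calls):
        raise ValueError("genotype vectors must have the same length")

    compared = 0
    matches = 0
    for array_gt, imputed_gt in zip(array_calls, imputed_calls):
        if array_gt is None or imputed_gt is None:
            continue  # skip sites missing in either data set
        compared += 1
        if array_gt == imputed_gt:
            matches += 1

    if compared == 0:
        raise ValueError("no jointly called genotypes to compare")
    return matches / compared


def downsample_fraction(current_depth: float, target_depth: float) -> float:
    """Fraction of reads to keep to go from current_depth to target_depth."""
    if current_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")
    if target_depth > current_depth:
        raise ValueError("target depth exceeds current depth")
    return target_depth / current_depth


if __name__ == "__main__":
    # Toy genotypes, not data from the study.
    array_gt = [0, 1, 2, None, 1]
    imputed_gt = [0, 1, 1, 2, 1]
    print(f"concordance: {concordance_rate(array_gt, imputed_gt):.3f}")

    # Read fractions for the coverage levels named in the abstract.
    for target in (0.5, 0.2, 0.1):
        print(f"1.3x -> {target}x: keep {downsample_fraction(1.3, target):.3f}")
```

Under these assumptions, a per-fish concordance could be averaged across the panel, or all fish-by-SNP comparisons could be pooled into a single rate; the abstract does not indicate which was used.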