Location: Cool and Cold Water Aquaculture Research
Title: Genotype imputation from low-coverage whole-genome sequencing data in rainbow trout
Authors:
Liu, Sixin
Martin, Kyle - Troutlodge, Inc
Long, Roseanna
Leeds, Timothy - Tim
Vallejo, Roger
Wiens, Gregory - Greg
Palti, Yniv
Submitted to: Annual International Plant & Animal Genome Conference
Publication Type: Abstract Only
Publication Acceptance Date: 11/13/2023
Publication Date: 1/13/2024
Citation: Liu, S., Martin, K.E., Long, R., Leeds, T.D., Vallejo, R.L., Wiens, G.D., Palti, Y. 2024. Genotype imputation from low-coverage whole-genome sequencing data in rainbow trout. Annual International Plant & Animal Genome Conference. PO0433.
Interpretive Summary:
Technical Abstract: With the rapid and significant cost reduction of next-generation sequencing, low-coverage whole-genome sequencing followed by genotype imputation is becoming a cost-effective alternative genotyping method. The objectives of this study were 1) to construct a haplotype reference panel for genotype imputation from low-coverage whole-genome sequencing data in rainbow trout; and 2) to evaluate the concordance rates between imputed genotypes and SNP-array genotypes. To establish the haplotype reference panel, high-coverage whole-genome sequences (average sequence depth of 12x) were obtained for a total of 410 fish representing four spawning date groups: February, May, August and November. The sequence reads were mapped to the Arlee reference genome, and SNPs were called with GATK. After data filtering, 20,434,612 biallelic SNPs were retained. Based on principal component analysis, the 410 fish clustered into four groups consistent with their spawning dates. The reference panel was phased with SHAPEIT5 and was used to impute genotypes from low-coverage sequence data with GLIMPSE2. A total of 90 fish from the Troutlodge November breeding population were sequenced at an average sequence depth of 1.3x, and these fish were also genotyped with the Axiom 57k SNP array. The concordance rate between array-based genotypes and imputed genotypes was 99.1%.
To evaluate the imputation accuracy at lower read coverage, we down-sampled the sequence coverage to 0.5x, 0.2x and 0.1x, and the concordance rates between array-based genotypes and imputed genotypes were 98.7%, 97.8% and 96.7%, respectively. To further evaluate the accuracy of SNP genotype imputation, 109 fish from the breeding program at the National Center for Cool and Cold Water Aquaculture were sequenced and genotyped with the 57k SNP array. After down-sampling the sequence coverage to 0.5x, the concordance rate between array-based genotypes and imputed genotypes was 97.8%. In conclusion, the reference haplotype panel reported in this study can be used to accurately impute genotypes from low-coverage sequencing data in rainbow trout breeding populations.
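The abstract reports concordance rates and down-sampled coverages but does not say how either was computed. The following is a minimal illustrative sketch, not the authors' pipeline. It assumes genotypes are coded as alternate-allele dosages (0, 1, 2) with `None` for a missing call, that missing calls are excluded from the comparison, and that down-sampling keeps a uniform random fraction of reads. The function names `concordance_rate` and `downsample_fraction` are hypothetical.

```python
from typing import Optional, Sequence


def concordance_rate(
    array_gt: Sequence[Optional[int]],
    imputed_gt: Sequence[Optional[int]],
) -> float:
    """Fraction of SNPs where the array and imputed genotypes agree.

    Assumption: genotypes are dosages 0/1/2 and None marks a missing call;
    a pair with a missing call on either side is skipped.
    """
    if len(array_gt) != len(imputed_gt):
        raise ValueError("genotype vectors must have the same length")
    compared = 0
    matched = 0
    for a, b in zip(array_gt, imputed_gt):
        if a is None or b is None:
            continue  # skip pairs with a missing call
        compared += 1
        if a == b:
            matched += 1
    if compared == 0:
        raise ValueError("no SNPs with calls in both data sets")
    return matched / compared


def downsample_fraction(current_depth: float, target_depth: float) -> float:
    """Fraction of reads to keep to go from current_depth to target_depth.

    Assumption: reads are retained uniformly at random, so depth scales
    linearly with the fraction of reads kept.
    """
    if not 0 < target_depth <= current_depth:
        raise ValueError("target depth must be positive and not exceed current depth")
    return target_depth / current_depth


if __name__ == "__main__":
    array = [0, 1, 2, 1, None, 0]    # toy values, not data from the study
    imputed = [0, 1, 2, 2, 1, 0]
    print(concordance_rate(array, imputed))   # 4 of 5 compared calls agree
    print(downsample_fraction(1.3, 0.5))      # share of reads kept for 1.3x -> 0.5x
```

Under these assumptions, going from 1.3x to 0.5x means keeping roughly 38% of the reads. The genotype vectors above are made-up toy values, used only to show the calculation.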