Author
SWARTS, KELLY - Cornell University
LI, HUIHUI - International Maize & Wheat Improvement Center (CIMMYT)
NAVARRO, J. ALBERTO - Cornell University
AN, DONG - China Agricultural University
ROMAY, MARIA CINTA - Cornell University
HEARNE, SARAH - International Maize & Wheat Improvement Center (CIMMYT)
ACHARYA, CHARLOTTE - Cornell University
GLAUBITZ, JEFFREY - Cornell University
MITCHELL, SHARON - Cornell University
ELSHIRE, ROBERT - Agresearch
Buckler, Edward - Ed
Bradbury, Peter
Submitted to: The Plant Genome
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 7/29/2014
Publication Date: 9/26/2014
Citation: Swarts, K., Li, H., Navarro, J., An, D., Romay, M., Hearne, S., Acharya, C., Glaubitz, J.C., Mitchell, S., Elshire, R.J., Buckler IV, E.S., Bradbury, P. 2014. Novel methods to optimize genotypic imputation for low-coverage, next-generation sequence data in crop plants. The Plant Genome. 7(3). DOI: 10.3835/plantgenome2014.05.0023.

Interpretive Summary: Next-generation sequencing of DNA from crop plants provides a low-cost method to produce data on a very large number of nucleotide variants and, as a result, holds great promise for plant geneticists and breeders. To keep per-sample costs low, the resulting data is often low-coverage, and some of the heterozygous loci may not be identified accurately. Swarts et al. describe two computational methods that can be used to overcome these problems. The methods start by identifying the population haplotypes, then use a hidden Markov model to find the most likely genotype of each sample analyzed. Using large maize populations representing collections of diverse inbreds, full-sib families, and landraces, they show that the methods are very accurate and compare favorably to Beagle 4.0, a widely used software package for imputing genotypes.

Technical Abstract: Next-generation sequencing technology such as genotyping-by-sequencing (GBS) has made low-cost, but often low-coverage, whole-genome sequencing widely available. Extensive inbreeding in crop plants provides an untapped, high-quality source of phased haplotypes for imputing missing genotypes. We introduce Full-Sib Family Haplotype Imputation (FSFHap), optimized for full-sib populations, and a generalized method, Fast Inbred Line Library ImputatioN (FILLIN), to rapidly and accurately impute missing genotypes in GBS-type data with ordered markers. FSFHap and FILLIN impute missing genotypes with high accuracy in GBS-genotyped maize (Zea mays L.) inbred lines and breeding populations, while Beagle v. 4 is still preferable for diverse heterozygous populations. FILLIN and FSFHap are implemented in TASSEL 5.0.
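
The hidden Markov model step mentioned in the summary can be illustrated with a toy example. The sketch below is not the FSFHap or FILLIN code from TASSEL; it is a minimal two-state Viterbi decoder, assuming a single chromosome of ordered markers where each call is 0 (matches parent haplotype A), 1 (matches parent haplotype B), or None (missing). The function name `viterbi_impute` and the `switch` and `err` parameters, along with their default values, are hypothetical choices made for illustration only.

```python
import math


def viterbi_impute(obs, switch=0.01, err=0.05):
    """Return the most likely parental state (0 or 1) at every marker.

    obs    -- ordered marker calls: 0, 1, or None for a missing call
    switch -- assumed probability of a haplotype switch between adjacent markers
    err    -- assumed probability that an observed call is wrong
    """
    if not obs:
        return []
    states = (0, 1)
    log_stay, log_switch = math.log(1 - switch), math.log(switch)

    def log_emit(state, call):
        if call is None:
            return 0.0  # a missing call carries no information
        return math.log(1 - err) if call == state else math.log(err)

    # Initialise with equal prior probability for both parental states.
    score = [math.log(0.5) + log_emit(s, obs[0]) for s in states]
    backpointers = []
    for call in obs[1:]:
        new_score, ptr = [], []
        for s in states:
            cands = [score[p] + (log_stay if p == s else log_switch)
                     for p in states]
            best = max(states, key=lambda p: cands[p])
            new_score.append(cands[best] + log_emit(s, call))
            ptr.append(best)
        score = new_score
        backpointers.append(ptr)

    # Trace back from the best final state to recover the full path.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Missing calls are filled from the flanking haplotype state.
print(viterbi_impute([0, None, 0, 1, 1, None, 1]))
```

In this sketch a missing call inherits the state of its flanking markers, and an isolated conflicting call is treated as a genotyping error rather than as two closely spaced recombination events, because under the assumed parameters one error is more probable than two switches.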