Location: Plant, Soil and Nutrition Research
Title: Using public databases for genomic prediction of tropical maize lines
Author:
MORAIS, PEDRO - Universidade Federal De Vicosa
AKDEMIR, DENIZ - Michigan State University
ROGERIO BRAATZ DE AN, LUCIANO - Universidade Federal De Vicosa
Jannink, Jean-Luc
FRITSCHE-NETO, ROBERTO - Universidade De Sao Paulo
BOREM, ALUIZIO - Universidade Federal De Vicosa
ALVEZ, FILIPE - Universidade De Sao Paulo
LYRA, DANILO - Universidade De Sao Paulo
GRANATO, ITALO - Universidade De Sao Paulo
Submitted to: Plant Breeding
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/4/2020
Publication Date: 8/9/2020
Citation: Morais, P., Akdemir, D., Rogerio Braatz De Andrade, L., Jannink, J., Fritsche-Neto, R., Borem, A., Alvez, F.C., Lyra, D.H., Granato, I.S. 2020. Using public databases for genomic prediction of tropical maize lines. Plant Breeding. 139(4):697-707. https://doi.org/10.1111/pbr.12827.
DOI: https://doi.org/10.1111/pbr.12827

Interpretive Summary: Public databases contain a wealth of openly available genetic and evaluation data. We tested how useful these data are for predicting the genetic values of tropical maize inbred lines for plant and ear height. We identified how population structure, the use of optimized training sets (OTSs), and the amount of information drawn from public databases affected prediction accuracy. In total, 29 training sets (TSs) were defined from diversity panels of the University of São Paulo and the USDA North Central Regional Plant Introduction Station. These TSs were divided into four scenarios with different configurations. We showed that public datasets can be used as a primary TS and that population structure can modify the predictive abilities of genomic selection (GS). In the four scenarios proposed, very large or very small sets did not provide predictive abilities over 0.53 for GS. However, OTSs composed of 250 individuals were sufficient to achieve predictive abilities over this limit. These results provide a rationale for the continued funding of public databasing efforts.

Technical Abstract: The aims of this paper were (a) to test the usefulness of genomic and phenotypic information from public (open-access) databases for predicting the genetic values of tropical maize inbred lines for plant and ear height, and (b) to identify how population structure, the use of optimized training sets (OTSs), and the amount of information originating from public databases affect predictive ability. To this end, 29 training sets (TSs) were defined from three diversity panels: the University of São Paulo panel (USP), used as the validation set (VS), and the ASSO and USDA North Central Regional Plant Introduction Station (NCRPIS) panels, the external public panels used as predictors. The TSs were divided into four scenarios with different TS configurations. We showed that public datasets can be used as a primary TS and that population structure can modify the predictive abilities of genomic selection (GS). In the four scenarios proposed, very large or very small sets did not provide predictive abilities over 0.53 for GS. However, OTSs composed of 250 individuals were sufficient to achieve predictive abilities over this limit.
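
The design summarized above, training on an external panel and validating on a separate panel, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic, the model is plain ridge regression on marker genotypes (a common stand-in for genomic prediction models, not necessarily the one the authors used), and predictive ability is taken to be the Pearson correlation between predicted and observed values in the validation set. The function names `ridge_marker_effects` and `predictive_ability` are hypothetical, and the training-set size of 250 only echoes the OTS size mentioned in the abstract.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Estimate marker effects by ridge regression on centered genotypes.

    Solves (Xc'Xc + lam * I) b = Xc'(y - mean(y)).
    Returns the marker means, the phenotype mean, and the effects b.
    """
    marker_means = X.mean(axis=0)
    Xc = X - marker_means
    y_mean = y.mean()
    n_markers = X.shape[1]
    lhs = Xc.T @ Xc + lam * np.eye(n_markers)
    rhs = Xc.T @ (y - y_mean)
    effects = np.linalg.solve(lhs, rhs)
    return marker_means, y_mean, effects


def predictive_ability(X_train, y_train, X_val, y_val, lam=1.0):
    """Train on one panel, predict another, and return the Pearson
    correlation between predicted and observed validation phenotypes
    (assumed definition of predictive ability)."""
    marker_means, y_mean, effects = ridge_marker_effects(X_train, y_train, lam)
    y_hat = y_mean + (X_val - marker_means) @ effects
    return float(np.corrcoef(y_hat, y_val)[0, 1])


# Synthetic stand-ins for an external training panel and a validation panel.
rng = np.random.default_rng(0)
n_train, n_val, n_markers = 250, 60, 200
true_effects = rng.normal(0.0, 0.3, n_markers)

# Genotypes coded 0/1/2 (hypothetical coding for this sketch).
X_train = rng.integers(0, 3, size=(n_train, n_markers)).astype(float)
X_val = rng.integers(0, 3, size=(n_val, n_markers)).astype(float)

# Phenotype = genetic value + random noise.
y_train = X_train @ true_effects + rng.normal(0.0, 1.0, n_train)
y_val = X_val @ true_effects + rng.normal(0.0, 1.0, n_val)

r = predictive_ability(X_train, y_train, X_val, y_val, lam=10.0)
print(f"predictive ability (Pearson r): {r:.2f}")
```

In this toy setting both panels are drawn from the same distribution, so the sketch does not reproduce the population-structure effects the study reports; it only shows how a predictive ability value such as 0.53 would be computed once a training set and a validation set are fixed.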