Research Project: Database Tools for Managing and Analyzing Big Data Sets to Enhance Small Grains Breeding

Location: Plant, Soil and Nutrition Research

Title: Translating insights from the seed metabolome into improved prediction for healthful compounds in oat (Avena sativa L.)

Author
CAMPBELL, MALACHY - Cornell University
HU, HAIXIAO - Cornell University
YEATS, TREVOR - Cornell University
CAFFE-TREML, MELANIE - South Dakota State University
GUTIERREZ, LUCIA - University of Wisconsin
SMITH, KEVIN - University of Minnesota
SORRELLS, MARK - Cornell University
GORE, MICHAEL - Cornell University
Jannink, Jean-Luc

Submitted to: bioRxiv
Publication Type: Other
Publication Acceptance Date: 7/20/2020
Publication Date: 7/2/2020
Citation: Campbell, M.T., Hu, H., Yeats, T.H., Caffe-Treml, M., Gutierrez, L., Smith, K.P., Sorrells, M.E., Gore, M., Jannink, J. 2020. Translating insights from the seed metabolome into improved prediction for healthful compounds in oat (Avena sativa L.). bioRxiv. https://doi.org/10.1101/2020.07.06.190512.
DOI: https://doi.org/10.1101/2020.07.06.190512

Interpretive Summary: Oat seed is a rich resource of beneficial oils, fiber, protein, and antioxidants, and is considered a healthful food for humans. Little is known, however, regarding genes that control these compounds in oat seed. Furthermore, oat breeders need to be able to predict oat seed composition using DNA marker data to make rapid progress breeding more nutritious oats. “Metabolomics” is a technique that quantifies many compounds (metabolites) in a sample. We used this technique to characterize variation in the mature seed of 367 diverse oat varieties. We leveraged this information to improve prediction for seed quality traits from DNA markers. Metabolomics provides measures of the quantity of hundreds of compounds. We identified factors that cause correlation among compounds. Many factors (21%) were enriched for compounds associated with oil metabolism. Factors that affected many compounds also were generally influenced by many genes. We found DNA markers that were significantly associated with 23% of the factors. We fed this information into statistical prediction models that were able to predict seed lipid and protein traits in two independent studies. Predictions for eight of the 12 traits were significantly improved compared to standard prediction methods. This study provides new insights into variation in oat seed composition and provides genomic resources for breeders to improve selection for health-promoting seed quality traits.

Technical Abstract: Oat (Avena sativa L.) seed is a rich resource of beneficial lipids, soluble fiber, protein, and antioxidants, and is considered a healthful food for humans. Despite these characteristics, little is known regarding the genetic controllers of variation for these compounds in oat seed. We sought to characterize natural variation in the mature seed metabolome using untargeted metabolomics on 367 diverse lines and leverage this information to improve prediction for seed quality traits. We used a latent factor approach to define unobserved variables that may drive covariance among metabolites. One hundred latent factors were identified, of which 21% were enriched for compounds associated with lipid metabolism. Through a combination of whole-genome regression and association mapping, we show that latent factors that generate covariance for many metabolites tend to have a complex genetic architecture. Nonetheless, we recovered significant associations for 23% of the latent factors. These associations were used to inform a multi-kernel genomic prediction model, which was used to predict seed lipid and protein traits in two independent studies. When this model was informed using associations from lipid-enriched factors, predictions for eight of the 12 traits were significantly improved compared to genomic best linear unbiased prediction. This study provides new insights into variation in the oat seed metabolome and provides genomic resources for breeders to improve selection for health-promoting seed quality traits. More broadly, we outline an approach to distill high-dimensional ‘omics’ data to a set of biologically meaningful variables and translate inferences on these data into improved breeding decisions.
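
Illustrative sketch (not the authors' code): the abstract does not give implementation details of the multi-kernel genomic prediction model, so the following is only a minimal example of the general idea, under stated assumptions. It builds one genomic relationship matrix from a hypothetical set of "associated" markers and a second from the remaining markers, combines them with fixed weights, and predicts held-out lines with a BLUP-style kernel ridge solve. The simulated data, the marker partition, the kernel weights, and the function names `grm` and `multi_kernel_blup` are all assumptions made for illustration; in practice the variance components would be estimated (for example by REML or a Bayesian model) rather than fixed.

```python
import numpy as np

def grm(M):
    """Genomic relationship matrix from a lines-by-markers 0/1/2 matrix
    (VanRaden-style scaling). Assumes at least one polymorphic marker."""
    p = M.mean(axis=0) / 2.0                 # allele frequency per marker
    Z = M - 2.0 * p                          # center each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def multi_kernel_blup(y_train, kernels, weights, train_idx, test_idx, noise=1.0):
    """Predict test lines from a weighted sum of kernels.
    `weights` and `noise` stand in for variance components, which are
    fixed here as an assumption instead of being estimated."""
    K = sum(w * Km for w, Km in zip(weights, kernels))
    K_tt = K[np.ix_(train_idx, train_idx)]   # train-by-train block
    K_st = K[np.ix_(test_idx, train_idx)]    # test-by-train block
    mu = y_train.mean()
    alpha = np.linalg.solve(K_tt + noise * np.eye(len(train_idx)), y_train - mu)
    return mu + K_st @ alpha

# Simulated data (hypothetical): 120 lines, 300 markers, and the first
# 30 markers treated as the "associated" set that carries the signal.
rng = np.random.default_rng(0)
n_lines, n_markers, n_assoc = 120, 300, 30
M = rng.binomial(2, 0.3, size=(n_lines, n_markers)).astype(float)
beta = rng.normal(0.0, 1.0, size=n_assoc)
y = M[:, :n_assoc] @ beta + rng.normal(0.0, 1.0, size=n_lines)

K_assoc = grm(M[:, :n_assoc])                # kernel from associated markers
K_rest = grm(M[:, n_assoc:])                 # kernel from all other markers

train_idx = np.arange(0, 90)
test_idx = np.arange(90, n_lines)
preds = multi_kernel_blup(y[train_idx], [K_assoc, K_rest], [0.8, 0.2],
                          train_idx, test_idx)
print("predictive correlation:", np.corrcoef(preds, y[test_idx])[0, 1])
```

Setting the weights to put all mass on a single kernel built from every marker recovers an ordinary single-kernel GBLUP-style predictor, which is the kind of baseline the abstract compares against.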