Research Project: Strategies to Alter Dietary Food Components and Their Effects on Food Choice and Health-Related Outcomes

Location: Food Components and Health Laboratory

Title: Fecal metagenomics to identify biomarkers of food intake in healthy adults: findings from randomized, controlled, nutrition trials

Author
SHINN, LEILA - University of Illinois
MANSHARAMANI, ADITYA - University of Illinois
Baer, David
Novotny, Janet
CHARRON, CRAIG - Retired ARS Employee
KAHN, NAIMAN - University of Illinois
ZHU, RUOQING - University of Illinois
HOLSCHER, HANNAH - University of Illinois

Submitted to: Journal of Nutrition
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/3/2023
Publication Date: 11/8/2023
Citation: Shinn, L.M., Mansharamani, A., Baer, D.J., Novotny, J., Charron, C.S., Kahn, N.A., Zhu, R., Holscher, H.D. 2024. Fecal metagenomics to identify biomarkers of food intake in healthy adults: findings from randomized, controlled, nutrition trials. Journal of Nutrition. 154:271-283.
DOI: https://doi.org/10.1016/j.tjnut.2023.11.001

Interpretive Summary: Self-reported measures of food intake and compliance are frequently used in nutrition studies, but their reliability and validity are limited. Objective biomarkers of food intake therefore need to be discovered, developed, and put to use. Metagenomic analysis not only reveals which bacterial microorganisms are present in a given sample but also provides data on their encoded functions, which offers additional insight into the functional capacity of the microbiome. Many metagenomic biomarker discovery studies have been specific to disease. Another promising application is to complement self-reported measures of food intake and compliance with fecal genes and their associated pathways as objective biomarkers of food intake. The purpose of the present investigation was to use a computationally intensive, multivariate, machine-learning approach to identify fecal genes and genomes that accurately predict food intake. This project was a secondary analysis of data generated from fecal samples collected before and after 5 dietary interventions that manipulated intake of specific foods: almonds, avocados, broccoli, walnuts, and whole grains. This effort, which used random forest models to identify food intake biomarkers, revealed high predictive accuracy for almond, broccoli, and walnut intake, both individually (compared with respective controls) and in a mixed-food model (almond versus broccoli versus walnut). Further, we identified differentially abundant features across all three food groups. These findings reveal the promise of metagenomics in establishing fecal bacterial genes as biomarkers of food intake that objectively complement self-reported food measures and study compliance.

Technical Abstract: Undigested components of the human diet affect the composition and function of the microorganisms present in the gastrointestinal tract. Techniques such as metagenomic analysis allow researchers to study the functional capacity of these microorganisms, which makes metagenomic data a candidate source of objective biomarkers of food intake. As a continuation of our previous work using 16S and metabolomic datasets, we aimed to use a computationally intensive, multivariate, machine-learning approach to identify fecal KEGG (Kyoto Encyclopedia of Genes and Genomes) Orthology (KO) categories as biomarkers that accurately classify food intake. Data were aggregated from 5 controlled feeding studies that examined the individual impact of almonds, avocados, broccoli, walnuts, barley, and oats on the adult gastrointestinal microbiota. Deoxyribonucleic acid from preintervention and postintervention fecal samples underwent shotgun genomic sequencing. After preprocessing, sequences were aligned with Double Index AlignMent Of Next-generation sequencing Data (DIAMOND) v2.0.11.149 and functionally annotated with MEtaGenome ANalyzer (MEGAN) v6.12.2. After count normalization, differential abundance analysis was conducted on the log of the fold change ratio of each resulting KO between preintervention and postintervention, comparing each treatment group against its corresponding control. Differentially abundant KOs were used to train machine-learning models examining potential biomarkers in both single-food and multi-food models. We identified differentially abundant KOs in the almond (n = 54), broccoli (n = 2474), and walnut (n = 732) groups (q < 0.20). Random forest models trained on these KOs classified food intake into each food group’s treatment and control arms with accuracies of 80%, 87%, and 86% for the almond, broccoli, and walnut groups, respectively. The mixed-food random forest achieved 81% accuracy.
Our findings reveal promise in utilizing fecal metagenomics to objectively complement self-reported measures of food intake. Future research on various foods and dietary patterns will expand these exploratory analyses for eventual use in feeding study compliance and clinical settings.
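
The log fold change comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the abstract does not state the log base, how zero counts were handled, or the exact form of the treatment-versus-control contrast, so the base-2 logarithm, the pseudocount, the difference of arm means, the function names, and the example counts below are all assumptions made for illustration.

```python
import math


def log_fold_change(pre, post, pseudocount=1.0):
    """Within-subject log2 fold change of one KO's normalized count.

    The pseudocount (an assumption, not from the abstract) avoids
    taking the log of zero when a KO is absent from a sample.
    """
    return math.log2((post + pseudocount) / (pre + pseudocount))


def mean(values):
    return sum(values) / len(values)


def treatment_vs_control_lfc(treatment_pairs, control_pairs, pseudocount=1.0):
    """Mean log2 fold change in the treatment arm minus that in the control arm.

    Each argument is a list of (pre, post) normalized counts, one pair
    per subject, for a single KO.
    """
    treatment_mean = mean(
        [log_fold_change(pre, post, pseudocount) for pre, post in treatment_pairs]
    )
    control_mean = mean(
        [log_fold_change(pre, post, pseudocount) for pre, post in control_pairs]
    )
    return treatment_mean - control_mean


# Hypothetical normalized counts for one KO (invented for illustration only).
treatment = [(3.0, 7.0), (1.0, 3.0)]  # counts rise after the intervention
control = [(3.0, 3.0), (1.0, 1.0)]    # counts unchanged in the control arm

effect = treatment_vs_control_lfc(treatment, control)
print(f"treatment-vs-control log2 fold change: {effect:.2f}")
```

In a workflow of this kind, a statistic like `effect` would be computed for every KO, tested for significance with multiple-comparison correction (the abstract reports a q < 0.20 threshold), and the surviving KOs passed as features to a classifier such as a random forest.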