Research Project: USDA National Nutrient Databank for Food Composition

Location: Methods and Application of Food Composition Laboratory

Title: Application of machine learning for predicting label nutrients using USDA Global Branded Food Products Database (BFPD)

Author
Ma, Peihua - University of Maryland
Li, An - University of Maryland
Yu, Ning - University of Maryland
Bahadur, Rahul - University of Maryland
Qin, Wang - University of Maryland
Li, Ying - University of Maryland
Ahuja, Jaspreet

Submitted to: Journal of Food Composition and Analysis
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/20/2021
Publication Date: 3/19/2021
Citation: Ma, P., Li, A., Yu, N., Bahadur, R., Qin, W., Li, Y., Ahuja, J.K. 2021. Application of machine learning for predicting label nutrients using USDA Global Branded Food Products Database (BFPD). Journal of Food Composition and Analysis. https://doi.org/10.1016/j.jfca.2021.103857.
DOI: https://doi.org/10.1016/j.jfca.2021.103857

Interpretive Summary: Automatic, accurate, and robust prediction of food attributes using emerging machine learning techniques, including artificial intelligence, may be useful in food and nutrient research. In this paper we evaluate, for the first time, five machine learning models for quantitatively predicting the relationship between food ingredients and nutrients. Using the USDA Global Branded Food Products Database (BFPD), we prepared a machine-readable dataset covering two domains, nutrients and ingredients, and investigated five prominent models: AdaBoost, a Bayesian model, Linear Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and a tailored, improved MLP called MLPcr. The models were used to predict the 13 label nutrients included in BFPD from product ingredients. We report on three varied nutrients in this paper: carbohydrates, protein, and sodium. The neural network models, MLP and MLPcr, achieved the highest accuracy, reaching 0.900 for carbohydrates. This work could help in developing personalized food nutrient prediction software, and it illustrates the potential use of the emerging field of artificial intelligence in food and nutrient research.

Technical Abstract: Automatic, accurate, and robust prediction of food attributes using emerging machine learning techniques, including artificial intelligence, may be useful in food and nutrient research. In this paper we evaluate, for the first time, five machine learning models for quantitatively predicting the relationship between food ingredients and nutrients. Using the USDA Global Branded Food Products Database (BFPD), we prepared a machine-readable dataset covering two domains, nutrients and ingredients. On this dataset we investigated five prominent models: AdaBoost, a Bayesian model, Linear Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and a tailored, improved MLP called MLPcr. The models were used to predict the 13 label nutrients included in BFPD from product ingredients. We report on three varied nutrients in this paper: carbohydrates, protein, and sodium. The neural network models, MLP and MLPcr, achieved the highest accuracy, reaching 0.900 for carbohydrates. A detailed evaluation of the prediction results found that the data distribution and multi-factor complexity have an essential impact on the accuracy of the final prediction. Our research explores the possibility of using neural networks to predict nutrients from the ingredients of foods, as well as the potential application of neural networks to the broader scope of food research.
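
The abstract does not describe how the ingredient data were made machine-readable or how accuracy was computed. As a rough illustration only, the sketch below shows one common way to do both: each label ingredient statement is turned into a multi-hot vector over a vocabulary of ingredient names, and accuracy is taken as the fraction of predicted class labels that match the true labels. The function names, the parsing rules, the sample statements, and the "high"/"low" labels are all assumptions made for this example; they are not taken from the paper or from BFPD.

```python
import re


def parse_ingredients(statement):
    """Split a label ingredient statement into normalized ingredient names."""
    # Assumption: parenthesized sub-ingredients are treated like top-level ones.
    flat = re.sub(r"[()\[\]]", ",", statement.lower())
    parts = (p.strip(" .") for p in flat.split(","))
    return [p for p in parts if p]


def build_vocabulary(statements):
    """Map each distinct ingredient name to a column index, in first-seen order."""
    vocab = {}
    for statement in statements:
        for name in parse_ingredients(statement):
            vocab.setdefault(name, len(vocab))
    return vocab


def encode(statement, vocab):
    """Return a multi-hot vector: 1 where the ingredient appears in the statement."""
    vector = [0] * len(vocab)
    for name in parse_ingredients(statement):
        if name in vocab:
            vector[vocab[name]] = 1
    return vector


def accuracy(predicted, actual):
    """Fraction of predicted labels that equal the true labels."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Made-up records for illustration only; these are not BFPD entries.
statements = [
    "Enriched Flour (Wheat Flour, Niacin), Sugar, Salt.",
    "Water, Sugar, Citric Acid.",
]
vocab = build_vocabulary(statements)
vectors = [encode(s, vocab) for s in statements]
```

Vectors of this kind could serve as input to any of the classifiers named above, with one output per label nutrient. Whether the authors used this encoding or this definition of accuracy is not stated here.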