
Research Project: Preventing the Development of Childhood Obesity

Location: Children's Nutrition Research Center

Title: Egocentric image captioning for privacy-preserved passive dietary intake monitoring

Authors:
Qiu, Jianing - Imperial College
Lo, Frank - Imperial College
Gu, Xiao - Imperial College
Jobarteh, Modou - Imperial College
Jia, Wenyan - University of Pittsburgh
Baranowski, Tom - Children's Nutrition Research Center (CNRC)
Steiner-Asiedu, Matilda - University of Ghana
Anderson, Alex - University of Georgia
McCrory, Megan - Boston University
Sazonov, Edward - University of Alabama
Sun, Mingui - University of Pittsburgh
Frost, Gary - Imperial College
Lo, Benny - Imperial College

Submitted to: IEEE Transactions on Cybernetics
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 1/27/2023
Publication Date: 3/6/2023
Citation: Qiu, J., Lo, F.P., Gu, X., Jobarteh, M.L., Jia, W., Baranowski, T., Steiner-Asiedu, M., Anderson, A.K., McCrory, M.A., Sazonov, E., Sun, M., Frost, G., Lo, B. 2023. Egocentric image captioning for privacy-preserved passive dietary intake monitoring. IEEE Transactions on Cybernetics. https://doi.org/10.1109/TCYB.2023.3243999.
DOI: https://doi.org/10.1109/TCYB.2023.3243999

Interpretive Summary: Self-report of dietary intake is highly inaccurate, so a more accurate method is needed. Camera-based passive dietary intake monitoring can continuously capture a participant's eating episodes, but no existing method both provides a comprehensive context (e.g., is the participant sharing food with others, what food the participant is eating, and how much food is left in the bowl) and protects privacy. In this article, a privacy-preserved secure solution (i.e., egocentric image captioning) for dietary assessment with passive monitoring is presented that unifies food recognition, volume estimation, and scene understanding. By converting images into rich text descriptions, nutritionists can assess individual dietary intake based on the captions instead of the original images, reducing the risk of privacy leakage from images. An egocentric dietary image captioning dataset has been built, consisting of real-life images captured by head-worn and chest-worn cameras in field studies in Ghana. Comprehensive experiments have been conducted to evaluate the effectiveness of the proposed architecture and to justify its design.

Technical Abstract: Camera-based passive dietary intake monitoring can continuously capture the eating episodes of a subject, recording rich visual information such as the type and volume of food being consumed, as well as the subject's eating behaviors. However, no current method incorporates these visual clues to provide a comprehensive context of dietary intake from passive recording (e.g., is the subject sharing food with others, what food the subject is eating, and how much food is left in the bowl). Moreover, privacy is a major concern when egocentric wearable cameras are used for capture. In this article, we propose a privacy-preserved secure solution (i.e., egocentric image captioning) for dietary assessment with passive monitoring, which unifies food recognition, volume estimation, and scene understanding. By converting images into rich text descriptions, nutritionists can assess individual dietary intake based on the captions instead of the original images, reducing the risk of privacy leakage from images. To this end, an egocentric dietary image captioning dataset has been built, consisting of in-the-wild images captured by head-worn and chest-worn cameras in field studies in Ghana. A novel transformer-based architecture is designed to caption egocentric dietary images. Comprehensive experiments have been conducted to evaluate the effectiveness of the proposed architecture for egocentric dietary image captioning and to justify its design. To the best of our knowledge, this is the first work that applies image captioning to dietary intake assessment in real-life settings.
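The privacy-preserving design described above can be sketched as a simple pipeline: raw egocentric frames are converted to text captions, and only the captions are retained for the nutritionist, never the images. The sketch below is illustrative only and is not the authors' code; all names (`CaptionRecord`, `monitor`, `dummy_captioner`) are hypothetical, and the stand-in captioner replaces the paper's transformer-based model.

```python
# Illustrative sketch of privacy-preserved passive monitoring: each image is
# captioned and then discarded, so only text descriptions leave the device.
# All names here are hypothetical; they are not from the published work.
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class CaptionRecord:
    timestamp: float
    caption: str  # rich text description of the eating episode


def monitor(frames: Iterable[Tuple[float, bytes]],
            captioner: Callable[[bytes], str]) -> List[CaptionRecord]:
    """Convert each (timestamp, image) pair to a caption; raw pixels are dropped."""
    records = []
    for ts, image in frames:
        records.append(CaptionRecord(ts, captioner(image)))
        del image  # the image itself is never stored or transmitted
    return records


# Stand-in captioner; in the paper this role is played by a transformer model.
def dummy_captioner(image: bytes) -> str:
    return "the subject is eating rice from a shared bowl; about half remains"


records = monitor([(0.0, b"<jpeg bytes>")], dummy_captioner)
```

A nutritionist could then assess intake from `records` alone, which is the core of the paper's privacy argument: the captions carry the dietary context (food type, amount remaining, sharing) without exposing identifiable imagery.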