Location: Environmental Microbial & Food Safety Laboratory
Title: Machine learning to predict foodborne salmonellosis outbreaks based on genome characteristics and meteorological trends
Author:
KARANTH, SARADDHA - University of Maryland
Patel, Jitu
SHIRMOHAMMADI, ADEL - University of Maryland
PRADHAN, ABANI - University of Maryland
Submitted to: Current Research in Food Science
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 5/28/2023
Publication Date: 6/2/2023
Citation: Karanth, S., Patel, J.R., Shirmohammadi, A., Pradhan, A. 2023. Machine learning to predict foodborne salmonellosis outbreaks based on genome characteristics and meteorological trends. Current Research in Food Science.
DOI: https://doi.org/10.1016/j.crfs.2023.100525
Interpretive Summary: Studies have shown a correlation between climatic conditions and the severity of Salmonella infections associated with consumption of contaminated food. However, these studies do not account for the genetic heterogeneity and variability among Salmonella enterica serovars. The effect of differential gene expression and meteorological factors on salmonellosis outbreak severity (typified by case numbers) was analyzed using a combination of machine learning and count modeling methods. An Elastic Net regularization model identified 53 significant gene features. The final multi-variable Poisson regression model identified 127 significant predictor terms: 45 gene-only predictors, three meteorological predictors (average temperature, average precipitation, and snow cover), and 79 gene-meteorological interaction terms. The significant genes spanned functions including cellular signaling and transport, virulence, metabolism, and stress response. Ambient temperature and precipitation also played a role, both individually and in combination with significant genes, in predicting the severity of Salmonella infections. These results can help regulators and stakeholders predict potential cases of Salmonella infection during an outbreak and evaluate the risk to human health.
Technical Abstract: Several studies have shown a correlation between outbreaks of Salmonella enterica and climatological and meteorological trends, especially related to temperature and precipitation.
Additionally, current outbreak-related studies are performed on data pooled by Salmonella species without considering its intra-species and genetic heterogeneity. In this study, we analyzed the effect of differential gene expression and a suite of meteorological factors on salmonellosis outbreak severity (typified by case numbers) using a combination of machine learning and count modeling methods. Elastic Net regularization was used to identify significant genes from a Salmonella pan-genome, and a multi-variable Poisson regression was developed to fit the individual and mixed effects data. The best-fit Elastic Net model (α = 0.5000; λ = 2.18399) identified 53 significant gene features. The final multi-variable Poisson regression model (χ² = 5748.22; pseudo R² = 0.6688; probability > χ² = 0.0000) identified 127 significant predictor terms (p < 0.10): 45 gene-only predictors, three meteorological predictors (average temperature, average precipitation, and snow cover), and 79 gene-meteorological interaction terms. The significant genes spanned functions including cellular signaling and transport, virulence, metabolism, and stress response, and included gene variables not considered significant by the baseline model. The results of this study indicate the need to co-evaluate novel data with environmental data for developing a more holistic model to predict disease outcome severity, which could extend to re-evaluating the risk to human health.
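
As an illustration of the count-modeling step described above, the following minimal Python sketch fits a Poisson regression (log link) with one gene indicator, one meteorological covariate, and their interaction, then reports a likelihood-ratio χ² and a pseudo R². It is not the authors' code or data: the simulated dataset, the variable names (`gene`, `temp`), the coefficient values, and the choice of McFadden's pseudo R² are all assumptions made for illustration, and the Elastic Net selection step is assumed to have already picked the gene.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def poisson_loglik(X, y, beta):
    """Full Poisson log-likelihood, including the log(y!) constant."""
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - math.exp(eta) - math.lgamma(yi + 1)
    return ll

def fit_poisson(X, y, iters=50, tol=1e-10):
    """Newton-Raphson fit; column 0 of X is assumed to be the intercept."""
    p = len(X[0])
    beta = [0.0] * p
    beta[0] = math.log(sum(y) / len(y))  # start at the null-model estimate
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = math.exp(sum(b * v for b, v in zip(beta, xi)))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += mu * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

def sample_poisson(rng, mu):
    """Knuth's multiplication method; adequate for small means."""
    limit, k, p = math.exp(-mu), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

# Simulated outbreaks (hypothetical): gene presence, standardized temperature.
rng = random.Random(0)
X, y = [], []
for _ in range(300):
    gene = 1.0 if rng.random() < 0.5 else 0.0
    temp = rng.uniform(-1.0, 1.0)
    eta = 1.0 + 0.5 * gene + 0.3 * temp + 0.4 * gene * temp
    X.append([1.0, gene, temp, gene * temp])  # last column is the interaction
    y.append(sample_poisson(rng, math.exp(eta)))

beta = fit_poisson(X, y)
ll_model = poisson_loglik(X, y, beta)
# Intercept-only model: the maximum-likelihood intercept is log(mean count).
ll_null = poisson_loglik([[1.0]] * len(y), y, [math.log(sum(y) / len(y))])
lr_chi2 = 2.0 * (ll_model - ll_null)   # likelihood-ratio chi-square
pseudo_r2 = 1.0 - ll_model / ll_null   # McFadden's pseudo R-squared (assumed)

print("coefficients:", [round(b, 3) for b in beta])
print("LR chi2:", round(lr_chi2, 2), "pseudo R2:", round(pseudo_r2, 4))
```

In this sketch the fourth coefficient plays the role of a gene-meteorological interaction term: it measures how the effect of temperature on expected case counts differs when the gene is present.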