Location: Produce Safety and Microbiology Research
Title: Machine learning to attribute the source of Campylobacter infections in the United States: A retrospective analysis of national surveillance data
Author
PASCOE, BEN - Oxford University
FUTCHER, GEORGINA - University Of Bath
PENSAR, JOHAN - University Of Oslo
BAYLISS, SION - University Of Bristol
MOURKAS, EVANGELOS - Oxford University
CALLAND, JESSICA - University Of Oslo
HITCHINGS, MATTHEW - Swansea University
SIMMONS, MUSTAFA - Food Safety Inspection Service (FSIS)
JOSEPH, LAVIN - Centers For Disease Control And Prevention (CDC) - United States
LANE, CHARLOTTE - Centers For Disease Control And Prevention (CDC) - United States
GREENLEE, TIFFANY - Food And Drug Administration (FDA)
ARNING, NICOLAS - Oxford University
WILSON, DANIEL - Oxford University
CORANDER, JUKKA - University Of Oslo
Parker, Craig
COOPER, KERRY - University Of Arizona
ROSE, ERICA - Centers For Disease Control And Prevention (CDC) - United States
WILLIAMS, MICHAEL - Food Safety Inspection Service (FSIS)
GOLDEN, NEAL - Food Safety Inspection Service (FSIS)
HIETT, KELLI - Food And Drug Administration (FDA)
BRUCE, BEAU - Centers For Disease Control And Prevention (CDC) - United States
EVANS, PETER - Food Safety Inspection Service (FSIS)
SHEPPARD, SAMUEL - Oxford University
Submitted to: Journal of Infection
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 8/30/2024
Publication Date: 9/7/2024
Citation: Pascoe, B., Futcher, G., Pensar, J., Bayliss, S.C., Mourkas, E., Calland, J.K., Hitchings, M.D., Simmons, M., Joseph, L.A., Lane, C.G., Greenlee, T., Arning, N., Wilson, D.J., Corander, J., Parker, C.T., Cooper, K.K., Rose, E., Williams, M.S., Golden, N.J., Hiett, K., Bruce, B.B., Evans, P.S., Sheppard, S.K. 2024. Machine learning to attribute the source of Campylobacter infections in the United States: A retrospective analysis of national surveillance data. Journal of Infection. 89(5). Article 106265. https://doi.org/10.1016/j.jinf.2024.106265.
Interpretive Summary: Advanced bioinformatics methods, including machine learning and probabilistic models, were applied to large genome datasets of infectious pathogens to attribute the source of human infections and estimate the relative importance of different disease reservoirs. In this study, we used the two most common Campylobacter species in human gastrointestinal infection as model organisms to test the use of machine learning methods for probabilistic assignment of genome-sequenced cases of campylobacteriosis in the United States between 2009 and 2019 to possible source reservoirs. These enteric bacteria are ubiquitous in the gut of wild and domestic birds and agricultural mammals, and they commonly infect humans via consumption of contaminated food. Rising incidence and antimicrobial resistance (AMR) are major concerns, and there is an urgent need to quantify the main routes to human infection. Probabilistic attribution identified poultry as the primary infection source of human clinical isolates in the U.S. Fluoroquinolone- and aminoglycoside-resistant isolates drove an increase in multidrug-resistant isolates identified in human infection cases that could be attributed to chicken sources.
National-scale surveillance and quantification of the relative contribution of infection reservoirs can guide policy. Our study suggests that the greatest reductions in human campylobacteriosis in the US will come from interventions that focus on poultry, which may also reduce the spread of AMR strains.
Technical Abstract:
Background: The construction of large genome datasets of infectious pathogens in combination with advanced bioinformatics methods has the potential to inform public health risk and targeted intervention strategies. In this study, we use the two most common Campylobacter species in human gastrointestinal infection as model organisms to test the use of machine learning methods for probabilistic assignment of genome-sequenced cases of campylobacteriosis in the United States between 2009 and 2019 to possible source reservoirs. These enteric bacteria are ubiquitous in the gut of wild and domestic birds and agricultural mammals, and they commonly infect humans via consumption of contaminated food. Rising incidence and antimicrobial resistance (AMR) are major concerns, and there is an urgent need to quantify the main routes to human infection.
Methods: As part of routine US national surveillance (2009 through 2019), 8,889 Campylobacter isolate genomes were sequenced from human infections and 15,924 from possible sources. Targeting genetic variation associated with adaptation to the most recent host, we used machine learning and probabilistic models to attribute the source of human infections and estimate the relative importance of different disease reservoirs.
Findings: Probabilistic attribution identified poultry as the primary infection source of human clinical isolates, responsible for an estimated 72% of cases. Most of the remaining clinical isolates were attributed to cattle (25%), with only a small contribution from wild bird (2%) and pork sources (2%). Fluoroquinolone- and aminoglycoside-resistant isolates, and in particular rising fluoroquinolone resistance in isolates that infect humans, drove an increase in multidrug-resistant isolates identified in human infection cases that could be attributed to chicken sources.
Interpretation: National-scale surveillance and quantification of the relative contribution of infection reservoirs can guide policy. Our study suggests that the greatest reductions in human campylobacteriosis in the US will come from interventions that focus on poultry, which may also reduce the spread of AMR strains.
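To make the idea of probabilistic source attribution in the Methods concrete, the sketch below shows one very simple way it could work. It is not the model used in the study: it assumes a naive Bayes classifier over presence/absence of a few hypothetical host-associated genetic markers, a uniform prior over sources, and invented toy data. The function names (`fit_marker_frequencies`, `attribute`, `reservoir_shares`) are illustrative only.

```python
import math
from collections import Counter, defaultdict

def fit_marker_frequencies(labelled, n_markers):
    """Estimate P(marker present | source) with Laplace smoothing."""
    totals = Counter()
    present = defaultdict(lambda: [0] * n_markers)
    for source, markers in labelled:
        totals[source] += 1
        for i, m in enumerate(markers):
            present[source][i] += m
    return {
        s: [(present[s][i] + 1) / (totals[s] + 2) for i in range(n_markers)]
        for s in totals
    }

def attribute(markers, freqs):
    """Posterior P(source | markers), assuming a uniform prior over sources."""
    log_lik = {
        s: sum(math.log(p[i] if m else 1 - p[i]) for i, m in enumerate(markers))
        for s, p in freqs.items()
    }
    top = max(log_lik.values())  # subtract the max for numerical stability
    weights = {s: math.exp(v - top) for s, v in log_lik.items()}
    z = sum(weights.values())
    return {s: w / z for s, w in weights.items()}

def reservoir_shares(clinical, freqs):
    """Average per-isolate posteriors to get reservoir-level shares."""
    shares = Counter()
    for markers in clinical:
        for s, p in attribute(markers, freqs).items():
            shares[s] += p / len(clinical)
    return dict(shares)

# Toy, invented data: (source label, presence/absence of three markers).
labelled = [
    ("chicken", (1, 1, 0)),
    ("chicken", (1, 0, 0)),
    ("cattle", (0, 1, 1)),
    ("cattle", (0, 0, 1)),
]
freqs = fit_marker_frequencies(labelled, n_markers=3)

# Hypothetical clinical isolates with unknown source.
clinical = [(1, 1, 0), (1, 0, 0), (0, 0, 1)]
shares = reservoir_shares(clinical, freqs)
print(shares)
```

Each clinical isolate receives a probability for every candidate source rather than a single hard label, and averaging those probabilities gives the kind of reservoir-level percentage reported in the Findings. A real analysis would use far more markers, many more isolates, and a validated model.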