
Research Project: Improved Agroecosystem Efficiency and Sustainability in a Changing Environment

Location: Sustainable Agricultural Water Systems Research

Title: Prediction of attachment efficiency using machine learning on a comprehensive database and its validation

Author
GOMEZ-FLORES, ALLAN - HANYANG UNIVERSITY
Bradford, Scott
CAI, LI - DONGHUA UNIVERSITY
URÍK, MARTIN - COMENIUS UNIVERSITY IN BRATISLAVA
KIM, HYUNJUNG - HANYANG UNIVERSITY

Submitted to: Water Research
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/20/2022
Publication Date: 11/25/2022
Citation: Gomez-Flores, A., Bradford, S.A., Cai, L., Urík, M., Kim, H. 2022. Prediction of attachment efficiency using machine learning on a comprehensive database and its validation. Water Research. 229. Article 119429. https://doi.org/10.1016/j.watres.2022.119429.
DOI: https://doi.org/10.1016/j.watres.2022.119429

Interpretive Summary: Predicting the fate of colloids, such as bacteria, viruses, and clays, in soils and groundwater is important for protecting water resources from contamination, but colloid attachment to solid surfaces depends on many factors that are difficult to quantify. A database was created from published literature and used to develop a function (a machine learning model) to predict the colloid sticking efficiency from experimental parameters. This function described the literature results well and revealed the relative importance of the various input parameters, but showed some limitations in describing data for surface-modified microplastics in the presence and absence of dissolved organic matter. This approach shows promise for quickly and easily determining the colloid sticking efficiency and will be of interest to scientists, engineers, health officials, and government regulators concerned with assessing the risks of colloids and colloid-associated contaminants.

Technical Abstract: Colloidal particles can attach to surfaces during transport, but attachment is a complex function of particle size, hydrodynamics, solid and water chemistry, and particulate matter. In filtration theory, attachment is quantified by the attachment or sticking efficiency (Alpha). A comprehensive Alpha database (2538 records) was built from experiments in the literature and used to develop a machine learning (ML) model to predict Alpha. The training (r-squared: 0.86) was conducted using two random forests capable of handling missing data. A holdout dataset was used to validate the training (r-squared: 0.98), and variable importance was explored for training and validation. Finally, an additional validation dataset was built from quartz crystal microbalance experiments using surface-modified polystyrene, poly(methyl methacrylate), and polyethylene. These experiments were conducted in the absence and presence of humic acid. Full database regression (r-squared: 0.90) predicted Alpha for the additional validation dataset with an r-squared of only 0.23. Nevertheless, when the original database and the additional validation dataset were combined into a new database, the r-squared values for both training (0.95) and validation (0.70) increased. The developed ML model provides a valuable data-driven tool to predict Alpha across a large database and to evaluate the significance of 22 input variables.
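The r-squared values quoted above compare predicted and observed Alpha on training and holdout records. The following is a minimal sketch of that bookkeeping only, not the authors' code: it assumes the usual coefficient-of-determination definition (1 - SS_res / SS_tot), and the function names, the 20% holdout fraction, and the random seed are hypothetical choices that the abstract does not specify. The random forest model itself is omitted.

```python
import random


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def holdout_split(records, holdout_fraction=0.2, seed=0):
    """Shuffle a copy of the records and set aside a holdout subset.

    holdout_fraction and seed are illustrative assumptions, not values
    taken from the publication.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]


# Split placeholder record indices for a database of 2538 records.
train_ids, holdout_ids = holdout_split(range(2538))

# A predictor that always returns the observed mean scores exactly zero.
baseline_score = r_squared([0.1, 0.2, 0.3], [0.2, 0.2, 0.2])
```

Under this definition, r-squared is 1 for perfect predictions, 0 for a model that does no better than predicting the observed mean, and negative for anything worse, which gives a reference point for reading a value such as 0.23 on data the model was not built from.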