Location: Environmental Microbial & Food Safety Laboratory
Title: Estimating phytoplankton concentrations in agricultural irrigation ponds from water quality measurements: a machine learning application
Author
SMITH, JACLYN - ORISE Fellow
HILL, ROBERT - University of Maryland
WOLNY, JENNIFER - Food and Drug Administration (FDA)
STOCKER, MATTHEW - ORISE Fellow
Pachepsky, Yakov
Submitted to: Journal of Phycology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/9/2022
Publication Date: 11/12/2022
Citation: Smith, J., Hill, R., Wolny, J., Stocker, M., Pachepsky, Y.A. 2022. Estimating phytoplankton concentrations in agricultural irrigation ponds from water quality measurements: a machine learning application. Journal of Phycology. 9(11):142.
Interpretive Summary: Phytoplankton, i.e., microscopic algae and cyanobacteria, strongly influence water quality in freshwater sources. Estimating algae concentrations from physicochemical water quality parameters can be beneficial because those parameters are much easier to measure than algae concentrations. However, such estimation has been difficult because of the complexity and diversity of interactions in natural waters. Recently, machine learning, a branch of artificial intelligence, has successfully discovered and simulated complex relationships. We intensively monitored irrigation ponds in Maryland for three years and used the random forest machine learning algorithm to build models that estimate concentrations of major groups of phytoplankton organisms from water quality parameters. The models were more accurate in the interior of the ponds, where the bulk of irrigation water was stored. Affordable and fast sensor measurements were sufficient as inputs for the phytoplankton concentration assessment. Irrigation water managers and consultants can use these results, which indicate an opportunity to evaluate potentially toxic harmful algal blooms in irrigation water sources using modern, fast measurement technologies for survey and monitoring.
Technical Abstract: Phytoplankton community composition has been utilized for water quality assessment purposes in various freshwater sources (e.g., lakes, ponds, reservoirs), but studies are lacking on agricultural irrigation ponds.
Since phytoplankton identification and enumeration are time-consuming and expensive processes, the use of machine learning to predict and forecast phytoplankton populations could be beneficial to water quality management. The objective of this work was to evaluate the performance of the random forest algorithm in estimating phytoplankton community structure from in situ water quality measurements of different complexities at two agricultural irrigation ponds. Spatially intensive sampling was performed between 2017 and 2019, and measurements of three phytoplankton functional groups (green algae, diatoms, and cyanobacteria) and three sets of water quality parameters (physicochemical, organic constituent, and nutrient parameters) were obtained. The random forest algorithm was used to build a model that estimates each phytoplankton group with water quality parameters as inputs. The green algae models performed better than the diatom and cyanobacteria models as measured by root mean square error (RMSE). Spatial model performance results revealed that interior waters tended to have lower RMSE values than nearshore waters. Furthermore, model performance did not change drastically when additional input sets were added, and basic physicochemical parameters, which can be obtained easily and affordably in real time, outperformed organic constituent and nutrient parameters. This study indicates that the random forest algorithm is useful for the prediction of major phytoplankton functional groups in agricultural irrigation ponds and could be beneficial for water quality management applications.
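The abstract compares models by RMSE and contrasts interior with nearshore waters. As a minimal sketch of how such a comparison could be computed, the Python example below groups paired observed and predicted concentrations by sampling zone and reports one RMSE per zone. The function names, the zone labels, and all numeric values are hypothetical and chosen for illustration only; they are not data or code from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rmse_by_zone(records):
    """Group (zone, observed, predicted) records by zone; return RMSE per zone."""
    grouped = {}
    for zone, obs, pred in records:
        obs_list, pred_list = grouped.setdefault(zone, ([], []))
        obs_list.append(obs)
        pred_list.append(pred)
    return {zone: rmse(o, p) for zone, (o, p) in grouped.items()}


# Made-up illustrative values (arbitrary units), NOT measurements from the study.
records = [
    ("interior", 10.0, 11.0),
    ("interior", 12.0, 11.0),
    ("nearshore", 10.0, 13.0),
    ("nearshore", 14.0, 11.0),
]

scores = rmse_by_zone(records)
for zone, value in sorted(scores.items()):
    print(f"{zone}: RMSE = {value:.2f}")
```

A lower RMSE for one zone means the predictions there sit closer to the observations on average, which is the sense in which the abstract describes interior waters as better predicted than nearshore waters.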