Publication : USDA ARS

Research Project: Improving Pre-harvest Produce Safety through Reduction of Pathogen Levels in Agricultural Environments and Development and Validation of Farm-Scale Microbial Quality Model for Irrigation Water Sources

Location: Environmental Microbial & Food Safety Laboratory

Title: AI4Water v1.0: an open-source python package for modeling hydrological time series using data-driven methods

Author

	PYO, JONG CHEOL - Pusan National University
	Pachepsky, Yakov
	KIM, SOOBIN - Ulsan National Institute Of Science And Technology (UNIST)
	ABBAS, ATHER - Ulsan National Institute Of Science And Technology (UNIST)
	KIM, MINJEONG - Korea Atomic Energy Research Institute (KAERI)
	KWON, JONGSUN - Korea Atomic Energy Research Institute (KAERI)
	LIGARAY, MAYZONEE - Collaborator
	CHO, KYUNGHWA - Ulsan National Institute Of Science And Technology (UNIST)

Submitted to: Geoscientific Model Development
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/11/2022
Publication Date: 8/8/2022
Citation: Pyo, J., Pachepsky, Y.A., Kim, S., Abbas, A., Kim, M., Kwon, J., Ligaray, M., Cho, K. 2022. AI4Water v1.0: an open-source python package for modeling hydrological time series using data-driven methods. Geoscientific Model Development. 15(7):3021-3039.

Interpretive Summary: Progress in monitoring technologies has led to the accumulation of long-term observations from permanently installed sensors. Using such data for predictions has long been viewed as an essential goal. Predicted values often depend both on recent observations and on observations made in the distant past. Whereas standard statistical regression methods were not helpful for predictions in such situations, one artificial intelligence method, long short-term memory (LSTM) modeling, appeared to be efficient. LSTM has recently become popular in water quality research and applications. Our review showed that LSTM becomes more accurate when combined with other machine learning methods, such as convolutional neural networks and attention networks. LSTM is useful in estimating missing data and may benefit from data preprocessing. Utilizing site-specific static information about the environmental settings holds promise. This work will be helpful to researchers and practitioners creating and using long sequences of ecological measurements, including water quality parameters.

Technical Abstract: Machine learning has shown great promise for simulating hydrological phenomena. However, the development of machine-learning-based hydrological models requires advanced skills from diverse fields, such as programming and hydrological modeling. Additionally, data pre-processing and post-processing when training and testing machine learning models are time-intensive. In this study, we developed a Python-based framework that simplifies the process of building and training machine-learning-based hydrological models and automates the pre-processing of hydrological data and the post-processing of model results. Pre-processing utilities assist in incorporating domain knowledge of hydrology in the machine learning model, such as the distribution of weather data into hydrologic response units (HRUs) based on different HRU discretization definitions. The post-processing utilities help in interpreting the model’s results from a hydrological point of view. This framework will help increase the application of machine-learning-based modeling approaches in hydrological sciences.

Environmental Microbial & Food Safety Laboratory: Beltsville, MD