Publication #375733

Research Project: Contributions of Climate, Soils, Species Diversity, and Management to Sustainable Crop, Grassland, and Livestock Production Systems

Location: Grassland Soil and Water Research Laboratory

Title: Predicting carbon and water vapor fluxes using machine learning and novel feature ranking algorithms

Author

	CUI, XIA - Lanzhou University
	GOFF, THOMAS - Middle Tennessee State University
	CUI, SONG - Middle Tennessee State University
	Menefee, Dorothy
	QIANG, WU - Middle Tennessee State University
	RAJAN, NITHYA - Texas A&M University
	NAIR, SHYAM - Sam Houston State University
	PHILLIPS, NATE - Middle Tennessee State University
	WALKER, FORBES - University Of Tennessee

Submitted to: Science of the Total Environment
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 1/8/2021
Publication Date: 2/9/2021
Citation: Cui, X., Goff, T., Cui, S., Menefee, D.S., Qiang, W., Rajan, N., Nair, S., Phillips, N., Walker, F. 2021. Predicting carbon and water vapor fluxes using machine learning and novel feature ranking algorithms. Science of the Total Environment. 775. Article 145130. https://doi.org/10.1016/j.scitotenv.2021.145130.
DOI: https://doi.org/10.1016/j.scitotenv.2021.145130

Interpretive Summary: Machine learning techniques were used with eddy covariance data to estimate carbon fluxes and evapotranspiration. The type of training data used had a significant impact on model performance. Of the methods studied, models refined with the Sliced Inverse Regression-based Recursive Feature Elimination (SIRRFE) algorithm performed best.

Technical Abstract: The use of quantitative approaches to gap-fill eddy covariance flux data has increased over the past decade. Numerous methods have been proposed, including look-up table approaches, parametric methods, process-based models, and machine learning. In particular, the REddyProc package from the Max Planck Institute for Biogeochemistry and the ONEFlux package from AmeriFlux have been widely used. However, there is no consensus on the optimal model and feature selection method for predicting different flux targets (net ecosystem exchange, NEE, or evapotranspiration, ET), because systematic comparisons based on identical site data are limited. Here, we compared the NEE and ET gap-filling/prediction performance of a least-squares-based linear model, an artificial neural network, and a support vector machine (SVM) using data obtained from four major row-crop and forage agroecosystems located in the subtropical or climate-transition zones of the US. Additionally, we tested the impacts of different training-testing data partitioning settings, including a 10-fold time-series sequential routine (10FTS); a 10-fold cross-validation (CV) routine with single data point (10FCV), daily (10FCVD), weekly (10FCVW), and monthly (10FCVM) gap lengths; and a 7/14-day flanking window (FW) approach. We also implemented a novel Sliced Inverse Regression-based Recursive Feature Elimination algorithm (SIRRFE). We benchmarked model performance against results produced by REddyProc and ONEFlux. Our results indicated that accurate NEE and ET prediction models could be systematically constructed using SVM and only a few top informative features. The gap-filling performance of ONEFlux was generally satisfactory (R² = 0.39-0.71), but results from REddyProc could be very limited or even unreliable in many cases (R² = 0.01-0.67). Overall, SIRRFE-refined SVM models yielded excellent results for predicting NEE (R² = 0.46-0.92) and ET (R² = 0.74-0.91). Finally, the performance of the various models was greatly affected by ecosystem type, prediction target, and training algorithm, but was insensitive to training-testing partitioning. Our research provides further insight into constructing novel gap-filling models and understanding the underlying drivers of boundary-layer carbon and water fluxes at the ecosystem level.
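
The partitioning settings named in the abstract differ mainly in how records are grouped before being assigned to folds. The Python sketch below is illustrative only and is not the authors' implementation. It assumes half-hourly flux records (48 per day), a common eddy covariance convention that this record does not state, and the function names `sequential_folds` and `block_cv_folds` are hypothetical. A block length of 1 mimics the single-point setting, while 48, 336, or 1440 records mimic daily, weekly, or 30-day gap lengths under that assumption.

```python
import random


def sequential_folds(n_records, n_folds=10):
    """Split a time-ordered series into n_folds contiguous segments.

    Returns one fold label per record. This is a rough analogue of a
    time-series sequential scheme; the exact routine used in the paper
    is not described in the abstract.
    """
    return [i * n_folds // n_records for i in range(n_records)]


def block_cv_folds(n_records, block_len, n_folds=10, seed=0):
    """Assign contiguous blocks of block_len records to folds at random.

    Every record in a block shares one fold, so holding out a fold
    creates artificial gaps of block_len consecutive records.
    """
    n_blocks = -(-n_records // block_len)  # ceiling division
    # Spread blocks evenly over the folds, then shuffle which block gets which.
    block_fold = [b % n_folds for b in range(n_blocks)]
    random.Random(seed).shuffle(block_fold)
    return [block_fold[i // block_len] for i in range(n_records)]


# Example: 30 days of (assumed) half-hourly data with day-long artificial gaps.
RECORDS_PER_DAY = 48  # assumption, not stated in the source
folds = block_cv_folds(n_records=30 * RECORDS_PER_DAY, block_len=RECORDS_PER_DAY)
test_idx = [i for i, f in enumerate(folds) if f == 0]   # held-out records
train_idx = [i for i, f in enumerate(folds) if f != 0]  # records used for fitting
```

Holding out whole blocks rather than single points makes the test resemble a real multi-day sensor outage, where neighboring observations are not available to the model.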
