Research Project: Integrated Research to Enhance Forage and Food Production from Southern Great Plains Agroecosystems

Location: Livestock, Forage and Pasture Management Research Unit

Title: Evaluating machine learning algorithms to model time series of vegetation indices in tallgrass prairie

Author
Wagle, Pradeep
DANALA, GOPICHAND - University of Oklahoma
DONNER, CATHERINE - University of Oklahoma
XIAO, XIANGMING - University of Oklahoma
Moffet, Corey
Gunter, Stacey
WOFGANG, JENTNER - University of Oklahoma
EBERT, DAVIS - University of Oklahoma

Submitted to: Global Journal of Agricultural and Allied Sciences
Publication Type: Abstract Only
Publication Acceptance Date: 12/15/2024
Publication Date: 12/30/2024
Citation: Wagle, P., Danala, G., Donner, C., Xiao, X., Moffet, C., Gunter, S.A., Wofgang, J., Ebert, D. 2024. Evaluating machine learning algorithms to model time series of vegetation indices in tallgrass prairie. Global Journal of Agricultural and Allied Sciences. 5(1):1-2. https://doi.org/10.35251/gjaas.2024.004.
DOI: https://doi.org/10.35251/gjaas.2024.004

Interpretive Summary:

Technical Abstract: Tallgrass prairie is one of the ecologically and economically important grassland ecosystems in the Great Plains of the United States of America (USA). A complex interplay of annual climatic conditions (e.g., temperature, solar radiation, and rainfall), plant species composition, geographic factors, disturbances, and management practices causes yearly variations in the timing of phenological events in tallgrass prairie. Unraveling the connection between climate and satellite-based vegetation indices (VIs) is key for predicting phenological events and productivity of tallgrass prairies under a changing climate. We hypothesize that the complex and non-linear response of prairie vegetation to climate demands advanced learning algorithms to capture these intricacies accurately. Machine learning algorithms have become powerful tools in phenology research for finding patterns and relationships between climatic factors and VIs using historical data. The main objective of this study was to develop robust machine learning model(s) that can predict climate-induced phenological variability in tallgrass prairie by analyzing patterns of VIs and their climatic controls. Six machine learning algorithms [linear regression, eXtreme Gradient Boosting (XGBoost), random forest, decision tree, support vector regression, and K-nearest neighbors (KNN)] were compared for their performance in modeling patterns of the Moderate Resolution Imaging Spectroradiometer-derived enhanced vegetation index (EVI, a greenness index that can be used as a proxy for productivity/biomass) and land surface water index (LSWI, which can be used to track drought conditions and ecosystem health) of native tallgrass prairie in Central Oklahoma, USA. We divided the dataset into three parts: training, testing, and validation. We randomly divided the 2000-2021 data into an 80% training set and a 20% testing set using a time series split.
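The three-part data layout described above, and the R2, RMSE, and MAE scores used to compare models, can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that "time series split" means an ordered split in which the earliest 80% of the 2000-2021 records train the model and the latest 20% test it, and that R2 is the coefficient of determination; the abstract does not confirm either detail. The function names, the record fields (year, doy, evi), and the EVI values are hypothetical placeholders.

```python
import math


def chronological_split(records, train_fraction=0.8):
    """Split records into (train, test) by time order, without shuffling.

    Assumption: each record is a dict with "year" and "doy" (day of year).
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    ordered = sorted(records, key=lambda rec: (rec["year"], rec["doy"]))
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def regression_scores(observed, predicted):
    """Return R2 (coefficient of determination), RMSE, and MAE."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [obs - pred for obs, pred in zip(observed, predicted)]
    ss_res = sum(err * err for err in errors)
    mean_obs = sum(observed) / n
    ss_tot = sum((obs - mean_obs) ** 2 for obs in observed)
    # R2 is undefined when the observations have no variance.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {
        "r2": r2,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(err) for err in errors) / n,
    }


# Synthetic placeholder records (NOT data from the study): two per year.
records = [
    {"year": year, "doy": doy, "evi": 0.2 if doy == 1 else 0.5}
    for year in range(2000, 2024)
    for doy in (1, 177)
]

# 2022-2023 is held out entirely as the unseen validation set.
modeling = [rec for rec in records if rec["year"] <= 2021]
validation = [rec for rec in records if rec["year"] >= 2022]

# The remaining 2000-2021 records are split 80/20 in time order.
train, test = chronological_split(modeling, train_fraction=0.8)
```

Keeping the test block later in time than the training block prevents a model from being scored on dates that precede data it has already seen, which is the usual reason for preferring an ordered split over a shuffled one for time series.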
To test the temporal transferability of the models, we further evaluated their performance on a completely unseen validation dataset (2022-2023) from the same native prairie pasture. The results showed that climate was a major driver of vegetation phenology in tallgrass prairie. Temperature is particularly important, as it influences the rate of plant development. Consequently, air and soil temperatures showed the highest correlations with EVI (r = 0.77) and LSWI (r = 0.56). Solar radiation also influences tallgrass prairie phenology. We observed low correlations (r = 0.23) of EVI and LSWI with contemporaneous rainfall or soil moisture, suggesting a delayed response of vegetation to these factors (i.e., vegetation responds to changes in rainfall or soil moisture with a time lag). The effects of other climatic factors such as relative humidity and wind speed were less pronounced. The study suggests climate change will likely have a significant impact on the vegetation phenology of tallgrass prairie. Decision tree, KNN, XGBoost, and random forest showed better performance (R2 = 0.94-1.0, RMSE = 0-0.032, and MAE = 0-0.024) in modeling EVI on the training dataset (Fig. 1). Linear regression and support vector regression (SVR) models showed relatively weaker performance (R2 = 0.76-0.77). On the testing dataset, XGBoost, random forest, and KNN showed better performance (R2 = 0.80-0.83, RMSE = 0.055-0.06, and MAE = 0.042-0.046). Linear regression and SVR showed slightly weaker performance (R2 = 0.77-0.79), and the decision tree performed the worst (R2 = 0.65). On the validation dataset, XGBoost and random forest showed the best performance (R2 = 0.85, RMSE = 0.052-0.053, and MAE = 0.041), while linear regression, SVR, and KNN showed slightly weaker performance (R2 = 0.71-0.74, RMSE = 0.07-0.072, and MAE = 0.054-0.061). The decision tree showed the weakest performance (R2 = 0.65). On the training dataset, XGBoost, decision tree, and random forest showed better performance (R2 = 0.88