Location: Hydrology and Remote Sensing Laboratory
Title: Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest
Author:
KANG, Y. - University of Wisconsin
OZDOGAN, M. - University of Wisconsin
ZHU, X. - University of Wisconsin
YE, Z. - University of Wisconsin
HAIN, C. - NASA Marshall Space Flight Center
Anderson, Martha
Submitted to: Environmental Research Letters
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/28/2020
Publication Date: 5/19/2020
Citation: Kang, Y., Ozdogan, M., Zhu, X., Ye, Z., Hain, C., Anderson, M.C. 2020. Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest. Environmental Research Letters. 15:064005. https://doi.org/10.1088/1748-9326/ab7df9.
DOI: https://doi.org/10.1088/1748-9326/ab7df9
Interpretive Summary: Foreign and domestic yield estimation is a critical function of the USDA. Yield estimates have traditionally been based on weather observations, but in the past few decades satellite imagery and other types of geospatial data have played an increasingly important role in monitoring and forecasting yield. Given the operational cost of ingesting new datasets into existing monitoring systems, it is important to be able to identify which data are of most value for a given crop and region. This paper describes a machine learning approach for testing the relative value of an extensive set of weather data, satellite observations, land-surface models, soil maps, and crop progress reports in predicting corn yields over the U.S. Midwest. Of highest value were multi-wavelength satellite indices describing vegetation green chlorophyll content and biomass. Other satellite indices describing crop water stress and moisture availability also ranked highly among the variables tested. Studies like this help to inform improvements to current yield estimation strategies, focusing on new datasets that are demonstrated to add the most value.
Technical Abstract: Crop yield estimates over large areas are conventionally made using weather observations, but a comprehensive understanding of the effects of various environmental indicators and the choice of prediction algorithm remains elusive. Here we present a thorough assessment of county-level maize yield prediction in the U.S. Midwest using six machine learning algorithms and an extensive set of environmental variables derived from satellite observations, weather data, land surface model results, soil maps, and crop progress reports. Results show that seasonal crop yield forecasting benefits from both more advanced algorithms and a large composite of information associated with crop canopy, weather, and soil (i.e., hundreds of features). Combining the best algorithm, inputs, and observation frequency improves the prediction accuracy by up to 7.9% compared to a baseline statistical model using only climatic and satellite observations. This study provides insights into practical crop yield forecasting and the understanding of yield response to climatic and environmental conditions.