Publication #105571

Title: ESTIMATING MISSING WEATHER DATA FOR AGRICULTURAL SIMULATIONS USING GROUP METHOD OF DATA HANDLING

Author
Acock, Mary
Pachepsky, Yakov - Duke University

Submitted to: Journal of Applied Meteorology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 2/1/2000
Publication Date: 6/1/2000
Citation: N/A

Interpretive Summary: Whether a harvest is good or bad frequently depends on weather conditions. Some farmers have weather stations in their fields to provide the site-specific information they need to manage their crops. Invariably, the automatic data collection system fails at some point, leaving the data sets incomplete. These gaps complicate the use of the complex programs farmers use to make management decisions, because those programs rely on daily or hourly weather data. This paper examines several methods for filling in the missing data. One day's weather data was removed from each of 1400 sequential seven-day data sets, and the missing data were then estimated using several methods. The climatological method estimated the missing data from data recorded on the same day of previous years at a nearby location. The persistence method estimated the missing data from data taken just before and just after the gap. The group method of data handling examined correlations among numerous weather variables before and after the missing day. When the estimates were compared with the actual data, the group method of data handling was the most accurate, followed by the persistence method and finally the climatological method.
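
As a rough illustration of the two baseline methods described above, the sketch below estimates a single missing daily value. The function names are hypothetical, and both formulas are assumptions: the summary does not say exactly how neighboring days or previous years were combined, so a plain average is used here.

```python
from statistics import mean


def persistence_estimate(day_before, day_after):
    """Estimate a missing value from the values just before and after it.

    Assumption: a plain average of the two neighboring days.
    """
    return (day_before + day_after) / 2.0


def climatological_estimate(same_day_previous_years):
    """Estimate a missing value from the same calendar day in earlier years.

    Assumption: a plain mean over the years that are available.
    """
    return mean(same_day_previous_years)


# Hypothetical maximum temperatures (deg C), made up for illustration.
print(persistence_estimate(24.0, 28.0))             # neighbors of the gap
print(climatological_estimate([25.0, 27.5, 26.0]))  # same date, earlier years
```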

Technical Abstract: Techniques are needed to estimate weather variables for days when the data are absent. We hypothesized that such estimates can be made using data from the days before and after the day with no data. To find and express these dependencies, we used the group method of data handling (GMDH), a tool for modeling complex input-output relationships by building hierarchical polynomial regression networks. We extracted 1400 sequential seven-day data sets from the Stoneville (MS) database. For each data set, we assumed that the weather variables on the fourth day were unknown and had to be found from the weather variables of days 1, 2, 3, 5, 6, and 7. We used 75 percent of these data to find the hierarchical polynomial regression and 25 percent to evaluate it. Values of R² were about 0.88 for minimum temperature, 0.80 for maximum temperature, and 0.80 for wind run. Accuracy of the solar radiation and precipitation estimates was lower, with R² of about 0.2 to 0.3, but improved to 0.5 to 0.6 for the training data set and 0.3 for the validation data set for both variables when an additional indicator variable showing the presence or absence of rain was included. GMDH can be a useful tool for filling gaps in weather data from weather stations installed in fields.
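
The data layout and evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it fits a single two-input quadratic unit of the kind GMDH networks are commonly built from, whereas a full GMDH procedure stacks and selects many such units in layers. All function names, the choice of days 3 and 5 as the two inputs, and the synthetic series are assumptions; the Stoneville data are not used.

```python
import math


def make_windows(series, width=7):
    """Return every run of `width` consecutive daily values."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def split_window(window):
    """Treat the middle day (day 4 of 7) as missing; the rest are inputs."""
    mid = len(window) // 2
    return window[:mid] + window[mid + 1:], window[mid]


def features(x1, x2):
    """Terms of a two-input quadratic polynomial unit."""
    return [1.0, x1, x2, x1 * x2, x1 * x1, x2 * x2]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_unit(pairs, targets):
    """Least-squares coefficients of the unit via the normal equations."""
    rows = [features(x1, x2) for x1, x2 in pairs]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(k)]
    return solve(ata, atb)


def predict(coef, x1, x2):
    """Evaluate the fitted polynomial unit at one input pair."""
    return sum(c * f for c, f in zip(coef, features(x1, x2)))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def neighbors(inputs):
    """Assumption: feed the unit only days 3 and 5, the gap's neighbors."""
    return inputs[2], inputs[3]


# Synthetic daily series, made up for illustration.
series = [8.0 * math.sin(2 * math.pi * i / 30) + 3.0 * math.sin(1.7 * i)
          for i in range(200)]
samples = [split_window(w) for w in make_windows(series)]
cut = int(0.75 * len(samples))  # 75 percent to fit, 25 percent to evaluate
train, valid = samples[:cut], samples[cut:]

coef = fit_unit([neighbors(x) for x, _ in train], [y for _, y in train])
pred = [predict(coef, *neighbors(x)) for x, _ in valid]
print("validation R^2:", round(r_squared([y for _, y in valid], pred), 3))
```

Fitting on one part of the windows and scoring R² on the held-out part mirrors the 75/25 split in the abstract. The chronological split used here is one possible choice; the abstract does not say how the two subsets were drawn.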