Publication #227445

Title: Prediction of spatial soil property information from ancillary sensor data using ordinary linear regression: Model derivations, residual assumptions and model validation tests

Author
Lesch, Scott - UC Riverside
Corwin, Dennis

Submitted to: Geoderma
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 9/16/2008
Publication Date: 10/25/2008
Citation: Lesch, S.M., Corwin, D.L. 2008. Prediction of spatial soil property information from ancillary sensor data using ordinary linear regression: Model derivations, residual assumptions and model validation tests. Geoderma. 148:130-140.

Interpretive Summary: Geospatial measurements of ancillary sensor data, such as bulk soil electrical conductivity or remotely sensed imagery, are commonly used to characterize spatial variation in soil or crop properties. In this article we review the connection between the ordinary linear regression (LR) model and the more comprehensive geostatistical mixed linear model, and describe the conditions under which ordinary LR models are valid spatial prediction models for calibrating sensor data to soil properties. We derive the simplified formulas for the ordinary LR model parameter estimates and best linear unbiased predictions from the geostatistical mixed linear model under two different residual error assumptions: strictly uncorrelated (SU) residuals and effectively uncorrelated (EU) residuals. The EU formulas are examined in detail, since the EU assumption is much more likely to hold in practice. Statistical tests for detecting spatial correlation in LR model residuals are also reviewed, along with three LR model validation tests derived from classical linear modeling theory. Our primary results show that ordinary LR models can be used as valid and cost-effective spatial prediction models, provided that at least the EU residual assumption is satisfied. These statistical results directly impact USDA scientists who use linear modeling techniques to calibrate airborne or ground-based remotely sensed data. Additionally, the model validation tests can be used to independently test a typical LR calibration model and to assess the suitability of non-random, prediction-based sampling plans (like the sampling plan used in the USDA-ARS ESAP software package).

Technical Abstract: Geospatial measurements of ancillary sensor data, such as bulk soil electrical conductivity or remotely sensed imagery, are commonly used to characterize spatial variation in soil or crop properties. Geostatistical techniques like kriging with external drift or regression kriging are often used to calibrate geospatial sensor data to specific soil or crop properties. More traditional statistical methods such as ordinary linear regression (LR) models are also commonly used. Unfortunately, some soil scientists see these as competing and unrelated modeling approaches and are unaware of their relationship. In this article we review the connection between the ordinary LR model and the more comprehensive geostatistical mixed linear model, and describe the conditions under which ordinary LR models are valid spatial prediction models. The formulas for the ordinary LR model parameter estimates and best linear unbiased predictions are derived from the geostatistical mixed linear model under two different residual error assumptions: strictly uncorrelated (SU) residuals and effectively uncorrelated (EU) residuals. The theoretically optimal (best linear unbiased) and computable (linear unbiased) predictions and variance estimates derived under the EU error assumption are examined in detail. Statistical tests for detecting spatial correlation in LR model residuals are also reviewed, along with three LR model validation tests derived from classical linear modeling theory. Two case studies are presented that demonstrate the parameter estimation, response variable prediction and model validation techniques discussed in this article.
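
As a rough illustration of the workflow described above (calibrate a soil property to sensor data with ordinary LR, then check the residuals for spatial correlation), the following minimal Python sketch may help. It is not taken from the article. The abstract does not name the residual tests it reviews, so Moran's I with inverse-distance weights and a simple permutation test is used here as one common stand-in. The permutation p-value is only approximate for regression residuals. All data are synthetic, and every function and variable name is hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares fit; returns coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta

def inverse_distance_weights(coords):
    """Symmetric spatial weights w_ij = 1 / distance_ij, with a zero diagonal."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    W = np.zeros_like(d)
    mask = d > 0
    W[mask] = 1.0 / d[mask]
    return W

def morans_i(values, W):
    """Moran's I statistic for `values` under spatial weight matrix W."""
    z = values - values.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)

def permutation_p_value(values, W, n_perm=999, seed=0):
    """One-sided permutation p-value for positive spatial autocorrelation.

    Assumption: residuals are treated as exchangeable, which is only
    approximately true for LR residuals.
    """
    rng = np.random.default_rng(seed)
    observed = morans_i(values, W)
    sims = np.array([morans_i(rng.permutation(values), W) for _ in range(n_perm)])
    p = (1 + np.sum(sims >= observed)) / (n_perm + 1)
    return observed, p

# Synthetic example (hypothetical data, not from the case studies).
rng = np.random.default_rng(42)
n = 60
coords = rng.uniform(0, 100, size=(n, 2))        # sample locations
sensor = rng.normal(1.5, 0.5, size=n)            # e.g. a conductivity-like reading
soil = 2.0 + 3.0 * sensor + rng.normal(0, 0.3, size=n)  # spatially uncorrelated errors

X = np.column_stack([np.ones(n), sensor])        # intercept + sensor covariate
beta, resid = fit_ols(X, soil)
W = inverse_distance_weights(coords)
I_obs, p_val = permutation_p_value(resid, W)

print("LR coefficients:", beta)
print("Moran's I of residuals:", I_obs, "approx. p-value:", p_val)
```

A small p-value would suggest spatially correlated residuals, in which case the uncorrelated-residual assumptions behind an ordinary LR model would be in doubt and a geostatistical mixed linear model might be more appropriate.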