Author
Whittaker, Gerald
CONFESOR, REMEGIO - OSU
DILUZIO, MAURO - Texas A&M University
Arnold, Jeffrey
Submitted to: Transactions of the ASABE
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 11/22/2009
Publication Date: 8/15/2010
Citation: Whittaker, G.W., Confesor, R., DiLuzio, M.D., Arnold, J.G. 2010. Detection of overparameterization and overfitting in an automatic calibration of SWAT. Transactions of the ASABE. 53:1487-1499.

Interpretive Summary: With increasing computational power, automatic calibration of hydrologic models presents an attractive alternative to manual calibration based on expert knowledge. Parallel computing allows a large number of model parameters to be calibrated simultaneously. Calibrating many parameters may produce a good fit, but the resulting parameter set may not be unique. A non-unique set of calibrated parameters could lead to undiagnosed problems in hydrologic simulation. Using simple statistical methods, we determine that for the Blue River Watershed in Oklahoma, a large number (4,198) of parameters could be used effectively to calibrate the SWAT model to simulate river flow.

Technical Abstract: With increasing computational power, automatic calibration of hydrologic models presents an attractive alternative to manual calibration based on expert knowledge. Parallel computing allows a large number of model parameters to be calibrated simultaneously. Calibrating many parameters may produce a good fit, but the resulting parameter set may not be unique. A non-unique set of calibrated parameters could lead to undiagnosed problems in hydrologic simulation. The general term for this issue is identification, or lack thereof, of the calibrated parameters. Lack of identification occurs in the estimation of model parameters for a given model structure when different parameter values result in identical solutions to the model. Where there is lack of identification, consistent estimation of parameters is not possible.
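As a minimal illustration of lack of identification, consider a toy model that is not SWAT and is not taken from the paper. The function `simulate` and the parameters `a` and `b` are hypothetical. Only the product of the two parameters affects the output, so different parameter values give identical simulated flow and calibration cannot distinguish between them.

```python
def simulate(a, b, rainfall):
    """Toy runoff model (hypothetical): only the product a*b affects the output."""
    return [a * b * r for r in rainfall]


rainfall = [0.0, 1.5, 4.0, 2.5]

# Two different parameter sets with the same product a*b.
flow_1 = simulate(2.0, 3.0, rainfall)
flow_2 = simulate(1.0, 6.0, rainfall)

# The simulated flows match exactly, so a and b are not separately identified.
identical = flow_1 == flow_2
print(identical)
```

In this sketch any objective function computed from the simulated flow takes the same value for both parameter sets, which is the situation the abstract describes as a lack of identification.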
Methods for detecting the two aspects of parameter identification (sensitivity and collinearity) in hydrologic model calibration with a genetic algorithm were applied to the Blue River Watershed in Oklahoma, using 4,198 calibration parameters, in the Distributed Model Intercomparison Project of the National Weather Service. Using the information provided by the evolution of parameter distributions, parameter sensitivity was characterized by a statistical comparison of distributions across generations of the genetic algorithm. A screening procedure using linear discriminant analysis did not detect any variables that had not been identified in the calibration. A genetic algorithm search for alternative parameter sets that gave the same simulated flow recovered the known parameter values of a synthetic simulation. It did not find any parameter sets that differed by more than a very small fraction from the known values, indicating that the calibration variables were identified with respect to both sensitivity and collinearity.
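The idea of comparing parameter distributions across generations can be sketched as follows. The abstract does not name the statistical test used, so the two-sample Kolmogorov-Smirnov statistic here is an assumption chosen for illustration, and the populations are synthetic stand-ins rather than output of the study's genetic algorithm. A parameter whose distribution shifts strongly between an early and a late generation is being driven by selection, which suggests the objective is sensitive to it.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    # The largest gap between two step functions occurs at an observed data point.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic stand-in for one parameter's values in an early generation.
early = [rng.uniform(0.0, 1.0) for _ in range(200)]

# Hypothetical sensitive parameter: selection concentrates it near one value.
late_sensitive = [rng.gauss(0.3, 0.02) for _ in range(200)]

# Hypothetical insensitive parameter: its distribution stays spread out.
late_insensitive = [rng.uniform(0.0, 1.0) for _ in range(200)]

d_sensitive = ks_statistic(early, late_sensitive)
d_insensitive = ks_statistic(early, late_insensitive)
print(d_sensitive, d_insensitive)
```

A large statistic for one parameter and a small one for another would flag the second as a candidate for insensitivity. The threshold for deciding what counts as "large" is left open here, since the abstract does not state one.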