Author
CHIANG, KUO-SZU - Chung Hsing University
Bock, Clive
EL JARROUDI, MOUSSA - University Of Liege
DELFOSSE, PHILIPPE - Centre De Recherche Public - Gabriel Lippmann
LEE, I. - Chung Hsing University
LIU, H. - Chung Hsing University
Submitted to: Plant Pathology
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/7/2015
Publication Date: 5/1/2016
Citation: Chiang, K., Bock, C.H., El Jarroudi, M., Delfosse, P., Lee, I.H., Liu, H.I. 2016. The effects of rater bias and assessment method used to estimate disease severity on hypothesis testing. Plant Pathology. 65(4):523-535.
Interpretive Summary: The effects of bias (over- and underestimates) in estimates of disease severity on hypothesis testing were compared across different assessment methods. Nearest percent estimates (NPE), the Horsfall-Barratt (H-B) scale, and two different linear category scales (10% increments, with and without additional grades at low severity) were compared using simulation modeling. The H-B scale and the 10% scale had the least power to correctly test a hypothesis compared with the other methods. The amended 10% category scale was most often superior to other methods at all severities tested, and is thus preferred if raters need to estimate severity on a disease scale. Rater bias and assessment method had little effect on error rates; however, the power of the hypothesis test using unbiased estimates was most often greater than with biased estimates, regardless of assessment method. Knowledge of the effects of rater bias and scale type can be used to improve accuracy and reliability of disease severity estimates and can provide a framework for improving aids to estimate severity visually, including standard area diagrams and rater training software.
Technical Abstract: The effects of bias (over- and underestimates) in estimates of disease severity on hypothesis testing were explored across different assessment methods. Nearest percent estimates (NPE), the Horsfall-Barratt (H-B) scale, and two different linear category scales (10% increments, with and without additional grades at low severity) were compared using simulation modeling.
Type I and type II error rates were used to compare effects of biased and unbiased estimates of treatment effects. The H-B scale and the 10% scale had the least power to correctly test a hypothesis compared with the other methods, and the effects of rater bias on type II errors were greater over specific severity ranges. The amended 10% category scale was most often superior to other methods at all severities tested for reducing the risk of type II errors, and is thus preferred if raters need to estimate severity on a disease scale. Rater bias and assessment method had little effect on type I error rates. The power of the hypothesis test using unbiased estimates was most often greater than with biased estimates, regardless of assessment method. An unanticipated but critical observation was the greater impact of rater bias compared with assessment method on type II errors. We believe that knowledge of the effects of rater bias and scale type on hypothesis testing can be used to improve accuracy and reliability of disease severity estimates and can provide a logical framework for improving aids to estimate severity visually, including standard area diagrams and rater training software.
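The kind of comparison described in the abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation model: the rater model (a constant additive bias plus normal noise), the sample size, the standard deviations, and the use of a normal approximation to a Welch-type two-sample test are all assumptions made for illustration, and the function names are hypothetical. The amended 10% scale is omitted because its extra low-severity grades are not listed in the text. Calling the function with equal group means estimates the type I error rate; calling it with different means estimates power (1 minus the type II error rate).

```python
import bisect
import random
import statistics

# Horsfall-Barratt interval edges in percent (grades for 0% and 100% are handled separately).
HB_EDGES = [0, 3, 6, 12, 25, 50, 75, 88, 94, 97, 100]


def npe(severity):
    """Nearest percent estimate: round to the nearest whole percent."""
    return float(round(severity))


def hb_midpoint(severity):
    """Map a percent severity to the midpoint of its H-B interval."""
    if severity <= 0:
        return 0.0
    if severity >= 100:
        return 100.0
    i = bisect.bisect_left(HB_EDGES, severity)  # index of the interval's upper edge
    return (HB_EDGES[i - 1] + HB_EDGES[i]) / 2


def ten_pct_midpoint(severity):
    """Map a percent severity to the midpoint of its 10%-wide category."""
    severity = min(100.0, max(0.0, severity))
    return min(int(severity // 10), 9) * 10 + 5.0


def rejection_rate(scale, mu_a, mu_b, bias=0.0, n=30, sd=5.0,
                   rater_sd=3.0, reps=2000, seed=1):
    """Fraction of simulated experiments in which H0 (equal means) is rejected.

    Assumed rater model: estimate = true severity + bias + normal noise,
    clamped to [0, 100] and then recorded on the given scale.
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(0.975)  # two-sided test, alpha = 0.05
    rejections = 0
    for _ in range(reps):
        groups = []
        for mu in (mu_a, mu_b):
            estimates = []
            for _ in range(n):
                true_severity = rng.gauss(mu, sd)
                rated = true_severity + bias + rng.gauss(0.0, rater_sd)
                rated = min(100.0, max(0.0, rated))
                estimates.append(scale(rated))
            groups.append(estimates)
        a, b = groups
        diff = abs(statistics.mean(a) - statistics.mean(b))
        se = (statistics.variance(a) / n + statistics.variance(b) / n) ** 0.5
        # Normal approximation to the Welch statistic; zero variance is handled explicitly.
        reject = diff > 0 if se == 0 else diff / se > z_crit
        rejections += reject
    return rejections / reps


if __name__ == "__main__":
    for name, scale in [("NPE", npe), ("H-B", hb_midpoint), ("10%", ten_pct_midpoint)]:
        type_i = rejection_rate(scale, 30, 30)
        power = rejection_rate(scale, 30, 33)
        print(f"{name}: type I rate = {type_i:.3f}, power = {power:.3f}")
```

Passing a nonzero `bias` applies the same constant offset to both treatment groups. A constant offset is a deliberate simplification: it cannot reproduce bias that varies with severity, which is what the abstract's reference to effects over specific severity ranges suggests.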