Location: Corn Insects and Crop Genetics Research
Title: Self-supervised learning improves classification of agriculturally important insect pests in plants
Author
KAR, SOUMYASHREE - Iowa State University
NAGASUBRAMANIAN, KOUSHIK - Iowa State University
ELANGO, DINAKARAN - Iowa State University
CARROLL, MATTHEW - Iowa State University
Abel, Craig
NAIR, AJAY - Iowa State University
MUELLER, DAREN - Iowa State University
O'NEAL, MATTHEW - Iowa State University
SINGH, ASHEESH - Iowa State University
SARKAR, SOUMIK - Iowa State University
GANAPATHYSUBRAMANIAN, BASKAR - Iowa State University
SINGH, ARTI - Iowa State University
Submitted to: The Plant Phenome Journal
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 6/26/2023
Publication Date: 7/18/2023
Citation: Kar, S., Nagasubramanian, K., Elango, D., Carroll, M.E., Abel, C.A., Nair, A., Mueller, D.S., O'Neal, M.E., Singh, A.K., Sarkar, S., Ganapathysubramanian, B., Singh, A. 2023. Self-supervised learning improves classification of agriculturally important insect pests in plants. The Plant Phenome Journal. 6(1). https://doi.org/10.1002/ppj2.20079.
DOI: https://doi.org/10.1002/ppj2.20079
Interpretive Summary: Up to 40% of global food production is lost to insect pests each year, with estimated revenue losses of $220 billion. Reliable machine identification of insects in the field would enable accurate and early detection of pests, so that effective control measures can be applied before significant yield losses occur. Methods and models for machine identification of insect pests to species have been developed, but the failure rate remains high. The limiting factor in achieving a higher success rate is the amount of expert human time needed to create the labeled data used to train machine learning models. A self-supervised machine learning approach would address this limiting factor. In this study, we present a self-supervised learning (SSL) approach to classify digital images of 22 types of agriculturally important insect pests, and we assessed the approach using three existing SSL methods. These results demonstrate an efficient insect classification tool that can handle large and imbalanced data sets to accurately identify economically important pests via digital imagery in field and horticultural crop production systems.
Technical Abstract: Background: Insect pests cause significant damage to food production; therefore, researchers and farmers are interested in developing detection and mitigation strategies.
There is a continual shift towards automated detection of insect pests using machine learning approaches because of the sheer number of species and the complexity and overlap of their identifying features. Although supervised learning has achieved remarkable progress in this regard, it is impeded by the need for significant expert human involvement in creating the labeled data used to train the machine learning models. This makes applications involving real-world crop settings tedious and oftentimes infeasible. Self-supervised machine learning approaches provide a viable alternative for training models with minimal expert annotations. Here, we present a self-supervised learning (SSL) approach to classify 22 types of agriculturally important insect pests using minimal labeling. The framework was assessed on both raw and segmented field-captured images of insect pests, using three different SSL methods: NNCLR (Nearest Neighbor Contrastive Learning of Visual Representations), BYOL (Bootstrap Your Own Latent), and Barlow Twins.
Results: SSL pre-training was done on ResNet-18 and ResNet-50 models using all three SSL methods on the original RGB images and on foreground-segmented images. The performance of the SSL pre-training methods was evaluated using linear probing of SSL representations and end-to-end fine-tuning. Our experiments show that SSL pre-trained Convolutional Neural Network models were able to perform annotation-efficient insect classification. NNCLR was the best-performing SSL method for both linear probing and full-model fine-tuning. Using annotations on just 5% of the collected image data, transfer learning with ImageNet initialization obtained 74% accuracy, whereas NNCLR obtained an improved classification accuracy of 79% for end-to-end fine-tuning.
Conclusion: Models created using SSL pre-training consistently performed better at insect classification, especially under very low annotation availability.
Furthermore, we observed that SSL pre-training produces models that are more robust to object class imbalances. These approaches overcome annotation and labeling bottlenecks, providing significant resource savings to practitioners working on deploying ML-based automated identification and classification tasks.
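
As an illustration of how one of the SSL methods named above learns without labels, the following is a minimal sketch of the Barlow Twins objective in plain Python. It is not the authors' training code. The function names, the toy embeddings, and the off-diagonal weight are assumptions made for this example; in practice the two embedding batches would come from a ResNet encoder applied to two augmented views of the same insect images.

```python
import math


def standardize(batch, eps=1e-12):
    """Scale each embedding dimension to zero mean and unit variance over the batch."""
    n, d = len(batch), len(batch[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in batch]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        for i in range(n):
            out[i][j] = (batch[i][j] - mean) / (std + eps)
    return out


def barlow_twins_loss(z_a, z_b, offdiag_weight=5e-3):
    """Barlow Twins loss for two batches of embeddings of the same images.

    z_a, z_b: lists of equal-length embedding vectors, one per image,
    produced from two different augmentations (hypothetical inputs here).
    offdiag_weight: illustrative trade-off weight, an assumption of this sketch.
    """
    n, d = len(z_a), len(z_a[0])
    a, b = standardize(z_a), standardize(z_b)
    on_diag, off_diag = 0.0, 0.0
    for i in range(d):
        for j in range(d):
            # Cross-correlation between dimension i of view A and dimension j of view B.
            c = sum(a[k][i] * b[k][j] for k in range(n)) / n
            if i == j:
                on_diag += (1.0 - c) ** 2   # pull matching dimensions toward correlation 1
            else:
                off_diag += c ** 2          # push different dimensions toward correlation 0
    return on_diag + offdiag_weight * off_diag


# Toy embeddings: two uncorrelated dimensions, four "images".
view_a = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
# Same embeddings with the two dimensions swapped, i.e. a badly aligned second view.
view_swapped = [[row[1], row[0]] for row in view_a]

aligned_loss = barlow_twins_loss(view_a, view_a)
misaligned_loss = barlow_twins_loss(view_a, view_swapped)
```

In this toy case the loss is essentially zero when both views yield identical, decorrelated embeddings, and it grows when the dimensions of the second view no longer line up with the first. No class labels enter the computation, which is what allows pre-training on unannotated images before the small labeled subset is used for linear probing or fine-tuning.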