Location: Quality and Safety Assessment Research Unit
Title: Development of Multimodal Fusion Technology for Tomato Maturity Assessment
Authors:
LIU, YANG - China Agricultural University
WEI, CHAOJIE - China Agricultural University
Yoon, Seung-Chul
Ni, Xinzhi
WANG, WEI - China Agricultural University
LIU, YIZHE - China Agricultural University
WANG, DAREN - China Agricultural University
WANG, XIAORONG - China Agricultural University
Submitted to: Sensors
Publication Type: Peer Reviewed Journal
Publication Acceptance Date: 4/10/2024
Publication Date: 4/11/2024
Citation: Liu, Y., Wei, C., Yoon, S.C., Ni, X., Wang, W., Liu, Y., Wang, D., Wang, X. 2024. Development of Multimodal Fusion Technology for Tomato Maturity Assessment. Sensors. 24(8):2467.
DOI: https://doi.org/10.3390/s24082467
Interpretive Summary: Rapid, non-destructive assessment of fruit and vegetable maturity plays a vital role in agricultural and food production systems. Tomatoes mature from the inside out, and their ripening stages are not uniform in color or in physicochemical and mechanical properties, so single-modal sensing techniques, each limited to certain properties, cannot assess maturity accurately and comprehensively. This study proposes a deep learning-powered sensor and data fusion technique that uses multiple rapid, non-destructive sensing modalities (color imaging, near-infrared spectroscopy, and haptic sensing) to assess tomato maturity. A deep learning model processed and combined data from the three sensing modalities: color images, spectral data, and haptic pressure data. This process involved feature extraction, feature fusion, and the prediction of ripening stages. The classification accuracy of the proposed deep learning model in predicting a maturity stage (immature, semi-mature, or mature) was 99.5%, exceeding the performance of the single-modal techniques: color imaging (94.2%), spectroscopy (87.8%), and haptics (87.2%). The findings suggest that deep learning-based multimodal sensor fusion offers advantages over single-sensor approaches for rapid, non-destructive assessment of maturity and of other quality and safety attributes, such as pest and viral infection, in tomatoes and other agricultural and food products.
Technical Abstract: The maturity of fruits and vegetables such as tomatoes significantly affects quality attributes such as taste, nutritional value, and shelf life, making maturity determination vital in agricultural production and the food processing industry. Tomatoes mature from the inside out, so ripening is uneven between the interior and the exterior, which makes it very challenging to judge maturity with a single modality. In this paper, we propose a deep learning-assisted multimodal data fusion technique combining color imaging, spectroscopy, and haptic sensing for the maturity assessment of tomatoes. The method uses feature fusion to integrate feature information from the image, near-infrared spectral, and haptic modalities into a unified feature set and then classifies the maturity of tomatoes through deep learning. Features were extracted independently for each modality, capturing the tomatoes' exterior color from color images, internal and surface spectral features linked to chemical composition in the visible and near-infrared spectra (350 nm to 1100 nm), and physical firmness from haptic sensing. By combining the preprocessed and extracted features from multiple modalities, data fusion created a comprehensive representation of information from all three modalities using the eigenvector in an eigenspace suitable for tomato maturity assessment. A fully connected neural network was then constructed to process the fused data. This neural network model achieved 99.4% accuracy in tomato maturity classification, surpassing the single-modal methods (color imaging: 94.2%, spectroscopy: 87.8%, haptics: 87.2%). For tomatoes with uneven internal and external maturity, the classification accuracy reached 94.4%. The comparative analysis of multimodal fusion against the single-modal methods validates the stability and applicability of the multimodal fusion technique.
The findings demonstrated the key benefit of multimodal fusion, improved accuracy of tomato ripening classification, and provided a strong theoretical and practical basis for applying multimodal fusion technology to classify the quality and maturity of other fruits and vegetables. Using deep learning (a fully connected neural network) to process multimodal data provides a new and efficient non-destructive approach for the large-scale classification of agricultural and food products.
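
As a rough illustration of the pipeline summarized in the abstract (per-modality features, feature-level fusion, then a fully connected classifier), the following minimal Python sketch may help. It is not the authors' implementation. The fusion operator (plain concatenation), the feature sizes (8, 16, 4), the hidden layer width, the random untrained weights, and all function names are assumptions made only for illustration; the abstract does not specify them.

```python
import math
import random

STAGES = ["immature", "semi-mature", "mature"]  # classes named in the summary


def fuse_features(color_feat, spectral_feat, haptic_feat):
    """Feature-level fusion by concatenation (assumed; the abstract does not
    state the exact fusion operator)."""
    return list(color_feat) + list(spectral_feat) + list(haptic_feat)


def dense(x, weights, biases):
    """One fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (untrained, for illustration)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict_stage(fused, layers):
    """Forward pass: ReLU hidden layers, softmax output over maturity stages."""
    h = fused
    for weights, biases in layers[:-1]:
        h = relu(dense(h, weights, biases))
    weights, biases = layers[-1]
    probs = softmax(dense(h, weights, biases))
    return STAGES[probs.index(max(probs))], probs


rng = random.Random(0)
# Hypothetical per-modality feature vectors; sizes are arbitrary placeholders.
color_feat = [rng.random() for _ in range(8)]      # from color images
spectral_feat = [rng.random() for _ in range(16)]  # from 350-1100 nm spectra
haptic_feat = [rng.random() for _ in range(4)]     # from firmness sensing

fused = fuse_features(color_feat, spectral_feat, haptic_feat)
layers = [init_layer(len(fused), 12, rng), init_layer(12, len(STAGES), rng)]
stage, probs = predict_stage(fused, layers)
print(len(fused), stage, [round(p, 3) for p in probs])
```

Because the weights here are random, the predicted stage carries no meaning; the sketch only shows how three feature vectors could be joined into one input and mapped to three class probabilities. A real system would train the network on labeled tomato samples.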