TomatoRipen-MMT: transformer-based RGB and NIR spectral fusion for tomato maturity grading



Abstract

Computer vision and multispectral imaging have become essential tools in modern precision agriculture. Accurate ripeness assessment is critical for optimizing yield, reducing post-harvest losses, and enabling automated harvesting. However, traditional RGB-based approaches struggle to distinguish subtle changes in maturity, and existing solutions often fail under varying lighting, occlusion, or cultivar-specific conditions. To address these challenges, this study integrates complementary spectral cues for reliable tomato ripeness evaluation. It uses a curated RGB-NIR tomato dataset of 224 hyperspectral samples, processed into aligned multimodal image pairs with balanced ripeness categories. The proposed TomatoRipen-MMT model is a multimodal Transformer with dual encoders, cross-spectral attention, and a joint decoder that fuses spatial and biochemical cues. Its novelty lies in a dynamic cross-attention mechanism that learns inter-modal dependencies between RGB and NIR signals to improve ripeness interpretation. The system was evaluated using accuracy, precision, recall, F1-score, mIoU, and AUC. Experimentally, TomatoRipen-MMT significantly outperforms all baseline RGB-only, NIR-only, and fusion methods, achieving 94.8% classification accuracy and 82.6% mIoU. These findings demonstrate the effectiveness of multimodal Transformers for robust, high-precision fruit maturity assessment in controlled and greenhouse environments.
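To make the cross-spectral attention idea concrete, the following is a minimal sketch of generic scaled dot-product cross-attention, in which RGB tokens act as queries over NIR tokens. It is not the TomatoRipen-MMT implementation: the abstract gives no layer details, so the token values, the embedding size, the function names, and the omission of learned query/key/value projections are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    Each query token (here: an RGB token) attends over all key/value
    tokens (here: NIR tokens). Learned projection matrices are omitted
    for brevity; a real model would apply them first.
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [sum(wj * v[c] for wj, v in zip(w, values))
               for c in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Hypothetical toy embeddings: 3 RGB tokens and 2 NIR tokens, dimension 2.
rgb_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nir_tokens = [[0.5, 0.5], [2.0, 0.0]]

fused, attn = cross_attention(rgb_tokens, nir_tokens, nir_tokens)
```

Each row of `attn` is a probability distribution over the NIR tokens for one RGB token, and `fused` holds the corresponding NIR-informed features. In a dual-encoder design of the kind the abstract describes, such features would typically be combined with the RGB stream before decoding, although the exact fusion scheme is not specified here.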
