Visual-language transformer-based tomato leaf disease detection for portable greenhouse monitoring device



Abstract

Tomato leaf diseases pose a significant threat to global food security, so accurate and efficient detection methods are needed. This paper introduces the Tomato Leaf Disease Visual Language Model (TLDVLM), an approach based on the BLIP-2 architecture and fine-tuned with Low-Rank Adaptation (LoRA) to classify 10 distinct tomato leaf diseases. The methodology includes an image preprocessing pipeline that uses GroundingDINO for leaf detection and SAM-2 for pixel-level segmentation, so that the model focuses solely on relevant plant tissue. The TLDVLM builds on the multimodal understanding of BLIP-2, with LoRA applied to its Q-Former module, which enables parameter-efficient fine-tuning without compromising performance. Comparative experiments show that the TLDVLM outperforms the baseline models, including CLIP-LoRA and ConvNeXT-tiny, achieving an accuracy of 97.27%, a precision of 0.9587, a recall of 0.9789, and an F1-score of 0.9681. Beyond classification, the fine-tuned TLDVLM checkpoints are integrated into a practical application for inference on new images. The application displays the raw and segmented images and the predicted disease, can fetch detailed information on disease causes and remedies through external APIs (e.g., OpenAI), and offers a downloadable PDF summary for offline access on a portable device. This research highlights the potential of LoRA-adapted vision-language models for building accurate, efficient, and user-friendly agricultural diagnostic tools.
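To make the parameter-efficiency claim concrete, the following is a minimal sketch of the general LoRA idea, not the authors' implementation. It assumes the standard formulation in which a frozen weight matrix W is augmented by a scaled low-rank product, W + (alpha / r) * B A, and only A and B are trained. The function names, the rank r = 8, and the 768-wide layer in the usage note are illustrative assumptions; the abstract does not state these values.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank.

    W: frozen weight, shape (d_out, d_in)
    A: trainable factor, shape (r, d_in)
    B: trainable factor, shape (d_out, r)
    """
    r = len(A)
    scale = alpha / r
    delta = matmul(B, A)  # low-rank update, shape (d_out, d_in)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(W, delta)
    ]


def trainable_params(d_in, d_out, r):
    """Number of trainable values in the two LoRA factors."""
    return r * (d_in + d_out)


# Toy dimensions (hypothetical, chosen only to keep the example small).
d_in, d_out, r, alpha = 6, 4, 2, 4
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]   # small random init
B = [[0.0] * r for _ in range(d_out)]                               # zero init

# With B initialised to zero, the adapted layer starts identical to the frozen one.
W_eff = lora_effective_weight(W, A, B, alpha)
```

Under these assumptions, a single 768 x 768 projection has 589,824 weights, while `trainable_params(768, 768, 8)` gives 12,288 trainable values for a rank-8 adapter, roughly 2% of the full matrix.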
