Abstract
INTRODUCTION: Agriculture is crucial to human survival. Rice cultivation feeds a large share of the world's population, especially in regions where rice is a staple food. Detecting rice leaf diseases is critical to increasing crop productivity. METHODS: To improve the accuracy of rice leaf disease prediction, this paper proposes a hybrid model (ViT-ResNet18) that combines a pre-trained Vision Transformer (ViT) with a pre-trained ResNet18. The input images are fed to the pre-trained ViT and ResNet18 models independently. The output features of the two models are combined and fed into a final Fully Connected (FC) layer, followed by a Softmax layer for classification. RESULTS: For rice leaf disease classification, the proposed hybrid ViT-ResNet18 model achieved an accuracy of 94.4%, a precision of 0.948, a recall of 0.944, an F1-score of 0.942, and an Area Under the Curve (AUC) of 0.985. DISCUSSION: The proposed ViT-ResNet18 model improves accuracy by 5%, 1%, and 1% over VGG16, Inception V3, and SqueezeNet, each paired with a Neural Network classifier, respectively.
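
The fusion step described in METHODS can be illustrated with a minimal sketch in plain Python. It assumes that the two feature vectors are combined by concatenation, which is one common choice and is not confirmed by the text above. The feature sizes, the number of classes, the random weights, and the function names (`fuse_and_classify`, `linear`, `softmax`) are hypothetical placeholders, not the trained model.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """Fully connected layer: one weighted sum plus bias per output class."""
    return [
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]


def fuse_and_classify(vit_features, resnet_features, weights, bias):
    """Combine both feature vectors, then apply the FC layer and Softmax.

    Assumption: "combined" is implemented here as concatenation.
    """
    fused = list(vit_features) + list(resnet_features)
    return softmax(linear(fused, weights, bias))


# Hypothetical sizes, kept small for readability. Real backbones produce
# much longer feature vectors.
VIT_DIM, RESNET_DIM, NUM_CLASSES = 6, 4, 3

rng = random.Random(0)
# Stand-ins for the features each pre-trained backbone would extract
# from the same input image.
vit_features = [rng.uniform(-1, 1) for _ in range(VIT_DIM)]
resnet_features = [rng.uniform(-1, 1) for _ in range(RESNET_DIM)]

# Untrained placeholder parameters for the final FC layer.
weights = [
    [rng.uniform(-0.5, 0.5) for _ in range(VIT_DIM + RESNET_DIM)]
    for _ in range(NUM_CLASSES)
]
bias = [0.0] * NUM_CLASSES

probs = fuse_and_classify(vit_features, resnet_features, weights, bias)
predicted_class = max(range(NUM_CLASSES), key=probs.__getitem__)
```

In this sketch the FC layer sees both feature sets at once, so its weights can learn how much each backbone should contribute to each disease class. In practice the two backbones and the FC layer would be built in a deep learning framework and the FC weights would be learned from labelled leaf images.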