DenseSwinV2: Channel Attentive Dual Branch CNN Transformer Learning for Cassava Leaf Disease Classification

Published: arXiv: 2603.25935v1
Authors

Shah Saood Saddam Hussain Khan

Abstract

This work presents Hybrid Dense SwinV2, a two-branch framework that jointly leverages densely connected convolutional features and hierarchical, customized Swin Transformer V2 (SwinV2) representations for cassava disease classification. The DenseNet branch captures high-resolution local features, preserving fine structural cues while allowing effective gradient flow. Concurrently, the customized SwinV2 branch models global contextual dependencies through shifted-window self-attention, capturing the long-range interactions that are critical for distinguishing visually similar lesions. A channel-squeeze attention module is applied independently to the CNN and Transformer streams to emphasize discriminative, disease-related responses and to suppress redundant or background-driven activations. Finally, the attended channels from the dense local branch and the SwinV2 global branch are fused into a refined representation. The proposed Dense SwinV2 uses a public cassava leaf disease dataset of 31,000 images spanning five classes: brown streak, mosaic, green mottle, bacterial blight, and normal (healthy) leaves. It achieves a classification accuracy of 98.02 percent with an F1 score of 97.81 percent, outperforming well-established convolutional and transformer models. These results indicate that Hybrid Dense SwinV2 is robust and practical for field-level diagnosis of cassava disease under real-world challenges such as occlusion, noise, and complex backgrounds.

Paper Summary

Problem
Cassava leaf disease is a major problem for smallholder farmers in sub-Saharan Africa, causing significant losses in crop yields and affecting food security. Traditional diagnosis methods, such as manual inspections, are time-consuming, costly, and often inaccurate. There is a need for an automated and efficient disease diagnosis system to improve crop productivity and livelihoods.
Key Innovation
The proposed Hybrid Dense-SwinV2 model combines the strengths of two architectures: DenseNet and SwinV2. DenseNet captures high-resolution local features, and its dense connections support effective gradient flow and feature reuse. SwinV2 models long-range dependencies across the image through shifted-window self-attention. The model uses a dual-branch structure: each branch passes through its own channel attention module, which emphasizes disease-related channels and suppresses background-driven ones, and the two outputs are then fused into a single refined representation.
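The per-branch channel attention and the fusion step can be illustrated with a small NumPy sketch. This is not the authors' implementation: the summary does not specify the attention design or the fusion operator, so the sketch assumes a squeeze-and-excitation style gate (global average pooling, a two-layer bottleneck, a sigmoid) and fusion by channel-wise concatenation. The function names, the random weights, and the feature-map shapes are all hypothetical, and both branches are assumed to produce maps of the same spatial size.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style gate (an assumed design, not the paper's).

    feat: feature map of shape (C, H, W)
    w1:   bottleneck weights of shape (C // r, C)
    w2:   expansion weights of shape (C, C // r)
    """
    squeezed = feat.mean(axis=(1, 2))         # squeeze: one value per channel
    hidden = np.maximum(w1 @ squeezed, 0.0)   # bottleneck with ReLU
    gate = sigmoid(w2 @ hidden)               # per-channel weight in (0, 1)
    return feat * gate[:, None, None]         # rescale each channel


def fuse(cnn_feat, swin_feat, params):
    """Attend to each branch independently, then concatenate along channels."""
    a = channel_attention(cnn_feat, *params["cnn"])
    b = channel_attention(swin_feat, *params["swin"])
    return np.concatenate([a, b], axis=0)


# Hypothetical feature maps standing in for the two branch outputs.
rng = np.random.default_rng(0)
cnn_feat = rng.standard_normal((8, 7, 7))     # stand-in for DenseNet features
swin_feat = rng.standard_normal((16, 7, 7))   # stand-in for SwinV2 features
params = {
    "cnn": (rng.standard_normal((2, 8)), rng.standard_normal((8, 2))),
    "swin": (rng.standard_normal((4, 16)), rng.standard_normal((16, 4))),
}
fused = fuse(cnn_feat, swin_feat, params)
print(fused.shape)  # (24, 7, 7)
```

In a trained model the gate weights would be learned, so channels that respond to lesions receive weights near 1 and background-driven channels are scaled toward 0. Concatenation keeps both views intact for the classifier; other fusion choices, such as element-wise addition, would require matching channel counts.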
Practical Impact
The Hybrid Dense-SwinV2 model could make cassava leaf disease diagnosis faster, more accurate, and more accessible to farmers in resource-poor areas. With a reported accuracy of 98.02% and an F1-score of 97.81%, the model could support early disease detection, reducing the risk of crop losses and improving food security. Its robustness makes it a promising tool for agricultural pathology, and the approach may transfer to other image classification tasks marked by class imbalance, low contrast, or complex visual patterns.
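Accuracy and F1 can diverge when classes are imbalanced, which is why both numbers are reported. The sketch below computes the two metrics from a confusion matrix in plain Python. It assumes macro-averaged F1, since the summary does not say which averaging the paper uses, and the 3-class confusion matrix is made-up toy data, not results from the paper.

```python
def accuracy_and_macro_f1(confusion):
    """Return (accuracy, macro-F1) for a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Macro averaging is an assumption; the source does not state the variant.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))

    f1_scores = []
    for k in range(n):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, wrong
        fn = sum(confusion[k]) - tp                       # true k, missed
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)

    return correct / total, sum(f1_scores) / n


# Toy example: one large class and two small ones (hypothetical counts).
toy_confusion = [
    [90, 5, 5],
    [2, 8, 0],
    [1, 0, 9],
]
acc, macro_f1 = accuracy_and_macro_f1(toy_confusion)
print(f"accuracy={acc:.4f}, macro-F1={macro_f1:.4f}")
```

In this toy case the macro-F1 is lower than the accuracy because errors on the two small classes weigh as much as errors on the large one. A small gap between the two reported figures (98.02% versus 97.81%) therefore suggests that performance is fairly even across classes, under the macro-averaging assumption.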
Analogy / Intuitive Explanation
Imagine a doctor trying to diagnose a patient with a rare disease. Traditional methods might involve looking at a few symptoms and making an educated guess. However, with the Hybrid Dense-SwinV2 model, it's like having a supercomputer that can examine millions of data points, including images of the patient's symptoms, to make an accurate diagnosis. This model can "see" patterns and connections that a human doctor might miss, making it a powerful tool for disease diagnosis and treatment.
Paper Information
Categories: cs.CV, cs.AI
arXiv ID: 2603.25935v1
