A Novel Transformer Model With Multiple Instances Learning For Diabetic

25 Jul

Authors: Mrs. Dr. S. Balaji, Mr. Abdulla

Abstract: Diabetic Retinopathy (DR) is a leading cause of vision impairment among diabetic patients, and its timely detection is crucial for preventing irreversible vision loss. This work presents a novel hybrid model that combines Convolutional Neural Networks (CNNs), EfficientNetB0, Transformer architectures, and Multiple Instance Learning (MIL) for accurate DR classification and severity-based recommendation. EfficientNetB0 serves as a lightweight yet powerful backbone for extracting high-quality features from retinal fundus images, while the Transformer module captures global context and spatial relationships across image regions. MIL enhances model robustness by treating each image as a collection of instances, allowing effective learning even with limited or weak labels. Additionally, the system provides severity-based recommendations to assist clinicians in prioritizing patient care. Experimental results across multiple DR datasets demonstrate superior accuracy, generalization, and clinical relevance, highlighting the model’s potential for integration into real-world diabetic screening workflows.
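The MIL idea in the abstract, treating one image as a bag of instances and learning from an image-level label, can be illustrated with a minimal sketch of attention-style pooling. This is not the authors' implementation: the function names, the toy patch features, and the linear scoring vector are assumptions made for illustration, and a real system would take the per-patch features from the backbone and Transformer rather than from hand-written lists.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mil_attention_pool(
    instances: Sequence[Sequence[float]],
    scoring_vector: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Pool per-patch feature vectors into one bag-level vector.

    instances: one feature vector per image patch (assumed to come from
        a feature extractor; here they are toy values).
    scoring_vector: hypothetical learned weights, same length as each
        feature vector, used to score how informative a patch is.
    Returns the bag vector and the per-instance attention weights.
    """
    # Score each instance with a simple dot product (a simplification;
    # published attention-MIL variants typically use a small network here).
    scores = [
        sum(x_d * w_d for x_d, w_d in zip(x, scoring_vector))
        for x in instances
    ]
    weights = softmax(scores)
    dim = len(instances[0])
    # The bag vector is the attention-weighted average of the instances.
    bag = [
        sum(a * x[d] for a, x in zip(weights, instances))
        for d in range(dim)
    ]
    return bag, weights


# Toy bag: three patches with two features each (illustrative values only).
patches = [[0.1, 0.0], [0.9, 0.8], [0.2, 0.1]]
scoring = [2.0, 2.0]
bag_vector, attention = mil_attention_pool(patches, scoring)
```

In this sketch the second patch receives the largest attention weight because its score is highest, so it dominates the bag vector. A classifier placed on top of `bag_vector` would then be trained with only the image-level DR grade, which is how MIL copes with weak labels.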

DOI: https://doi.org/10.5281/zenodo.16535991