Lung Vision: Leveraging CNNs And Vision Transformers For Accurate And Scalable Lung Cancer Diagnosis | International Journal of Science, Engineering and Technology

Authors: Zeeshan Ahmad, Mamta Sharma

Abstract: LungVision is a proposed deep learning framework for automated lung cancer detection from CT and histopathological images. It integrates Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs): CNNs are effective at extracting local features, while ViTs use self-attention mechanisms to capture global context, so the combination extracts both local and global features. The study evaluates both models on balanced and unbalanced datasets under supervised and transfer learning setups. Performance is assessed using standard metrics including accuracy, precision, recall, and F1 score. The standalone CNN achieved a detection accuracy of 94.063%, while the best-performing ViT model achieved 84.211%. With CNN-integrated ViTs, cancerous features are extracted with a precision of 97%. The proposed framework is also deployed as a real-time web application, offering an accessible solution for early lung cancer diagnosis.
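
The evaluation metrics named in the abstract (accuracy, precision, recall, and F1 score) can be illustrated with a minimal sketch for a binary cancerous/non-cancerous classifier. The function name `classification_metrics`, the label encoding (1 = cancerous, 0 = non-cancerous), and the toy labels are assumptions made for illustration; they are not taken from the paper and do not reproduce its results.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    Hypothetical helper for illustration; `positive` marks the
    cancerous class (assumed to be encoded as 1).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Guard each ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (made up): 3 cancerous and 5 non-cancerous samples.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

On an unbalanced dataset, accuracy alone can look high even when few cancerous cases are found, which is presumably why precision, recall, and F1 are reported alongside it.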

DOI: http://doi.org/10.5281/zenodo.20638073