Hybrid Optimization-Based TabTransformer For Type 2 Diabetes Risk Prediction

29 Apr

Authors: Md. Shorifuzzaman, Annesha Hossain Noushin

Abstract: Type 2 Diabetes Mellitus (T2DM) is a rapidly growing global health problem in which delayed diagnosis can lead to severe complications such as cardiovascular disease, kidney failure, neuropathy, and vision impairment. Existing machine learning approaches often suffer from limited interpretability, class imbalance, and inadequate optimization, which reduces their clinical reliability. This study proposes an explainable and optimized deep-learning framework for T2DM risk prediction that combines a TabTransformer architecture with hybrid hyperparameter optimization and explainable artificial intelligence (XAI). A publicly available dataset of 100,000 patient records was preprocessed using encoding, standardization, and the Synthetic Minority Oversampling Technique (SMOTE). The model was optimized using Bayesian optimization (Optuna) followed by Particle Swarm Optimization (PSO) and evaluated using standard classification metrics. The optimized model achieved an AUC and accuracy of approximately 93%, with improved recall for diabetic cases. SHAP analysis identified key risk factors, including glucose level, HbA1c, BMI, age, and hypertension. A web-based interface enabled instant prediction, demonstrating real-time feasibility. The proposed system can serve as a clinical decision-support tool for early diabetes screening.
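
The abstract describes a two-stage search in which a Bayesian stage (Optuna) is followed by PSO. As a rough illustration of the second stage only, the sketch below runs a minimal particle swarm that starts from one given point and refines it. It is not the authors' implementation: the objective `toy_validation_loss`, the two hyperparameters (log10 learning rate and dropout), the bounds, and the starting point are all hypothetical stand-ins, and in a real pipeline the starting point would be the best trial from the first stage and the objective would train and validate the model.

```python
import random


def toy_validation_loss(params):
    """Hypothetical stand-in for a validation loss (NOT from the paper).

    Its minimum is at log10(lr) = -2.0 and dropout = 0.2.
    """
    log_lr, dropout = params
    return (log_lr + 2.0) ** 2 + (dropout - 0.2) ** 2


def pso_refine(loss, start, bounds, n_particles=12, n_iters=40, seed=0,
               inertia=0.7, c_cognitive=1.5, c_social=1.5, spread=0.25):
    """Refine `start` with a basic global-best particle swarm inside `bounds`."""
    rng = random.Random(seed)
    dim = len(start)

    def clip(value, lo, hi):
        return max(lo, min(hi, value))

    # Particle 0 sits exactly on the stage-one result; the others are
    # scattered around it, so the swarm can never end up worse than `start`.
    positions = [list(start)]
    for _ in range(n_particles - 1):
        positions.append([
            clip(start[d] + rng.uniform(-spread, spread)
                 * (bounds[d][1] - bounds[d][0]), *bounds[d])
            for d in range(dim)
        ])

    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [list(p) for p in positions]
    personal_loss = [loss(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_loss.__getitem__)
    global_best = list(personal_best[best_idx])
    global_loss = personal_loss[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Standard update: inertia + pull toward the particle's own
                # best + pull toward the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + c_social * r2 * (global_best[d] - positions[i][d])
                )
                positions[i][d] = clip(positions[i][d] + velocities[i][d],
                                       *bounds[d])
            current = loss(positions[i])
            if current < personal_loss[i]:
                personal_loss[i] = current
                personal_best[i] = list(positions[i])
                if current < global_loss:
                    global_loss = current
                    global_best = list(positions[i])

    return global_best, global_loss


# Hypothetical stage-one result and search bounds (assumptions for the demo).
start = [-3.0, 0.4]                   # [log10(learning rate), dropout]
bounds = [(-5.0, -1.0), (0.0, 0.5)]
best, best_loss = pso_refine(toy_validation_loss, start, bounds)
print("refined params:", best, "loss:", best_loss)
```

Seeding one particle on the first-stage result is a design choice made here for the sketch: it guarantees the refined loss is never higher than the starting loss, which is the property one would want from a refinement stage. Whether the paper seeds PSO this way is not stated in the abstract.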

DOI: https://doi.org/10.5281/zenodo.19883294