Authors: Sneha, Mr. Yug Lohchab, Dr. Akhilesh Das Gupta, Guru Gobind Singh
Abstract: Breast cancer remains the most prevalent malignancy among women globally, with approximately 2.3 million new diagnoses annually. Early and accurate automated detection is clinically critical. This paper proposes Breast Scan AI, a novel Weighted Soft Voting Ensemble (WSVE) integrating five heterogeneous base classifiers: Random Forest (RF), Extra Trees (ET), Support Vector Machine with RBF kernel (SVM-RBF), Logistic Regression (LR), and Gradient Boosting (GB). The proposed model is evaluated on the Wisconsin Breast Cancer Dataset (WBCD, UCI), comprising 569 instances and 30 cytological features. The ensemble achieves 97.37% accuracy, 97.26% precision, 98.61% recall, 97.93% F1-score, and 99.60% AUC-ROC, outperforming all individual base classifiers and prior ensemble work on this benchmark. Ten-fold stratified cross-validation confirms stability, with a mean accuracy of 97.37% ± 2.39%. Robust Scaler preprocessing is introduced as a key novelty for handling clinical outliers. The system is deployed as a zero-dependency, real-time Clinical Decision Support System (CDSS).
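The weighted soft voting rule at the core of the WSVE can be illustrated with a minimal sketch. It averages the class-probability vectors of the five base classifiers using per-model weights and then picks the most probable class. The function names, the weights, and the per-model probabilities below are hypothetical values chosen for illustration only; the abstract does not state the weights the authors actually use.

```python
from typing import Dict, List, Sequence


def weighted_soft_vote(probas: Dict[str, List[float]],
                       weights: Dict[str, float]) -> List[float]:
    """Return the weight-normalized average of per-model class probabilities."""
    total_weight = sum(weights[name] for name in probas)
    n_classes = len(next(iter(probas.values())))
    combined = [0.0] * n_classes
    for name, class_probs in probas.items():
        for k in range(n_classes):
            combined[k] += weights[name] * class_probs[k]
    return [value / total_weight for value in combined]


def predict_label(combined: Sequence[float],
                  labels: Sequence[str] = ("benign", "malignant")) -> str:
    """Pick the label whose combined probability is largest."""
    best_index = max(range(len(combined)), key=lambda k: combined[k])
    return labels[best_index]


# Hypothetical [P(benign), P(malignant)] outputs for one sample.
example_probas = {
    "RF": [0.30, 0.70],
    "ET": [0.40, 0.60],
    "SVM-RBF": [0.55, 0.45],
    "LR": [0.60, 0.40],
    "GB": [0.20, 0.80],
}

# Hypothetical weights; not taken from the paper.
example_weights = {"RF": 2.0, "ET": 1.0, "SVM-RBF": 2.0, "LR": 1.0, "GB": 2.0}

combined = weighted_soft_vote(example_probas, example_weights)
label = predict_label(combined)
print(combined, label)
```

In this made-up sample, SVM-RBF and LR lean toward benign, but the more heavily weighted tree-based models pull the combined probability toward malignant. In scikit-learn, the same rule is available through `VotingClassifier` with `voting="soft"` and a `weights` list, which may be how a system of this kind is built in practice.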
International Journal of Science, Engineering and Technology