Authors: T. T. Visali, D. Harini, S. Prathi
Abstract: Stroke is a leading global cause of mortality and long-term disability, yet the majority of strokes are preventable through early risk stratification and timely clinical intervention. This paper presents the design, implementation, and evaluation of a web-based stroke risk prediction system that integrates ensemble machine learning with a Flask-based clinical decision support interface. Five classification algorithms (Logistic Regression, Decision Tree, K-Nearest Neighbours, Support Vector Machine (SVM), and Random Forest) are trained and compared on the publicly available Kaggle stroke prediction dataset (n = 5,110 records, 11 clinical and demographic features). The dataset is heavily imbalanced, with 95.13% of records labelled non-stroke; this is addressed by applying the Synthetic Minority Over-sampling Technique (SMOTE) before model training. Random Forest performs best, with an accuracy of 88.7%, precision of 87.4%, recall of 83.2%, F1-score of 85.3%, and AUC-ROC of 0.918. The serialised model is deployed through a Flask web application that accepts eleven clinical inputs, runs inference in real time, and returns a binary stroke risk prediction together with its predicted probability. Comparative benchmarking against four published stroke prediction studies indicates that the proposed system achieves competitive accuracy and is the only implementation among the compared works to integrate both SMOTE-balanced ensemble modelling and a deployable web interface within a unified pipeline. The system is intended as a low-cost clinical decision-support tool for healthcare practitioners and risk-aware individuals in resource-limited settings.
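To illustrate the class-balancing step named in the abstract, the following is a minimal sketch of the core SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority record and one of its k nearest minority neighbours. It is not the authors' implementation. The function name `smote`, its parameters, and the toy values are assumptions made for illustration, and the sketch handles numeric features only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    Hypothetical helper for illustration; numeric features only.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy data with two numeric features; the values are made up, not taken
# from the Kaggle dataset.
minority = [[60.0, 200.0], [72.0, 110.0], [55.0, 170.0], [78.0, 150.0], [81.0, 190.0]]
n_majority = 95  # assumed majority-class count for this toy example

synthetic = smote(minority, n_new=n_majority - len(minority))
balanced_minority = minority + synthetic
print(len(balanced_minority), "minority samples after oversampling")
```

In practice, oversampling of this kind is usually applied to the training split only, so that synthetic records do not leak into the evaluation data; whether the paper does so is not stated in the abstract.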
International Journal of Science, Engineering and Technology