Authors: Mrs. V. Suvarna, Mallidi Mohana Sudha, Gunturi Satyasai Phani Amrutha Sri Varshini, Mungara Shakthi Indra Varma, Gangapatrula Ram Karthikeyan, Nalli Prema Chandhu
Abstract: Phishing attacks are among the most common cybersecurity threats: attackers create fraudulent websites that mimic legitimate platforms in order to steal sensitive information such as login credentials, financial data, and personal identity details. Traditional detection approaches, such as blacklist-based systems and manual verification, are often inefficient and cannot detect newly emerging phishing websites in real time. Intelligent, automated detection mechanisms are therefore required to protect users from online fraud. This study proposes a machine learning-based framework for detecting phishing websites using URL and domain-based features. The system uses a dataset of legitimate and phishing website URLs collected from publicly available repositories. Preprocessing is applied to clean and normalize the dataset, ensuring consistency and improving model performance. Five machine learning algorithms (Logistic Regression, Decision Tree, Random Forest, AdaBoost, and Gradient Boosting) are implemented and evaluated using stratified cross-validation so that each fold preserves the class balance and the reported results are reliable. Among the evaluated models, the ensemble learning algorithms perform best, owing to their ability to combine multiple base learners and reduce prediction errors. In particular, the Random Forest classifier achieves high detection accuracy by analyzing key URL characteristics such as domain name structure, prefix and suffix usage, DNS records, URL length, and IP address patterns.
The experimental results show that the ensemble model effectively distinguishes between legitimate and phishing websites with high accuracy, precision, recall, and F1-score. Furthermore, feature importance analysis is performed to identify the attributes that contribute most to phishing detection, which clarifies model behaviour and improves system transparency. The proposed framework provides a scalable and automated solution for detecting malicious websites, helping users identify fraudulent URLs before interacting with them. Overall, the framework enhances phishing detection capability, improves cybersecurity awareness, and offers an efficient tool for protecting users against online phishing attacks.
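To illustrate the kind of URL characteristics the abstract lists (URL length, IP address patterns, prefix and suffix usage in the domain name), the following is a minimal sketch of lexical feature extraction using only the Python standard library. The function name `extract_url_features`, the feature names, and the subdomain heuristic are assumptions made for illustration, not the feature set used in the study. DNS-record features are omitted because they require a network lookup.

```python
import ipaddress
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Derive a few simple lexical features from a URL string.

    Hypothetical helper: the feature names and definitions here are
    illustrative assumptions, not the study's actual feature set.
    """
    # urlparse only fills in the host when a scheme is present,
    # so assume http:// for bare inputs such as "example.com/login".
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""

    # A raw IP address in place of a domain name is a common phishing signal.
    try:
        ipaddress.ip_address(host)
        has_ip = True
    except ValueError:
        has_ip = False

    return {
        "url_length": len(url),
        "has_ip_address": has_ip,
        # A hyphen in the host often marks an added prefix or suffix,
        # as in "secure-login.example.com".
        "has_hyphen_in_domain": "-" in host,
        # Crude estimate: dots beyond the one separating name and TLD.
        # It miscounts multi-part suffixes such as "co.uk".
        "subdomain_count": 0 if has_ip else max(host.count(".") - 1, 0),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for sample in ("http://192.168.0.1/login", "https://secure-login.example.com/"):
        print(sample, extract_url_features(sample))
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to any of the classifiers named in the abstract; that step is not shown here.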
DOI:
International Journal of Science, Engineering and Technology