Enhanced phishing website detection leveraging random forest and xgboost algorithms with hybrid features| International Journal of Innovative Science and Research Technology

Enhanced Phishing Website Detection: Leveraging Random Forest and XGBoost Algorithms with Hybrid Features

Authors : Ashwini Bhavsar; Adarsh Waikar; Ayush Petkar; Seema Mane; Vishwatej Sarwale

Volume/Issue : Volume 8 - 2023, Issue 7 - July

Google Scholar : https://bit.ly/3TmGbDi

Scribd : https://tinyurl.com/3aw5bpxp

DOI : https://doi.org/10.5281/zenodo.8181388

Abstract : Phishing is a technique attackers use to trick internet users into disclosing private details such as login credentials, social security numbers (SSNs), and banking information. Attackers disguise a webpage as an official, legitimate website. Blacklist- or whitelist-based, heuristic, and visual similarity-based anti-phishing solutions are unable to detect zero-hour phishing attacks or newly created websites. Older methods are also more complex and unsuitable for day-to-day use because they rely on external sources such as search engines. As a result, detecting newly created phishing websites in real time remains a significant challenge in cybersecurity. This paper presents a hybrid feature-based anti-phishing approach that addresses these problems by extracting features solely from URL and hyperlink data, which are available on the client side. In addition, a new dataset is created for experiments with popular machine-learning classification techniques.

Keywords : Cybersecurity, Phishing Detection, Machine Learning, Hyperlink Feature, URL Feature, Anti-Phishing, XGBoost, Hybrid Feature.
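
To make the idea of client-side hybrid features concrete, the following is a minimal sketch of how URL and hyperlink features might be computed from a URL string and the page's HTML alone, without querying any external service. The specific features, the function names (`url_features`, `hyperlink_features`, `hybrid_features`), and the rules for what counts as an empty or external link are illustrative assumptions, not the feature set used in the paper.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse
import re


class _AnchorCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # A missing or valueless href is recorded as an empty string.
            self.hrefs.append(dict(attrs).get("href") or "")


def url_features(url):
    """Hypothetical lexical features computed from the URL string alone."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "has_at_symbol": int("@" in url),
        "has_hyphen_in_host": int("-" in host),
        "host_is_ipv4": int(bool(re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", host))),
        "uses_https": int(parsed.scheme == "https"),
    }


def _is_empty_link(href):
    """Assumed rule: blank, '#', and javascript: links lead nowhere."""
    h = href.strip().lower()
    return h in ("", "#") or h.startswith("javascript:")


def hyperlink_features(url, html):
    """Hypothetical features computed from the page's anchor tags."""
    page_host = urlparse(url).hostname or ""
    collector = _AnchorCollector()
    collector.feed(html)
    hrefs = collector.hrefs
    total = len(hrefs)
    empty = sum(1 for h in hrefs if _is_empty_link(h))
    # Relative links have no host of their own, so they count as internal.
    external = sum(
        1 for h in hrefs if (urlparse(h).hostname or page_host) != page_host
    )
    return {
        "num_links": total,
        "empty_link_ratio": empty / total if total else 0.0,
        "external_link_ratio": external / total if total else 0.0,
    }


def hybrid_features(url, html):
    """Merge URL and hyperlink features into one feature dictionary."""
    return {**url_features(url), **hyperlink_features(url, html)}
```

A dictionary like the one returned by `hybrid_features` could then be converted into a numeric vector and passed to a classifier such as random forest or XGBoost, the two algorithms named in the title. The training setup and dataset are not described in this abstract, so they are left out of the sketch.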
