Cyberbullying Detection on Twitter using Machine Learning: A Review


Authors : Salisu Suleiman; Prashansa Taneja; Ayushi Nainwal

Volume/Issue : Volume 7 - 2022, Issue 6 - June

Google Scholar : https://bit.ly/3IIfn9N

Scribd : https://bit.ly/3HWydLj

DOI : https://doi.org/10.5281/zenodo.6757912

Abstract : The internet has pervaded every part of human life, making it easier to connect individuals across the globe and to disseminate information to large groups of people. Despite its importance, the cyberworld has a number of negative effects on people today. One of its most dangerous threats is cyberbullying, which damages individuals' reputation or privacy, threatens or harasses them, and has a long-term impact on the victim. Although the issue has existed for many years, its impact on young people has only recently become widely recognized. Using machine learning and natural language processing, bullies' harassing tweets or offensive comments can be identified automatically. This paper reviews previous research in the cyberbullying detection domain and, more importantly, proposes a novel cyberbullying detection model to close the gap discovered during the review of the related literature. In this study, we employed both standard and ensemble supervised learning methods. The standard methods used three ML classifiers: Gaussian Naïve Bayes (GNB), Logistic Regression (LR), and Decision Tree (DT), while AdaBoost and Random Forest (RF) classifiers were used as ensemble techniques. We trained and tested our model to classify content as either bullying or non-bullying (a binary classification model), using Term Frequency-Inverse Document Frequency (TF-IDF) to extract features from a Twitter dataset downloaded from Kaggle.

Keywords : Cyberworld, Social Media, Machine Learning, Cyber Bullying.
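
The TF-IDF feature extraction step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which TF-IDF variant or tokenizer was used, so the version below assumes the textbook form (term frequency normalized by tweet length, idf = log(N / df)) and a whitespace tokenizer. The function names and the toy tweets are hypothetical and are not taken from the paper or its dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer (assumption)."""
    return text.lower().split()


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Assumes tf = count / document length and idf = log(N / df).
    Libraries often apply smoothing, so their values may differ.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy tweets, not from the paper's Kaggle dataset.
tweets = [
    "you are so stupid",
    "you are so kind",
    "have a nice day",
]
vectors = tfidf(tweets)

# Terms that appear in fewer tweets receive larger weights.
for term, weight in sorted(vectors[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

In this sketch a word that occurs in only one tweet ("stupid") is weighted more heavily than words shared across tweets ("you", "are", "so"). That is the property that makes TF-IDF vectors useful as input to the binary classifiers listed in the abstract. In practice a library implementation, for example scikit-learn's TfidfVectorizer, would likely be used in place of hand-written code.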

