Authors :
Karra.NAGA SIVA SURYA DILEEP; Krovvidi.KARTHIK SAI SRI RAMA RAJU; Karella.JOHNY; Kombathula.VENKAT; P.SRINU VASA RAO
Volume/Issue :
Volume 9 - 2024, Issue 2 - February
Google Scholar :
http://tinyurl.com/ys2thp5s
Scribd :
http://tinyurl.com/2umktn79
DOI :
https://doi.org/10.5281/zenodo.10725536
Abstract :
Spam, sometimes called junk email, is unsolicited
email that is typically sent to large lists of recipients.
Although real individuals can send spam, botnets
(networks of infected computers controlled by an
attacker, often called a "bot herder") are responsible
for much of it. Most people view spam as a problem, yet
many accept it as an unavoidable side effect of email
communication. In addition to being annoying, spam can
cause real harm: it clogs inboxes unless it is filtered
properly and deleted regularly.
Spammers often change their methods and content to
trick victims into downloading malware, sharing
personal information, or sending money. Most spam is
commercial in nature and financially motivated.
Spammers attempt to deceive recipients by making false
claims, selling questionable products, and promoting
false information.
Unwanted emails, such as phishing and spam, cost
businesses and individuals billions of dollars each year.
Many models and techniques for automatic spam
detection have been introduced and developed, but
none has yet achieved 100% accuracy. Among these
approaches, machine learning and deep learning
algorithms have been the most successful, and natural
language processing (NLP) techniques further improve
model accuracy. This study evaluates the effectiveness
of word embeddings in spam classification.
The pre-trained transformer model BERT
(Bidirectional Encoder Representations from
Transformers) is fine-tuned to distinguish spam from
non-spam (ham) messages. BERT uses attention layers to
interpret the content of a text in its context. The results
were compared with a baseline DNN (Deep Neural
Network) model consisting of a BiLSTM (Bidirectional
Long Short-Term Memory) layer and two dense layers.
Some of the most common spam topics are
pharmaceuticals, financial services, work-from-home
offers, pornography, online courses, and cryptocurrency.
Keywords :
Machine Learning, Natural Language Processing, Spam, Ham, Email, Naive Bayes, Logistic Regression.
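The keywords name Naive Bayes as one of the classical techniques for separating spam from ham. As a rough illustration of that idea only, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, written in plain Python. It is not the authors' implementation: the function names, the whitespace tokenizer, and the six toy messages are all assumptions made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    return text.lower().split()


def train_naive_bayes(samples):
    """Count documents and words per label from (text, label) pairs."""
    doc_counts = Counter()   # label -> number of training messages
    word_counts = {}         # label -> Counter of token frequencies
    vocab = set()            # every token seen during training
    for text, label in samples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return {"doc_counts": doc_counts, "word_counts": word_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior score for the text."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokenize(text):
            if tok not in model["vocab"]:
                continue  # ignore tokens never seen in training
            # Add-one smoothing keeps unseen (label, token) pairs from zeroing the score.
            score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, invented for illustration; not taken from the study.
TOY_DATA = [
    ("win free money now", "spam"),
    ("cheap pills online offer", "spam"),
    ("free cryptocurrency offer click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch with the project team", "ham"),
    ("please review the attached report", "ham"),
]

model = train_naive_bayes(TOY_DATA)
print(predict(model, "free money offer"))
print(predict(model, "project meeting report"))
```

A bag-of-words model like this treats each word independently and ignores word order, which is the limitation that the contextual embeddings discussed in the abstract are meant to address.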