Authors :
U Karthik; Srinidhi K S; Mithun Gowda H
Volume/Issue :
Volume 10 - 2025, Issue 2 - February
Google Scholar :
https://tinyurl.com/yhnbew7n
Scribd :
https://tinyurl.com/4cw4d7rn
DOI :
https://doi.org/10.5281/zenodo.14965910
Abstract :
Sentiment analysis of tweets that leverages emoticons is a recent development in natural language processing (NLP):
it uses the expressive potential of emoticons to determine the sentiment contained in textual data. Emoticons, which are
small visual representations of emotions, offer a natural way to improve understanding of the feelings expressed in
conversations, social media posts, and other informal text. The approach consists of several steps, beginning with data
preprocessing, which cleans and normalizes the text. Next, emoticon extraction finds emoticons and assigns them to
predetermined sentiment classes, such as positive and negative. The k-means clustering algorithm plays a central part in
this study: by grouping related emoticons and their accompanying text into clusters, it makes common sentiment patterns
easier to spot. K-means is an unsupervised learning algorithm that divides the dataset into k clusters according to
feature similarity so as to minimize the variance within each cluster. Using k-means clustering, the analysis can handle
large volumes of data and offers insights into prevailing sentiment trends and how they change over time.
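The preprocessing and emoticon extraction steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the emoticon lexicon, the cleaning rules (dropping URLs, @mentions, and the '#' sign), the +1/-1 scoring, and all function names are assumptions made for the example.

```python
import re

# Hypothetical mini-lexicon (an assumption for this sketch); a real study
# would rely on a much larger, validated emoticon list.
EMOTICON_SENTIMENT = {
    ":)": "positive", ":-)": "positive", ":D": "positive", ";)": "positive",
    ":(": "negative", ":-(": "negative", ":'(": "negative", ":/": "negative",
}

# Longer emoticons come first so ":-)" is tried before two-character forms.
_EMOTICON_RE = re.compile(
    "|".join(re.escape(e)
             for e in sorted(EMOTICON_SENTIMENT, key=len, reverse=True))
)
_URL_RE = re.compile(r"https?://\S+")
_MENTION_RE = re.compile(r"@\w+")


def analyze_tweet(tweet):
    """Return (cleaned_text, emoticons, emoticon_score) for one tweet."""
    # Remove URLs and mentions first; otherwise "https://" would be
    # misread as containing the ":/" emoticon.
    text = _MENTION_RE.sub(" ", _URL_RE.sub(" ", tweet))

    # Emoticon extraction: collect emoticons before lowercasing (":D").
    emoticons = _EMOTICON_RE.findall(text)

    # Text normalization: drop emoticons and '#', lowercase, squeeze spaces.
    text = _EMOTICON_RE.sub(" ", text).replace("#", "")
    cleaned = " ".join(text.lower().split())

    # Assumed scoring rule: +1 per positive emoticon, -1 per negative one.
    score = sum(1 if EMOTICON_SENTIMENT[e] == "positive" else -1
                for e in emoticons)
    return cleaned, emoticons, score
```

For example, analyze_tweet("Loving the new phone :) :D @shop https://t.co/abc #Happy") yields the cleaned text "loving the new phone happy", the emoticons [":)", ":D"], and a score of 2. The cleaned text and the score could then serve as inputs to feature construction for clustering.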
Additionally, by helping to capture contextual subtleties, the combination of clustering and NLP techniques improves
sentiment classification accuracy. The resulting clusters provide a detailed view of the sentiment landscape, which
supports tasks such as market analysis, customer feedback evaluation, and social media monitoring. In essence,
emoticon-based sentiment analysis that combines NLP with k-means clustering provides a strong foundation for extracting
and analysing sentiment, supporting better decision-making and a deeper understanding of the audience.
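To make the clustering step concrete, the following is a minimal pure-Python sketch of k-means (Lloyd's algorithm). It is an illustration under stated assumptions rather than the method used in the study: the two-dimensional feature vectors (for instance an emoticon score paired with a text-based score), the deterministic farthest-point initialization, and the function names are all choices made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Cluster points into k groups; returns (labels, centroids)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")

    # Assumed initialization: start from the first point, then repeatedly
    # add the point farthest from the centroids chosen so far.
    centroids = [tuple(points[0])]
    while len(centroids) < k:
        farthest = max(
            points,
            key=lambda p: min(squared_distance(p, c) for c in centroids),
        )
        centroids.append(tuple(farthest))

    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # no assignment changed: converged
            break
        labels = new_labels

        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


def within_cluster_variance(points, labels, centroids):
    """Sum of squared distances from each point to its own centroid."""
    return sum(
        squared_distance(p, centroids[lab]) for p, lab in zip(points, labels)
    )
```

The quantity returned by within_cluster_variance is the objective that k-means reduces from one iteration to the next. On real tweet data a library implementation with several random restarts would usually be preferable, since Lloyd's algorithm only finds a local minimum.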
Keywords :
Machine Learning, Sentiment Analysis, Tweet Classification, Natural Language Processing (NLP), Text Preprocessing, Tokenization, Noise Reduction, Emoticon Extraction, Sentiment Score Assignment, BiLSTM, DistilBERT, Textual Sentiment, Emoticon-Based Sentiment.