Identification of Characteristics of Covid-19 Infection Using the K-Means Clustering Method


Authors : Rina Fitriana; Yanto; Isdaryanto Iskandar

Volume/Issue : Volume 7 - 2022, Issue 9 - September

Google Scholar : https://bit.ly/3IIfn9N

Scribd : https://bit.ly/3SEtgLr

DOI : https://doi.org/10.5281/zenodo.7117881

Abstract : 2019 Novel Coronavirus (2019-nCoV) is a coronavirus identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the Wuhan outbreak reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating that person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably the virus spreads between people. The purpose of this study is to identify the types of data available on the COVID-19 outbreak. Datasets on the infection were compiled from the outbreak in several areas around the first identified cases, using the following criteria: reporting date, location, country, gender, and age. The study evaluates how the data can be grouped by similar characteristics so that reports on this new virus can be characterized. Using those criteria, the data are analyzed with a clustering method, specifically k-means clustering. K-means groups the records by their similarity to one another in order to visualize the unlabeled COVID-19 data. The COVID-19 infection data used in the study were obtained from Kaggle. The data mining design uses K-means clustering.

Keywords : COVID-19, Data Mining, Clustering, K-means
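
As an illustration of the grouping step described in the abstract, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the authors' implementation. The sample records, the `day` field (days since the first report, standing in for the reporting date), and the helper names `min_max_scale` and `kmeans` are assumptions made for this example. Categorical criteria such as location, country, and gender would first need a numeric encoding, which is omitted here.

```python
import random

# Hypothetical records (invented for illustration, not taken from the dataset).
# "day" stands in for the reporting date as days since the first report.
records = [
    {"age": 25, "day": 1},
    {"age": 30, "day": 2},
    {"age": 28, "day": 3},
    {"age": 65, "day": 20},
    {"age": 70, "day": 22},
    {"age": 72, "day": 21},
]


def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single feature dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return [
        tuple((value - low) / span for value, low, span in zip(row, lows, spans))
        for row in rows
    ]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


features = min_max_scale([(r["age"], r["day"]) for r in records])
labels, centroids = kmeans(features, k=2)

for record, label in zip(records, labels):
    print(record, "-> cluster", label)
```

The choice of `k=2` is arbitrary here; in practice the number of clusters would be selected by inspecting the data, for example with an elbow plot of within-cluster distances.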

