Authors :
Rina Fitriana; Yanto; Isdaryanto Iskandar
Volume/Issue :
Volume 7 - 2022, Issue 9 - September
Google Scholar :
https://bit.ly/3IIfn9N
Scribd :
https://bit.ly/3SEtgLr
DOI :
https://doi.org/10.5281/zenodo.7117881
Abstract :
2019 Novel Coronavirus (2019-nCoV) is a
virus (more specifically, a coronavirus) identified as the
cause of an outbreak of respiratory illness first detected
in Wuhan, China. Early on, many of the patients in the
Wuhan outbreak reportedly had some link to a
large seafood and animal market, suggesting animal-to-person
spread. However, a growing number of patients
reportedly have not had exposure to animal markets,
indicating that person-to-person spread is occurring. At this
time, it is unclear how easily or sustainably the virus
spreads between people. The purpose of this study is to
identify the type of data on the COVID-19 outbreak.
Based on the outbreak of COVID-19 in several areas
around the first identified cases, infection datasets
have been compiled according to several criteria: reporting
date, location, country, gender, and age. The study
evaluates how the data can be grouped by similar
characteristics, so that reports of this new virus
can be characterized. Using those criteria, the data
are analyzed with a clustering method, specifically
k-means clustering. K-means groups the records by their
similarity to one another, for the purpose of visualizing
the unlabeled COVID-19 data. The data on COVID-19 virus
infection used in the study were obtained from Kaggle.
The data mining design uses K-means clustering.
Keywords :
COVID-19, Data Mining, Clustering, K-means
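
The grouping step described in the abstract can be illustrated with a minimal sketch of standard k-means (Lloyd's algorithm). This is not the authors' implementation. The `kmeans` helper, the choice of k = 2, the numeric encoding of gender, and the six sample records are all hypothetical and are not taken from the Kaggle dataset.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    # Start from k distinct records chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each record goes to its nearest centroid
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Stop once the assignments no longer change.
        if new_labels == labels:
            break
        labels = new_labels
    return centroids, labels


# Hypothetical records: (age, gender encoded as 0 or 1). Illustrative only.
records = [(22, 0), (25, 1), (27, 0), (61, 1), (65, 0), (70, 1)]
centroids, labels = kmeans(records, k=2)

for record, label in zip(records, labels):
    print(record, "-> cluster", label)
```

Because age spans a much larger numeric range than the encoded gender, it dominates the distance here. With real data, scaling the features before clustering would likely be needed so that no single criterion decides the grouping on its own.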