Authors :
Dilushinie Narmada Fernando; Dr. Lakmal Rupasinghe
Volume/Issue :
Volume 7 - 2022, Issue 3 - March
Google Scholar :
https://bit.ly/3IIfn9N
Scribd :
https://bit.ly/3tRuIAo
DOI :
https://doi.org/10.5281/zenodo.6395394
Abstract :
When protecting an organization's information, professionals
treat the confidentiality and sensitivity of the data as a
major concern. In practice this is a manual process in which
data owners and other professionals classify data according to
their own ideas, decisions, and expectations. Because the
classification depends on human judgment, sensitive data can be
exposed to users who are not authorized to access or alter it.
This research aims to reduce human involvement in data
classification decisions by dividing documents into clusters
according to their level of confidentiality. The system divides
documents into three major categories (confidential, sensitive,
and public) using the self-organizing map, an unsupervised
artificial neural network originally designed to cluster
high-dimensional data samples by projecting them onto a
low-dimensional map.
Keywords :
Information Technology, Intellectual Property, Self-Organizing Map, Information Retrieval, Statistical Natural Language Processing
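
The clustering idea named in the abstract can be illustrated with a minimal sketch of a one-dimensional self-organizing map with three nodes. This is not the authors' implementation: the function names (`train_som`, `best_matching_unit`), the hyperparameters, and the toy document vectors are all assumptions made for illustration, and the abstract does not say how documents are turned into feature vectors.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )


def train_som(samples, n_nodes=3, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM (a line of n_nodes) on a list of feature vectors.

    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = samples[:]          # copy so the caller's list is not reordered
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                      # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 1e-3)     # shrinking neighbourhood
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood: nodes near the winner move more.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Hypothetical normalized term weights for six documents over a made-up
# four-term vocabulary; real input would come from a text-vectorization step.
docs = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.0, 0.1],
    [0.1, 0.2, 0.9, 0.1],
    [0.0, 0.1, 0.8, 0.2],
    [0.1, 0.0, 0.1, 0.9],
    [0.0, 0.1, 0.2, 0.8],
]

weights = train_som(docs)
clusters = [best_matching_unit(weights, d) for d in docs]
print(clusters)
```

Each document is assigned to the node it is closest to after training. The map itself is unsupervised, so the node indices carry no meaning on their own; deciding which node corresponds to confidential, sensitive, or public data would need a separate labeling step that this sketch does not attempt.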