Authors :
Ravi Prasad Ravuri
Volume/Issue :
Volume 8 - 2023, Issue 6 - June
Google Scholar :
https://bit.ly/3TmGbDi
Scribd :
https://tinyurl.com/swrxmcfv
DOI :
https://doi.org/10.5281/zenodo.8146672
Abstract :
Text documents on the Internet, on social media, and in the
internal applications of organizations such as the judiciary
are growing exponentially. Manually inspecting and classifying
such documents for further processing is tedious, so automatic
text document classification is needed. Traditional
heuristics-based approaches do not scale to the volume of
input documents. Machine learning (ML) techniques address this
problem because they learn from training data to perform
classification and can handle large corpora. However, when
existing ML models are used directly, their performance
deteriorates because of insufficient training quality. In this
paper we propose a framework that takes a hybrid approach,
combining feature selection with ML models to improve
prediction performance. The framework is named the Learning
based Text Document Classification Framework (LbTDCF). We also
propose an algorithm, the Intelligent Document Classification
Algorithm (IDCA), to realize the framework.
Keywords :
Machine Learning, Text Document Classification, Supervised Learning, Intelligent Document Classification
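
The abstract describes a hybrid of feature selection and ML models but does not say which selector or classifier is used. The sketch below is therefore only an illustration of that general idea, not the authors' IDCA: it assumes chi-square term scoring as the feature selector and a multinomial naive Bayes classifier as the ML model. All function names (`tokenize`, `chi2_scores`, `train_nb`, `predict`) and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for this toy example.
    return text.lower().split()

def chi2_scores(docs, labels):
    """Score each term by its chi-square statistic against the labels (max over classes)."""
    n = len(docs)
    class_counts = Counter(labels)
    term_class = defaultdict(Counter)  # term -> class -> docs containing the term
    term_docs = Counter()              # term -> docs containing the term
    for tokens, label in zip(docs, labels):
        for term in set(tokens):
            term_class[term][label] += 1
            term_docs[term] += 1
    scores = {}
    for term, per_class in term_class.items():
        best = 0.0
        for cls, n_cls in class_counts.items():
            a = per_class[cls]           # in class, term present
            b = term_docs[term] - a      # other classes, term present
            c = n_cls - a                # in class, term absent
            d = n - a - b - c            # other classes, term absent
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[term] = best
    return scores

def train_nb(docs, labels, vocab):
    """Multinomial naive Bayes with add-one smoothing, restricted to the selected vocabulary."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        for term in tokens:
            if term in vocab:
                term_counts[label][term] += 1
    model = {}
    for cls, n_cls in class_counts.items():
        total = sum(term_counts[cls].values())
        log_prior = math.log(n_cls / len(docs))
        log_like = {t: math.log((term_counts[cls][t] + 1) / (total + len(vocab)))
                    for t in vocab}
        model[cls] = (log_prior, log_like)
    return model

def predict(model, tokens):
    """Return the class with the highest log-posterior; unselected terms are ignored."""
    def score(cls):
        log_prior, log_like = model[cls]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)

# Hypothetical toy corpus, for illustration only.
train_texts = [
    "court issued the judgment on appeal",
    "the judge dismissed the appeal in court",
    "team won the match in the final",
    "the striker scored in the match",
]
train_labels = ["legal", "legal", "sports", "sports"]

docs = [tokenize(t) for t in train_texts]
scores = chi2_scores(docs, train_labels)
# Keep the 4 highest-scoring terms; ties are broken alphabetically.
vocab = set(sorted(scores, key=lambda t: (-scores[t], t))[:4])
model = train_nb(docs, train_labels, vocab)
prediction = predict(model, tokenize("the court heard the appeal"))
print(sorted(vocab), prediction)
```

In this toy setup a term such as "the", which appears in every document, receives a chi-square score of zero and is dropped before training, which is the kind of noise reduction a feature-selection stage is meant to provide. The choice of selector, classifier, and vocabulary size here is arbitrary and would need to be replaced by whatever the full paper specifies.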