Large amounts of text data are generated on a
daily basis through social media posts, reviews, emails,
blogs, search queries and similar sources. Most of this text
is unstructured. To make sense of such a large amount of
text, we need keyword extraction, which obtains the
important words (keywords) or important phrases
(key phrases) without our having to read through all of
the text ourselves.
However, over the years it has proven quite difficult
to extract keywords from short text (text spanning
one or two sentences), and many of the traditional
methods, such as classification, RAKE, TextRank and
TF-IDF, have been found to be less effective than
desired. In this paper, we compare the traditional
methods and propose a new neural network based
algorithm, namely a sequence-to-sequence
encoder-decoder model, for better keyword extraction
from short text.
We conduct a preliminary application-based
investigation on a set of sentences, which indicates the
superiority of the neural network method and can also
form the basis of future research.
Keywords: extraction, text, data, short text, neural network, NLP