Page 141 - Handbook of Deep Learning in Biomedical Engineering Techniques and Applications
130 Chapter 5 Depression discovery in cancer communities using deep learning
perform SA with a reliable degree of accuracy. Second is the pres-
ence of noisy samples in the training data, which may hinder the
performance of SA.
Authors in Ref. [37] examine the performance of several ML
algorithms, including NB, SVM, J48, and sequential minimal
optimization (SMO), on Twitter data. They conclude that classifying
Twitter data is more complex than other text classification tasks.
SMO, SVM, and random forest display acceptable performance,
but the NB classifier performs poorly. Moreover, the study makes no
provision for handling negation words such as no, not, and never.
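One common way to make a bag-of-words classifier aware of negation is to tag every token that follows a negator until the next punctuation mark. The sketch below illustrates that idea only; it is not the method of Ref. [37], and the function name, the `_NEG` suffix, and the punctuation-based scope rule are assumptions made for the example.

```python
import re

# The three negation words named in the text; a real list would be longer.
NEGATORS = {"no", "not", "never"}
# Assumption: a negation's scope ends at the next punctuation mark.
CLAUSE_END = re.compile(r"[.,;:!?]")


def mark_negation(text):
    """Append _NEG to tokens that follow a negator, up to the next punctuation."""
    tokens = re.findall(r"[a-z']+|[.,;:!?]", text.lower())
    out, negated = [], False
    for tok in tokens:
        if CLAUSE_END.fullmatch(tok):
            negated = False          # punctuation closes the negation scope
            out.append(tok)
        elif tok in NEGATORS:
            negated = True           # the negator itself is left unmarked
            out.append(tok)
        else:
            out.append(tok + "_NEG" if negated else tok)
    return out


print(mark_negation("I do not like this phone, but the camera is good."))
```

With this marking, "like" and "like_NEG" become distinct features, so a classifier can learn opposite weights for them.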
In Ref. [38], the authors use pointwise mutual information
(PMI) to improve SA and determine top-n nouns and noun
phrases as extended objects. They further consider tweets articu-
lating positive or negative sentiments toward the extended objects
as positive or negative about the object. They use a dependency
parser to generate a feature set for classification. A dependency
parser analyzes sentence structure, establishing relations
between head (root) words and the words that modify them. They use
SVM for subjectivity classification. Their approach significantly
outperforms the previous target-independent classifiers. However,
they are not able to deal with sarcastic text.
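PMI compares how often two terms occur together with how often they would co-occur if they were independent: PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ). The following is a minimal sketch of ranking terms by PMI with a target word, assuming document-level co-occurrence over tokenized tweets. The function names and the toy data are hypothetical, and the sketch does not restrict candidates to nouns and noun phrases as Ref. [38] does.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(docs):
    """Return {(a, b): PMI} for word pairs that co-occur in at least one document.

    Probabilities are estimated as document frequencies divided by len(docs).
    Pair keys are alphabetically ordered tuples.
    """
    n = len(docs)
    word_df, pair_df = Counter(), Counter()
    for doc in docs:
        words = sorted(set(doc))                 # count each word once per document
        word_df.update(words)
        pair_df.update(combinations(words, 2))
    return {
        (a, b): math.log2((c / n) / ((word_df[a] / n) * (word_df[b] / n)))
        for (a, b), c in pair_df.items()
    }


def top_related(target, docs, n=3):
    """Return up to n words with the highest PMI with respect to target."""
    scores = {}
    for (a, b), value in pmi_table(docs).items():
        if target in (a, b):
            scores[b if a == target else a] = value
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical tokenized tweets.
tweets = [["iphone", "screen"], ["iphone", "screen"], ["iphone", "battery"], ["weather"]]
print(pmi_table(tweets)[("iphone", "screen")])
```

A PMI of zero means the two words co-occur exactly as often as chance predicts; positive values indicate association. Raw PMI favors rare words, so practical systems usually add a minimum-frequency cutoff.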
Authors in Ref. [39] propose a hierarchical classifier, which cat-
egorizes blogs into six emotional types and one nonemotional
type using SVM. At the first level, the classification is into
emotional and nonemotional classes. Then the emotional instances
are classified by polarity (positive or negative). Finally,
the third level classifies instances into emotions. The major
drawback of their approach is that they use the same feature sets
and classification methods for each task.
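The three-level cascade can be sketched as a chain of independent classifiers, each consulted only when the previous level routes an instance to it. The code below is an illustration under stated assumptions, not the system of Ref. [39]: the keyword-based stand-ins replace the SVMs, and the emotion labels and lexicons are hypothetical placeholders.

```python
def classify_hierarchical(text, is_emotional, polarity, emotion_models):
    """Three-level cascade: emotional or not, then polarity, then emotion.

    is_emotional:   callable, text -> bool                        (level 1)
    polarity:       callable, text -> "positive" or "negative"    (level 2)
    emotion_models: dict, polarity label -> callable, text -> str (level 3)
    """
    if not is_emotional(text):
        return "nonemotional"
    return emotion_models[polarity(text)](text)


# Toy stand-ins for trained classifiers (hypothetical lexicons and labels).
POS_WORDS = {"happy": "joy", "love": "joy"}
NEG_WORDS = {"sad": "sadness", "furious": "anger"}


def toy_is_emotional(text):
    return any(w in POS_WORDS or w in NEG_WORDS for w in text.lower().split())


def toy_polarity(text):
    return "positive" if any(w in POS_WORDS for w in text.lower().split()) else "negative"


def toy_emotion(lexicon):
    """Build a level-3 classifier that returns the label of the first lexicon hit."""
    def predict(text):
        for w in text.lower().split():
            if w in lexicon:
                return lexicon[w]
        return "unknown"
    return predict


MODELS = {"positive": toy_emotion(POS_WORDS), "negative": toy_emotion(NEG_WORDS)}

for post in ["the meeting is at noon", "i am so happy today", "he was furious"]:
    print(post, "->", classify_hierarchical(post, toy_is_emotional, toy_polarity, MODELS))
```

Because each level is a separate callable, each one could in principle use its own features and learning method, which is the flexibility the drawback noted above points to.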
In Ref. [40], the authors develop a system for identifying and
retrieving hotel reviews on the web. They perform SA on the re-
views and also generate summaries of the remarks obtained.
These summaries may be used by the hotel management for qual-
ity control. They show that further research is necessary for
demarcation of neutral text as well as for handling multitopic
segments.
Authors in Ref. [41] apply a hierarchical classification approach
similar to Ref. [39] for sentiment and emotion analysis of Twitter
posts pertaining to the 2011 season of the Brazilian Soccer League.
They conclude that the classification accuracy could be greatly
improved by incorporating additional features derived from
emotional lexicons.
Authors in Ref. [42] use three ML algorithms, namely NB, SVM,
and linear regression, with unigram, bigram, 3-gram, and 4-gram
features, and evaluate the features to