Page 141 - Handbook of Deep Learning in Biomedical Engineering Techniques and Applications


130 Chapter 5 Depression discovery in cancer communities using deep learning

perform SA with a reliable degree of accuracy. Second is the pres-
ence of noisy samples in the training data, which may hinder the
performance of SA.
The authors of Ref. [37] examine the performance of several ML
algorithms on Twitter data, including NB, SVM, J48, and sequential
minimal optimization (SMO). They conclude that classifying Twitter
data is more complex than classifying other text. SMO, SVM, and
random forest display acceptable performance, but the NB classifier
performs poorly. Moreover, their approach makes no provision for
handling negation words such as no, not, and never.
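To illustrate what a provision for negation can look like, the following minimal sketch marks every token that follows a negation word, up to the next punctuation mark, with a _NEG suffix. This is one commonly used heuristic and is not taken from Ref. [37]; the function name mark_negation, the suffix, and the tokenization rule are assumptions made for illustration only.

```python
import re

# Negation words named in the text; a real system would use a longer list.
NEGATIONS = {"no", "not", "never"}
CLAUSE_END = re.compile(r"[.,;:!?]")


def mark_negation(text):
    """Append _NEG to each token between a negation word and the
    next clause-ending punctuation mark (illustrative heuristic)."""
    tokens = re.findall(r"[a-z']+|[.,;:!?]", text.lower())
    marked, in_scope = [], False
    for tok in tokens:
        if CLAUSE_END.fullmatch(tok):
            in_scope = False          # punctuation closes the negation scope
            marked.append(tok)
        elif tok in NEGATIONS:
            in_scope = True           # start marking the tokens that follow
            marked.append(tok)
        else:
            marked.append(tok + "_NEG" if in_scope else tok)
    return marked


print(mark_negation("I do not like this phone, but the screen is good."))
```

With this marking, "like" and "like_NEG" become distinct features, so a bag-of-words classifier can learn opposite weights for them. Contractions such as "don't" are not covered by this sketch.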
In Ref. [38], the authors use pointwise mutual information
(PMI) to improve SA and to determine the top-n nouns and noun
phrases as extended objects. They further treat tweets that express
positive or negative sentiment toward an extended object as
positive or negative about the object itself. They use a dependency
parser to generate a feature set for classification. A dependency
parser analyzes sentence structure, establishing relations between
root words and the words that modify them. They use SVM for
subjectivity classification. Their approach significantly
outperforms the previous target-independent classifiers. However,
it is not able to deal with sarcastic text.
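For reference, PMI in its standard form is PMI(x, y) = log2( p(x, y) / (p(x) p(y)) ), which is positive when x and y co-occur more often than independence would predict. The sketch below estimates it from document-level co-occurrence counts. It shows the general measure only; how Ref. [38] estimates the probabilities is not stated here, so the counting scheme, the function name pmi_scores, and the toy documents are all assumptions.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(documents):
    """Return base-2 PMI for every word pair that co-occurs in at
    least one document, using document-level presence counts."""
    n_docs = len(documents)
    word_counts = Counter()
    pair_counts = Counter()
    for doc in documents:
        words = sorted(set(doc.lower().split()))   # presence, not frequency
        word_counts.update(words)
        pair_counts.update(combinations(words, 2))  # pairs in sorted order
    scores = {}
    for (x, y), n_xy in pair_counts.items():
        p_xy = n_xy / n_docs
        p_x = word_counts[x] / n_docs
        p_y = word_counts[y] / n_docs
        scores[(x, y)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Hypothetical toy corpus, for illustration only.
docs = ["iphone battery", "iphone screen", "iphone battery", "weather nice"]
for pair, score in sorted(pmi_scores(docs).items()):
    print(pair, round(score, 3))
```

Ranking candidate nouns by their PMI with a target word and keeping the n highest-scoring ones is one plausible way to obtain a top-n list of related terms.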
The authors of Ref. [39] propose a hierarchical classifier that
uses SVM to categorize blogs into six emotional types and one
nonemotional type. At the first level, instances are classified as
emotional or nonemotional. At the second level, the emotional
instances are classified by polarity: positive or negative. Finally,
the third level classifies them into emotions. The major drawback
of their approach is that it uses the same feature sets and
classification methods for each task.
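The three-level cascade described above can be sketched as follows. This is only a structural illustration: Ref. [39] trains an SVM at each level, whereas the three stand-in functions here are keyword rules over a hypothetical toy lexicon, and the emotion labels are placeholders rather than the six types used in that work.

```python
# Hypothetical toy lexicon: word -> (polarity, emotion). Placeholder labels.
TOY_LEXICON = {
    "happy": ("positive", "joy"),
    "delighted": ("positive", "joy"),
    "sad": ("negative", "sadness"),
    "furious": ("negative", "anger"),
}


def _hits(text):
    """Lexicon entries for the words of the text, in order of appearance."""
    return [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]


def is_emotional(text):
    # Level 1 stand-in: emotional if any lexicon word occurs.
    return bool(_hits(text))


def polarity_of(text):
    # Level 2 stand-in: polarity of the first lexicon word found.
    return _hits(text)[0][0]


def emotion_of(text, polarity):
    # Level 3 stand-in: first emotion consistent with the level-2 polarity.
    return [e for p, e in _hits(text) if p == polarity][0]


def classify_hierarchically(text, level1, level2, level3):
    """Run the levels in order; nonemotional text stops at level 1."""
    if not level1(text):
        return {"emotional": False, "polarity": None, "emotion": None}
    polarity = level2(text)
    return {"emotional": True, "polarity": polarity,
            "emotion": level3(text, polarity)}


print(classify_hierarchically("I am so happy today",
                              is_emotional, polarity_of, emotion_of))
```

Because each level is passed in as a separate function, each could in principle use its own features and model, which addresses the drawback noted for Ref. [39].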
In Ref. [40], the authors develop a system for identifying and
retrieving hotel reviews from the web. They perform SA on the
reviews and also generate summaries of the remarks obtained.
These summaries may be used by the hotel management for quality
control. They show that further research is necessary for
demarcating neutral text as well as for handling multitopic
segments.
The authors of Ref. [41] apply a hierarchical classification
approach similar to that of Ref. [39] for sentiment and emotion
analysis of Twitter posts pertaining to the 2011 season of the
Brazilian Soccer League. They conclude that the classification
accuracy could be greatly improved by incorporating additional
features derived from emotional lexicons.
The authors of Ref. [42] use three ML algorithms, namely NB, SVM,
and linear regression, implement unigram, bigram, 3-gram, and
4-gram features, and evaluate the features to
