Page 139 - Handbook of Deep Learning in Biomedical Engineering Techniques and Applications
128 Chapter 5 Depression discovery in cancer communities using deep learning
presence or absence of one or more words. The data space is
partitioned recursively until the leaf nodes contain a specified
minimum number of records, which are then used for
classification.
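As an illustration, the recursive partitioning described above can be sketched in a few lines of Python. This is a minimal sketch, not a method taken from the chapter: the Gini impurity criterion, the `min_leaf` parameter (assumed to be at least 1), the function names, and the toy records are all assumptions introduced for the example.

```python
from collections import Counter

def majority(records):
    """Most common label among (tokens, label) records."""
    return Counter(label for _, label in records).most_common(1)[0][0]

def gini(records):
    """Gini impurity of the labels in a non-empty list of records."""
    counts = Counter(label for _, label in records)
    n = len(records)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def build_tree(records, min_leaf=2):
    """Split on presence/absence of a word until leaves are pure or too small."""
    labels = {label for _, label in records}
    if len(records) < 2 * min_leaf or len(labels) == 1:
        return majority(records)  # leaf: predict the majority label
    best = None
    for word in sorted({w for tokens, _ in records for w in tokens}):
        present = [r for r in records if word in r[0]]
        absent = [r for r in records if word not in r[0]]
        if len(present) < min_leaf or len(absent) < min_leaf:
            continue  # this split would create a leaf that is too small
        score = (len(present) * gini(present)
                 + len(absent) * gini(absent)) / len(records)
        if best is None or score < best[0]:
            best = (score, word, present, absent)
    if best is None:
        return majority(records)
    _, word, present, absent = best
    return (word, build_tree(present, min_leaf), build_tree(absent, min_leaf))

def classify(tree, tokens):
    """Walk the tree: internal nodes are (word, if_present, if_absent) tuples."""
    while isinstance(tree, tuple):
        word, if_present, if_absent = tree
        tree = if_present if word in tokens else if_absent
    return tree

# Hypothetical toy records: (set of words, sentiment label).
DATA = [
    ({"great", "movie"}, "pos"),
    ({"great", "acting"}, "pos"),
    ({"boring", "movie"}, "neg"),
    ({"boring", "plot"}, "neg"),
]
tree = build_tree(DATA, min_leaf=2)
print(tree)                                # ('boring', 'neg', 'pos')
print(classify(tree, {"boring", "film"}))  # neg
```

Each internal node tests a single word, and recursion stops once a node cannot be split without producing a leaf smaller than `min_leaf`.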
2.2.2 Sentiment analysis using supervised machine learning
Apart from the lexicon-based approaches, supervised learning
approaches for SA are also very popular. Supervised ML approaches
require labeled training data sets, from which a model is trained
to make predictions. Some of the supervised ML approaches that
have been used for SA are outlined in Table 5.2.
The authors of Ref. [32] extract sentiment from movie reviews
using three ML algorithms: NB, MaxEnt, and SVM. They analyze the
factors that make SA more challenging than other classification
problems and identify the need to handle sarcasm and
coreferences, for which their approach makes no provision.
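To make the supervised setting concrete, the following is a minimal sketch of one of the classifiers named above, a multinomial NB model with add-one smoothing, written in plain Python. It is not the implementation used in Ref. [32]; the function names, the whitespace tokenizer, and the four toy reviews are assumptions made for illustration, and MaxEnt and SVM are not shown.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Collect the counts a multinomial NB model needs from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # log prior
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            score += math.log((word_counts[label][token] + 1)
                              / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical labeled training data.
TRAIN = [
    ("a wonderful and moving film", "pos"),
    ("great acting and a great story", "pos"),
    ("a dull and boring film", "neg"),
    ("terrible acting and a weak story", "neg"),
]
MODEL = train_nb(TRAIN)
print(predict_nb(MODEL, "a great film"))  # pos
print(predict_nb(MODEL, "dull story"))    # neg
```

A bag-of-words model of this kind treats each word independently, which is one reason why sarcasm and coreferences are out of its reach.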
In Ref. [33], the authors use NB, SVM, and k-NN (k-nearest
neighbor) classifiers with suitable feature selection and
reduction schemes for SA of customer feedback data. They show
that linear SVMs achieve high classification accuracy on data
that are difficult even for human annotators to analyze. However,
they have no mechanism in place for dealing with sarcasm,
coreferences, and negations.
The authors of Ref. [34] introduce an ML method that performs SA
on only the subjective, or sentiment-bearing, portions of a
document. They determine these portions, also called the
subjectivity extracts, using techniques for finding minimum cuts
in graphs. They then incorporate contextual information, mainly
the proximity of sentences to one another, to improve this
method. They show that SA using the NB classifier on the
subjectivity extracts gives better results than SA of the
original documents. This suggests that the subjectivity extracts
are shorter and cleaner representations of polarity, suitable for
use in SA systems. The main limitation of their approach is that
it uses only one contextual cue, namely sentence proximity.
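The idea of combining per-sentence scores with a proximity cue can be sketched as a small cost-minimization problem. The sketch below is an assumption-laden illustration rather than the procedure of Ref. [34]: the cost function, the `proximity` weights, and the subjectivity scores are hypothetical, and an exhaustive search over labelings stands in for a real minimum-cut solver (it is practical only for a handful of sentences).

```python
from itertools import product

def proximity(i, j, strength=0.3, max_dist=2):
    """Hypothetical association weight: nearby sentences prefer the same label."""
    d = abs(i - j)
    return strength / (d * d) if 0 < d <= max_dist else 0.0

def cut_cost(labels, subj_prob):
    """Cost of one labeling; labels[i] is True if sentence i is kept as subjective."""
    cost = 0.0
    for i, keep in enumerate(labels):
        # Penalty for disagreeing with the per-sentence subjectivity score.
        cost += (1.0 - subj_prob[i]) if keep else subj_prob[i]
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if labels[i] != labels[j]:
                cost += proximity(i, j)  # penalty for separating neighbors
    return cost

def subjectivity_extract(subj_prob):
    """Return indices of sentences labeled subjective in the cheapest labeling."""
    best = min(product([True, False], repeat=len(subj_prob)),
               key=lambda labels: cut_cost(labels, subj_prob))
    return [i for i, keep in enumerate(best) if keep]

# Hypothetical per-sentence subjectivity scores (e.g., from an NB classifier).
SCORES = [0.9, 0.45, 0.8, 0.1, 0.2]
print([i for i, p in enumerate(SCORES) if p > 0.5])  # [0, 2]
print(subjectivity_extract(SCORES))                  # [0, 1, 2]
```

In this toy input, thresholding each sentence alone drops sentence 1, whereas the proximity term pulls it into the extract because both of its neighbors are strongly subjective.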
In Refs. [35] and [36], the authors use distant supervision. They
use emoticons to curate a labeled training data set, labeling
texts that contain “:)” as positive and those that contain “:(”
as negative. Next, they train a supervised classifier on the
training sets obtained in this manner. There are two major
limitations of their approach.
The first is the need for more author-annotated text with emoticons to