Page 139 - Handbook of Deep Learning in Biomedical Engineering Techniques and Applications

128   Chapter 5 Depression discovery in cancer communities using deep learning




                                    presence or absence of one or more words. The data space is
                                    partitioned recursively until each leaf node contains some
                                    minimum number of records, which are then used for
                                    classification.
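The recursive splitting described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any cited work: the Gini impurity criterion, the helper names (`build_tree`, `classify`), the `min_records` parameter, and the toy documents are all assumptions made for the example.

```python
from collections import Counter

def majority(labels):
    """Most frequent label among the records at a node."""
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    """Gini impurity of a list of labels (0.0 means the node is pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(docs, labels, min_records=2):
    """docs: list of word sets; labels: one class label per document.

    Each internal node tests the presence or absence of a single word.
    Splitting stops when a node is pure or when no split would leave at
    least `min_records` records on both sides (hypothetical stopping rule).
    """
    min_records = max(1, min_records)
    if len(set(labels)) == 1 or len(docs) < 2 * min_records:
        return {"leaf": majority(labels)}
    best = None
    for word in sorted(set().union(*docs)):
        present = [i for i, d in enumerate(docs) if word in d]
        absent = [i for i, d in enumerate(docs) if word not in d]
        if len(present) < min_records or len(absent) < min_records:
            continue
        # Weighted impurity of the two children; lower is better.
        score = sum(len(side) * gini([labels[i] for i in side])
                    for side in (present, absent)) / len(docs)
        if best is None or score < best[0]:
            best = (score, word, present, absent)
    if best is None:
        return {"leaf": majority(labels)}
    _, word, present, absent = best

    def subtree(side):
        return build_tree([docs[i] for i in side],
                          [labels[i] for i in side], min_records)

    return {"word": word, "present": subtree(present), "absent": subtree(absent)}

def classify(tree, doc):
    """Follow presence/absence tests down to a leaf and return its label."""
    while "leaf" not in tree:
        tree = tree["present"] if tree["word"] in doc else tree["absent"]
    return tree["leaf"]

# Toy data, invented for illustration only.
docs = [{"good", "film"}, {"good", "plot"}, {"bad", "film"}, {"bad", "plot"}]
labels = ["pos", "pos", "neg", "neg"]
tree = build_tree(docs, labels, min_records=2)
print(classify(tree, {"bad", "acting"}))   # neg
print(classify(tree, {"great", "film"}))   # pos
```

Raising `min_records` yields a shallower tree with larger leaves, which is the trade-off the stopping condition controls.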


                                    2.2.2 Sentiment analysis using supervised machine learning
                                       Apart from the lexicon-based approaches, supervised learning
                                    approaches for SA are also very popular. Supervised ML
                                    approaches require labeled training data sets, on which a model
                                    is trained so that it can make predictions. Some of the
                                    supervised ML approaches that have been used for SA are
                                    outlined in Table 5.2.
                                       The authors of Ref. [32] extract sentiment from movie reviews
                                    using three ML algorithms, namely NB, MaxEnt, and SVM. They
                                    analyze the factors that make SA more challenging than other
                                    classification problems and recognize the need to handle
                                    sarcasm and coreferences, for which their approach makes no
                                    provision.
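As an illustration of the supervised setting, the sketch below trains a multinomial Naive Bayes (NB) classifier on labeled token lists. It is a minimal sketch under stated assumptions, not the setup of Ref. [32]: the add-one (Laplace) smoothing, the function names `train_nb` and `predict_nb`, and the four toy reviews are all invented for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """docs: list of token lists; labels: one sentiment label per document."""
    class_counts = Counter(labels)        # number of documents per class
    word_counts = defaultdict(Counter)    # per-class word frequencies
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # log prior
        for tok in tokens:
            if tok not in vocab:
                continue  # words never seen in training are ignored
            # Add-one smoothing keeps unseen (word, class) pairs non-zero.
            score += math.log((word_counts[label][tok] + 1)
                              / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy labeled reviews, invented for illustration only.
train_docs = [["great", "acting"], ["wonderful", "film"],
              ["boring", "plot"], ["terrible", "acting"]]
train_labels = ["pos", "pos", "neg", "neg"]
model = train_nb(train_docs, train_labels)
print(predict_nb(model, ["wonderful", "acting"]))  # pos
print(predict_nb(model, ["boring", "plot"]))       # neg
```

A bag-of-words model of this kind treats every token independently, which is one reason sarcasm and coreference are out of its reach.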
                                       In Ref. [33], the authors use NB, SVM, and k-NN (k-nearest
                                    neighbor) classifiers with suitable feature selection and reduction
                                    schemes for SA of customer feedback data. They show that linear
                                    SVMs achieve high classification accuracy on data that are
                                    difficult even for human annotators to analyze. However, they
                                    have no mechanism in place for dealing with sarcasm,
                                    coreferences, and negations.
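Of the three classifiers named above, k-NN is the simplest to show end to end. The following is a minimal sketch that assumes raw term counts as features and cosine similarity as the distance measure; neither choice, nor the feedback snippets, is taken from Ref. [33], and no feature selection or reduction step is included.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train, tokens, k=3):
    """train: list of (token list, label) pairs. Majority vote of the k
    training examples most similar to the query."""
    query = Counter(tokens)
    ranked = sorted(train,
                    key=lambda ex: cosine(Counter(ex[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy customer feedback, invented for illustration only.
train = [
    ("fast delivery great service".split(), "pos"),
    ("great product works well".split(), "pos"),
    ("helpful friendly service".split(), "pos"),
    ("slow delivery broken product".split(), "neg"),
    ("rude service never again".split(), "neg"),
    ("broken on arrival".split(), "neg"),
]
print(knn_predict(train, "great service".split()))   # pos
print(knn_predict(train, "broken product".split()))  # neg
```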
                                       The authors of Ref. [34] introduce an ML method that performs
                                    SA on only the subjective, or sentiment-bearing, portions of a
                                    document. They determine these portions, also called the
                                    subjectivity extracts, using techniques for finding minimum
                                    cuts in graphs. They then integrate contextual information,
                                    mainly the proximity of sentences to one another, to improve
                                    this method. They show that SA using the NB classifier on the
                                    subjectivity extracts gives better results than SA of the
                                    original documents. This suggests that the subjectivity
                                    extracts are shorter and cleaner representations of polarity,
                                    suitable for use in SA systems. The main limitation of their
                                    approach is that they use only one contextual cue, namely
                                    sentence proximity.
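To make the minimum-cut idea concrete, the sketch below shows one common graph construction; it is assumed here for illustration and is not claimed to reproduce Ref. [34]. Each sentence is a node, a source node stands for "subjective" and a sink for "objective". The per-sentence scores `p_subj` (which in practice might come from a classifier such as NB), the constant `proximity_weight` linking adjacent sentences, and all function names are hypothetical.

```python
from collections import deque

def min_cut_source_side(capacity, source, sink):
    """capacity: {u: {v: cap}}. Runs Edmonds-Karp max flow and returns the
    set of nodes still reachable from the source in the residual graph,
    i.e. the source side of a minimum cut."""
    residual = {}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0.0) + c
            residual[v].setdefault(u, 0.0)   # reverse edge for flow undo
    eps = 1e-12                              # tolerance for float capacities
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return set(parent)
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u

def subjectivity_extract(p_subj, proximity_weight):
    """Return indices of sentences placed on the subjective side of the cut."""
    n = len(p_subj)
    cap = {"s": {}, "t": {}}
    for i, p in enumerate(p_subj):
        cap["s"][i] = p                      # cost of calling i objective
        cap.setdefault(i, {})["t"] = 1.0 - p # cost of calling i subjective
    for i in range(n - 1):                   # penalty for splitting neighbors
        cap[i][i + 1] = proximity_weight
        cap[i + 1][i] = proximity_weight
    side = min_cut_source_side(cap, "s", "t")
    return [i for i in range(n) if i in side]

# Hypothetical per-sentence subjectivity scores.
scores = [0.9, 0.4, 0.8, 0.1]
print(subjectivity_extract(scores, 0.0))  # [0, 2]
print(subjectivity_extract(scores, 0.3))  # [0, 1, 2]
```

With no proximity term each sentence is labeled by its own score alone. With the proximity term, the borderline second sentence is pulled into the extract because both of its neighbors are strongly subjective, which is the effect of the single contextual cue discussed above.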
                                       In Refs. [35] and [36], the authors use distant supervision. They
                                    use emoticons to curate a labeled training data set, labeling
                                    texts with “:)” as positive and those with “:(” as negative. Next,
                                    they train a supervised classifier on the training sets obtained in
                                    this manner. There are two major limitations of their approach.
                                    The first is the need for more text annotated by its author with
                                    emoticons to