and is used to control how much of the input passes through the gate: an output of 0 blocks everything from passing further in the layer, and an output of 1 lets everything through. The tangent function creates the new candidate vector, as described earlier, and the combination of the sigmoid and tangent outputs is used to update the cell state of the model and, in turn, its output.
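To make this update concrete, the following is a minimal NumPy sketch of a single LSTM step; the weight matrices W, U, and biases b are assumed to be initialized elsewhere, and all names here are illustrative rather than taken from the chapter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b are dicts keyed by gate: 'f' (forget),
    'i' (input), 'o' (output), and 'g' (candidate)."""
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])  # gate in (0, 1)
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])  # gate in (0, 1)
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])  # gate in (0, 1)
    g = np.tanh(W['g'] @ x_t + U['g'] @ h_prev + b['g'])  # candidate vector
    c_t = f * c_prev + i * g   # sigmoid gates combined with the tanh candidate
    h_t = o * np.tanh(c_t)     # gated cell state becomes the new output
    return h_t, c_t
```

Another version of LSTM is the bidirectional LSTM.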

               4.3.1 Bidirectional long short-term memory
Bidirectional LSTMs are an extension of traditional LSTMs that can improve model performance on sequence classification problems. The LSTM was designed to provide an adaptive mechanism that stores past states in its nodes and keeps the features extracted from the input stream. In other words, an LSTM stores information from inputs that have already passed through the hidden layer, and it preserves only past context because it has seen only the inputs that came before. A bidirectional LSTM instead runs the inputs in two directions [89]: one from the past to the future and one from the future to the past. What distinguishes this approach from the unidirectional one is that the LSTM running backward preserves information from the future, and by combining the two hidden states, the model can, at any point in time, draw on information from both the past and the future. Bi-LSTM works well for prediction tasks in NLP. For example, to predict the next word in a sentence, at a high level a unidirectional LSTM will see:
   The boys went to ...
In this sentence, the nodes try to predict the next word from the preceding context alone. With a Bi-LSTM, the forward pass sees:
   The boys went to ...
and the backward pass considers the rest of the sentence:
   ... and then they got out of the pool.
The Bi-LSTM then uses both contexts to predict the next word [90], and hence it is a more powerful approach than the unidirectional LSTM.
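As a concrete sketch, a Bi-LSTM text classifier of this kind might be assembled in Keras as follows; the vocabulary size, embedding dimension, and layer widths are illustrative assumptions, not values from the chapter.

```python
from tensorflow.keras import layers, models

vocab_size, embed_dim = 10000, 128  # illustrative assumptions

model = models.Sequential([
    # Word-embedding input layer.
    layers.Embedding(vocab_size, embed_dim),
    # The Bidirectional wrapper runs one LSTM forward and one backward
    # over the sequence and merges their outputs (concatenation by default).
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dense(1, activation="sigmoid"),  # binary output layer
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```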
As depicted in Fig. 5.10, the Bi-LSTM is composed of an input layer, two sets of hidden layers, and an output layer. The input layer is the word-embedding layer, whose output is passed to the two hidden states that carry the information in the backward and forward directions. The outputs of the forward and backward directions have to be combined before being passed to the output layer. The options for merging the outputs are as follows:
• sum: The forward and backward outputs are added together.