and is used to control how much of the input passes through this function: an output of 0 means nothing is allowed to pass further into the layer, and 1 means everything passes through. The tangent function creates the new candidate vector, as described earlier, and the combination of the sigmoid and tangent outputs is used to update the cell state of the model, that is, to update the model's output.
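Putting these pieces together, the cell update can be written compactly using the standard LSTM equations (conventional notation assumed here rather than defined in this chapter; \(\sigma\) denotes the sigmoid and \(\odot\) elementwise multiplication):

\[
\begin{aligned}
f_t &= \sigma(W_f\,[h_{t-1}, x_t] + b_f) && \text{forget gate}\\
i_t &= \sigma(W_i\,[h_{t-1}, x_t] + b_i) && \text{input gate}\\
\tilde{C}_t &= \tanh(W_C\,[h_{t-1}, x_t] + b_C) && \text{candidate vector}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{cell-state update}\\
o_t &= \sigma(W_o\,[h_{t-1}, x_t] + b_o) && \text{output gate}\\
h_t &= o_t \odot \tanh(C_t) && \text{updated output}
\end{aligned}
\]

Another variant of the LSTM is the bidirectional LSTM.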
4.3.1 Bidirectional long short-term memory
Bidirectional LSTMs are an extension of traditional LSTMs that can improve model performance on sequence classification problems. The LSTM was designed to provide an adaptive mechanism that stores past states in its nodes and keeps the features extracted from the input stream. Basically, an LSTM stores information from inputs that have already passed through the hidden layer; it preserves only past information because it has only seen inputs from the past. A bidirectional LSTM, by contrast, runs the inputs in two ways [89]: one from the past to the future and one from the future to the past. What distinguishes this approach from the unidirectional one is that, in the LSTM running backward, information from the future is preserved, and by combining the two hidden states the model can, at any point in time, draw on information from both the past and the future. Bi-LSTM works well for prediction tasks in NLP. For example, to predict the next word in a sentence, at a high level, a unidirectional LSTM will only see:

The boys went to ...

From this fragment alone, the nodes try to predict the next word as the outcome. With a Bi-LSTM, the forward pass sees:

The boys went to ...

and the backward pass considers the rest of the sentence:

... and then they got out of the pool.

The Bi-LSTM then uses both contexts when predicting the next word [90], which makes it a more powerful approach than the unidirectional LSTM.
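As a concrete illustration, such a Bi-LSTM sequence classifier can be sketched in a few lines of Keras. The vocabulary size, sequence length, and layer widths below are illustrative assumptions, not values from this chapter:

    # Minimal Bi-LSTM sketch in Keras; all hyperparameters are
    # illustrative assumptions, not values taken from this chapter.
    from tensorflow.keras import Input, Sequential
    from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

    VOCAB_SIZE = 10_000   # assumed vocabulary size
    MAX_LEN = 50          # assumed (padded) sequence length

    model = Sequential([
        Input(shape=(MAX_LEN,)),
        # Word-embedding input layer, as in Fig. 5.10
        Embedding(VOCAB_SIZE, 128),
        # Forward and backward hidden layers wrapped together; their
        # outputs are concatenated by default before the output layer
        Bidirectional(LSTM(64)),
        # Output layer, e.g., a binary depressed/not-depressed label
        Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])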
As depicted in Fig. 5.10, the Bi-LSTM is composed of an input layer, two sets of hidden layers, and an output layer. The input layer is the word-embedding layer, whose output is passed to the two hidden states that flow information in the backward and forward directions. The outputs of the forward and backward directions have to be combined before being passed to the output layer (see the Keras sketch after this list). The options for merging the outputs are as follows:
• sum: The forward and backward outputs are added together.
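In Keras, this merging behavior is selected through the Bidirectional wrapper's merge_mode argument (a real Keras parameter accepting "sum", "mul", "concat", and "ave"; the surrounding layer sizes are illustrative assumptions):

    # Selecting the merge behavior of a Bi-LSTM in Keras.
    from tensorflow.keras import Input, Sequential
    from tensorflow.keras.layers import Embedding, Bidirectional, LSTM

    model = Sequential([
        Input(shape=(50,)),        # assumed sequence length
        Embedding(10_000, 128),    # assumed vocabulary/embedding sizes
        # merge_mode="sum" adds the forward and backward outputs
        # elementwise; "concat" (the default) stacks them instead.
        Bidirectional(LSTM(64), merge_mode="sum"),
    ])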