How can I modify gates (forget, input etc.) in keras LSTM layer?
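If you mean resetting the cell state while the model trains on data, you could call keras.backend.clear_session() to clear it. Note that this clears the whole Keras session rather than only the LSTM cell state; for a stateful LSTM, reset_states() resets just the layer states.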
Suppose we want our model to use the pre-trained VGG16 weights up to the layer before the third max pooling, and then add layers of our choice. How could we make this happen?
VGG16 architecture overview
You can create a new model from, say, base_model (the VGG model with its weights loaded and the unwanted layers removed with pop()). Then add base_model and the other layers of your choice to an empty Sequential model.
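A minimal sketch of one way to do this, assuming tensorflow.keras. Instead of pop(), it cuts the network by layer name with the functional API, which is a different route to the same result. The function name build_truncated_vgg, the num_classes parameter, the pooling and Dense head, and the choice to freeze the VGG layers are all my assumptions for illustration. 'block3_conv3' is the name Keras gives to the last layer before the third max pooling ('block3_pool').

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16


def build_truncated_vgg(weights="imagenet", num_classes=10):
    """VGG16 up to the layer before the third max pooling, plus a custom head.

    The head (global pooling + Dense) is only an example; replace it with
    the layers of your choice.
    """
    base = VGG16(weights=weights, include_top=False, input_shape=(224, 224, 3))

    # Output of the last conv layer before 'block3_pool'.
    features = base.get_layer("block3_conv3").output

    # Hypothetical head added on top of the truncated network.
    x = layers.GlobalAveragePooling2D()(features)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs=base.input, outputs=outputs)

    # Assumption: keep the pre-trained VGG weights fixed while training the head.
    for layer in base.layers:
        layer.trainable = False
    return model
```

Calling build_truncated_vgg() downloads the ImageNet weights on first use; passing weights=None builds the same architecture with random weights, which is handy for checking shapes.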
I have a couple of questions about LSTM layers in the Keras library.
In the LSTM layer we have two kinds of dropout: dropout and recurrent_dropout. As I understand it, the first randomly drops some features of the input (sets them to zero) and the second does the same to the hidden units (features of h_t). Since an LSTM network has several time steps, is the dropping applied separately at each time step, or is it done only once and kept the same for every step?
My second question is about regularizers in the Keras LSTM layer. I know that, for example, the kernel regularizer regularizes the weights applied to the inputs, but there are several sets of input weights.
For example, the input gate, update gate and output gate use different weights for the input (and also different weights for h_(t-1)). Will they all be regularized at the same time? What if I want to regularize only one of them, for example only the weights used in the formula for the input gate?
The last question is about activation functions in Keras. The LSTM layer has activation and recurrent_activation. activation is tanh by default, and I know that in the LSTM architecture tanh is used twice (for h_t and for the memory cell candidate) and sigmoid is used three times (for the gates). Does that mean that if I change tanh (in the Keras LSTM layer) to another function, say ReLU, it will change for both h_t and the memory cell candidate?
It would be great if any of these questions could be answered. Thank you very much for your time.
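For reference on the activation and regularizer questions, here is the LSTM step written out in the standard notation, as far as I know matching what the Keras LSTM layer computes. The symbols W, U, b, g and \sigma_r are my own labels, not Keras identifiers.

```latex
\begin{aligned}
i_t &= \sigma_r\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
f_t &= \sigma_r\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
\tilde{c}_t &= g\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
o_t &= \sigma_r\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
h_t &= o_t \odot g\left(c_t\right)
\end{aligned}
```

Here g is the activation argument (tanh by default) and \sigma_r is recurrent_activation (sigmoid by default), so g appears in both the cell candidate and h_t. Keras stores W_i, W_f, W_c, W_o concatenated in a single kernel weight (and the U matrices in recurrent_kernel), so kernel_regularizer is applied to all four input blocks together.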
I'm currently trying to set up an LSTM recurrent neural network with Keras (TensorFlow backend).
I would like to use variational dropout with MC Dropout on it.
I believe that variational dropout is already implemented through the "recurrent_dropout" option of the LSTM layer, but I can't find any way to set a "training" flag to true, as can be done with a classical Dropout layer.
This is quite easy in Keras. First you need to define a function that takes both the model input and the learning_phase:
import keras.backend as K
f = K.function([model.layers[0].input, K.learning_phase()],
               [model.layers[-1].output])
For a Functional API model with multiple inputs/outputs you can use:
f = K.function(model.inputs + [K.learning_phase()],
               model.outputs)
Then you can call this function like f([input, 1]) and this will tell Keras to enable the learning phase during this call, executing Dropout. Then you can call this function multiple times and combine the predictions to estimate uncertainty.
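To make the "call it multiple times and combine the predictions" step concrete, here is a small pure-Python sketch. It does not use Keras: stochastic_forward is a hypothetical toy model standing in for a call like f([input, 1]), and mc_dropout_predict is my own helper name. The mean of the samples serves as the prediction and their standard deviation as a rough uncertainty estimate.

```python
import random
import statistics


def stochastic_forward(x, weights, rate, rng):
    """Toy linear model with dropout kept active (stand-in for f([x, 1]))."""
    keep = 1.0 - rate
    total = 0.0
    for x_i, w_i in zip(x, weights):
        # Inverted dropout: keep each input with probability `keep`
        # and rescale the survivors so the expected value is unchanged.
        if rng.random() < keep:
            total += x_i * w_i / keep
    return total


def mc_dropout_predict(predict_once, n_samples=100):
    """Run the stochastic forward pass n_samples times and summarize."""
    samples = [predict_once() for _ in range(n_samples)]
    return statistics.mean(samples), statistics.stdev(samples)


rng = random.Random(0)
x = [1.0, 2.0, 3.0, 4.0]
w = [0.5, -0.25, 0.1, 0.3]

mean, std = mc_dropout_predict(
    lambda: stochastic_forward(x, w, rate=0.5, rng=rng), n_samples=200
)
print(f"prediction = {mean:.3f}, uncertainty (std) = {std:.3f}")
```

With a real model you would replace the lambda with a call to the Keras function built above and, for vector outputs, average per output dimension.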
The source code for "Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning" (2015) is available online. It also uses Keras and the code is quite easy to understand. The Dropout layers are used without the Sequential API so that the training parameter can be passed. This is a different approach from the one Matias suggested:
inter = Dropout(dropout_rate)(inter, training=True)
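A minimal sketch of how that line fits into a complete functional model, assuming tensorflow.keras. The function name build_mc_dropout_model, the layer sizes and the input shape are made up for illustration and are not taken from the paper's code. Because training=True is fixed in the layer call, dropout stays active in predict(), so repeated calls give different outputs.

```python
import numpy as np
from tensorflow.keras import layers, models


def build_mc_dropout_model(n_features, dropout_rate=0.5):
    """Small regression model whose Dropout layer is always active."""
    inputs = layers.Input(shape=(n_features,))
    inter = layers.Dense(64, activation="relu")(inputs)
    # training=True keeps dropout on at inference time as well.
    inter = layers.Dropout(dropout_rate)(inter, training=True)
    outputs = layers.Dense(1)(inter)
    return models.Model(inputs, outputs)


model = build_mc_dropout_model(n_features=8)
x = np.ones((4, 8), dtype="float32")

# 20 stochastic forward passes over the same batch.
samples = np.stack([model.predict(x, verbose=0) for _ in range(20)])
mean = samples.mean(axis=0)  # MC estimate of the prediction
std = samples.std(axis=0)    # spread across passes, a rough uncertainty
```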
My question is: does this code make sense? And if it does, what is its purpose?
model.add(LSTM(18, return_sequences=True,batch_input_shape=(batch_size,look_back,dim_x), stateful=True))
model.add(Dense(1, activation='linear'))
Because if my first LSTM layer passes its state from one batch to the next, why shouldn't my second LSTM layer do the same?
I'm having a hard time understanding the LSTM mechanics in Keras, so I'm very thankful for any kind of help :)
And if you downvote this post, could you tell me why in the comments? Thanks.
Your program is a regression problem where your model consists of 2 LSTM layers with 18 and 50 units respectively, followed by a dense layer that outputs the regression value.
An LSTM requires a 3D input. Since the output of your first LSTM layer is the input to the second LSTM layer, that input must also be 3D. So we set return_sequences=True in the first layer: it then returns a 3D output, which can be used as input for the second LSTM.
Your second LSTM does not return a sequence because it is followed by a dense layer, which does not need a 3D input.
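The shape argument can be sketched without Keras. The toy layer below is not an LSTM; toy_recurrent_layer and its update rule are invented purely to show what return_sequences changes for a single sample. With return_sequences=True you get one output vector per timestep (so a batch of these is 3D), otherwise only the last one (so a batch is 2D).

```python
import math


def toy_recurrent_layer(sequence, units, return_sequences=False):
    """Toy stand-in for a recurrent layer (NOT the Keras LSTM).

    `sequence` is a list of timesteps, each a list of features.
    """
    h = [0.0] * units
    outputs = []
    for x_t in sequence:
        drive = sum(x_t)
        # Arbitrary update rule; only the shapes matter here.
        h = [math.tanh(0.5 * h_j + 0.01 * (j + 1) * drive)
             for j, h_j in enumerate(h)]
        outputs.append(h)
    return outputs if return_sequences else outputs[-1]


look_back, dim_x = 4, 2
sample = [[0.1 * (t + 1)] * dim_x for t in range(look_back)]

# First layer keeps every timestep: look_back vectors of 18 values.
first = toy_recurrent_layer(sample, units=18, return_sequences=True)
# Second layer consumes that sequence and keeps only the last step: 50 values.
second = toy_recurrent_layer(first, units=50, return_sequences=False)
# Stand-in for Dense(1, activation='linear'): one number per sample.
prediction = sum(second) / len(second)

print(len(first), len(first[0]), len(second))
```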
In Keras, LSTM states are reset after each batch of training data by default, so if you don't want the states to be reset after each batch you can set stateful=True. If an LSTM is made stateful, the final state of a batch is used as the initial state for the next batch.
You can later reset the states by calling reset_states().
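Here is a toy illustration of what stateful means, again in plain Python rather than Keras. ToyRecurrentLayer is a made-up class with a single scalar state; it only mimics the bookkeeping described above (carry the final state into the next batch, clear it with reset_states()).

```python
import math


class ToyRecurrentLayer:
    """Scalar toy recurrent layer illustrating stateful behaviour (not Keras)."""

    def __init__(self, stateful=False):
        self.stateful = stateful
        self.state = 0.0

    def reset_states(self):
        self.state = 0.0

    def __call__(self, batch):
        # A stateless layer always starts from zero; a stateful one
        # starts from wherever the previous batch ended.
        h = self.state if self.stateful else 0.0
        for x_t in batch:
            h = math.tanh(0.5 * h + x_t)
        if self.stateful:
            self.state = h
        return h


batch = [0.1, 0.2, 0.3]
stateless = ToyRecurrentLayer(stateful=False)
stateful = ToyRecurrentLayer(stateful=True)

a1, a2 = stateless(batch), stateless(batch)  # identical: state was reset
b1, b2 = stateful(batch), stateful(batch)    # differ: state carried over
stateful.reset_states()
b3 = stateful(batch)                         # same as b1 again
```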
How can I set up a keras model such that the final LSTM layer outputs a prediction for each time step while having variable sequence lengths as input?
I'd then like to provide labels for each of the timesteps after a dense layer with linear activation.
When I try to add a reshape or a dense layer to the LSTM model that is returning the full sequence and has a masking layer to take care of variable sequence lengths, it says:
The reshape and the dense layers do not support masking.
Would this be possible to do?
You can use the TimeDistributed layer wrapper for this. It applies the layer you want to each timestep. In your case, on Keras 1 you could also just use TimeDistributedDense; that layer was removed in Keras 2, where TimeDistributed(Dense(...)) is the equivalent.
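A minimal sketch of such a model, assuming tensorflow.keras. The function name build_per_timestep_regressor, the layer sizes, and the use of 0.0 as the padding value are my assumptions. Shorter sequences are padded to the batch length, Masking marks the padded steps, and TimeDistributed(Dense(1)) gives one linear output per timestep.

```python
import numpy as np
from tensorflow.keras import layers, models


def build_per_timestep_regressor(n_features, units=16, mask_value=0.0):
    """LSTM that emits one linear prediction per timestep."""
    # None in the time dimension allows a different length for each batch.
    inputs = layers.Input(shape=(None, n_features))
    x = layers.Masking(mask_value=mask_value)(inputs)
    x = layers.LSTM(units, return_sequences=True)(x)
    outputs = layers.TimeDistributed(layers.Dense(1, activation="linear"))(x)
    return models.Model(inputs, outputs)


model = build_per_timestep_regressor(n_features=3)
model.compile(optimizer="adam", loss="mse")

# Two sequences padded to length 5: the first has 5 real steps, the second 2.
batch = np.zeros((2, 5, 3), dtype="float32")
batch[0, :5] = 1.0
batch[1, :2] = 1.0

preds = model.predict(batch, verbose=0)  # shape: (batch, timesteps, 1)
```

The labels passed to fit() then need the same (batch, timesteps, 1) shape, padded in the same way as the inputs.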