LSTM multivariate problem, how to feed data to network? - keras

I have a problem with feeding my data to the (RNN) network.
De dataset contains 3600 patiƫnts who may or may not get a condition called atrial fibrillation (AF).
The dataset has 5 features each measured every 12 timesteps. An example of the dataset is given:
dataset. At timestep 12 if AF equals 1.0 then that patient has gotten at that timestep AF. Before that timestep (eg 11) the same patient hasn't got AF. If at timestep 12 AF still equals 0 then the patient hasnt gotten AF. So one could say its a binary classification problem and patients can only get AF at the final (12) timestep.
I want to feed this dataset to a LSTM, SimpleRNN or GRU from the keras library. I think I should transform the data but I have no idea how.
Directly feeding this dataset to a LSTM with a input_shape (12,5) gives the following error:
ValueError: Input 0 of layer "sequential_3" is incompatible with the layer: expected shape=(None, 12, 5), found shape=(None, 5)

Related

How to reshape the speech data as LSTM input?

I'm classifying voice and non-voice in a speech data with 3630371 data points and 39 features each. i.e shape of speech data is (3630371, 39). How do I reshape it as LSTM input. What must be the 3D input_shape or what are the values for "Samples", "Timestep" and "Features".
Is the following correct?
data.reshape(3630371, 1, 39)
LSTM(32, input_shape = (1, 39))
Please Help! I have no clue.
LSTM input: (no of samples, timesteps, features)
data.reshape(3630371, 1, 39)
LSTM(32, input_shape = (1, 39))
In the above code, you are essentially having only 1 timestep which doesn't utilize the abilities of LSTM. What you are doing is: LSTM in the first timestep takes a 39-dimensional vector as input and iteration terminate.
Another option is to give a scalar for 39 timesteps.
data.reshape(3630371, 39, 1)
Here the LSTM loops 39 times (39 timesteps) but taking a scalar as input at every timestep.
In fact any other combination will do until the no_timestep x feature_dim = total_input_dimension. Generally, it depends a lot on the domain you are working which fixes these numbers.

Predicting with different time step with a model trained for different time step data

I have trained my LSTM with 3 time steps. Following is the Keras LSTM layer.
model.add(LSTM(32, return_sequences=True, input_shape=(None, 3))).
ex:
X Y
[[1,2,3],[2,3,4],[4,5,6]] [[4],[5],[7]]
Now I need to predict the next value of a sequence with different time_steps (ex: 2)
X= [[1,2]]
When I use X= [[1,2]] I am getting following error
ValueError: Error when checking input: expected lstm_1_input to have shape (None, 3)
but got array with shape (1, 2)
Should I provide the same shape while I used for training.
Or can I still use a different timesteps (input shape) for predicting.
Appreciate your help on this issue.
I believe you need to use the same shape when using your model to predict on new data. Your data was trained with 3 timesteps (train_X), so you should feed it a 3-timesteps input when you model.predict your test data (test_X).

poor performance keras lstm

I want to create a lstm model to classify signals.
Let's say I have 1000 files of signals. Each file contains a matrix of shape (500, 5) that means that in each file, I have 5 features (columns) and 500 rows.
0 1 2 3 4
0 5 5.3 2.3 4.2 2.2
... ... ... ... ... ...
499 2500 1.2 7.4 6.7 8.6
For each file, there is one output which is a boolean (True or False). the shape is (1,)
I created a database, data, with a shape (1000, 5, 500) and the target vector is of shape (1000, 1).
Then I split data (X_train, X_test, y_train, y_test).
Is it okay to give the matrix like this to the lstm model? Because I have very poor performance. From what I have seen, people give only a 1D or 2D data and they reshape their data after to give a 3D input to the lstm layer.
The code with the lstm is like this:
input_shape=(X_train.shape[1], X_train.shape[2]) #(5,500), i.e timesteps and features
model = Sequential()
model.add(LSTM(20, return_sequences=True))
model.add(LSTM(20))
model.add(Dense(1))
model.compile(loss='mae', optimizer='adam')
I changed the number of cells in a LSTM layer and the number of layers but the score is basically the same (0.19). Is it normal to have such a bad score in my case? Is there a better way to go ?
Thanks
By transforming your data into (samples, 5, 500) you are giving the LSTM 5 timesteps and 500 features. From your data it seems you would like to process all 500 rows and 5 features of each column to make a prediction. The LSTM input is (samples, timesteps, features). So if your rows represent timesteps in which 5 measurements are taken, then you need to permute the last 2 dimensions and set input_shape=(500, 5) in the first LSTM layer.
Also since your output is Boolean, you get a more stable training by using activation='sigmoid' in your final dense layer and train with loss='binary_crossentropy for binary classification.

How to correctly get layer weights from Conv2D in keras?

I have Conv2D layer defines as:
Conv2D(96, kernel_size=(5, 5),
activation='relu',
input_shape=(image_rows, image_cols, 1),
kernel_initializer=initializers.glorot_normal(seed),
bias_initializer=initializers.glorot_uniform(seed),
padding='same',
name='conv_1')
This is the first layer in my network.
Input dimensions are 64 by 160, image is 1 channel.
I am trying to visualize weights from this convolutional layer but not sure how to get them.
Here is how I am doing this now:
1.Call
layer.get_weights()[0]
This returs an array of shape (5, 5, 1, 96). 1 is because images are 1-channel.
2.Take 5 by 5 filters by
layer.get_weights()[0][:,:,:,j][:,:,0]
Very ugly but I am not sure how to simplify this, any comments are very appreciated.
I am not sure in these 5 by 5 squares. Are they filters actually?
If not could anyone please tell how to correctly grab filters from the model?
I tried to display the weights like so only the first 25. I have the same question that you do is this the filter or something else. It doesn't seem to be the same filters that are derived from deep belief networks or stacked RBM's.
Here is the untrained visualized weights:
and here are the trained weights:
Strangely there is no change after training! If you compare them they are identical.
and then the DBN RBM filters layer 1 on top and layer 2 on bottom:
If i set kernel_intialization="ones" then I get filters that look good but the net loss never decreases though with many trial and error changes:
Here is the code to display the 2D Conv Weights / Filters.
ann = Sequential()
x = Conv2D(filters=64,kernel_size=(5,5),input_shape=(32,32,3))
ann.add(x)
ann.add(Activation("relu"))
...
x1w = x.get_weights()[0][:,:,0,:]
for i in range(1,26):
plt.subplot(5,5,i)
plt.imshow(x1w[:,:,i],interpolation="nearest",cmap="gray")
plt.show()
ann.fit(Xtrain, ytrain_indicator, epochs=5, batch_size=32)
x1w = x.get_weights()[0][:,:,0,:]
for i in range(1,26):
plt.subplot(5,5,i)
plt.imshow(x1w[:,:,i],interpolation="nearest",cmap="gray")
plt.show()
---------------------------UPDATE------------------------
So I tried it again with a learning rate of 0.01 instead of 1e-6 and used the images normalized between 0 and 1 instead of 0 and 255 by dividing the images by 255.0. Now the convolution filters are changing and the output of the first convolutional filter looks like so:
The trained filter you'll notice is changed (not by much) with a reasonable learning rate:
Here is image seven of the CIFAR-10 test set:
And here is the output of the first convolution layer:
And if I take the last convolution layer (no dense layers in between) and feed it to a classifier untrained it is similar to classifying raw images in terms of accuracy but if I train the convolution layers the last convolution layer output increases the accuracy of the classifier (random forest).
So I would conclude the convolution layers are indeed filters as well as weights.
In layer.get_weights()[0][:,:,:,:], the dimensions in [:,:,:,:] are x position of the weight, y position of the weight, the n th input to the corresponding conv layer (coming from the previous layer, note that if you try to obtain the weights of first conv layer then this number is 1 because only one input is driven to the first conv layer) and k th filter or kernel in the corresponding layer, respectively. So, the array shape returned by layer.get_weights()[0] can be interpreted as only one input is driven to the layer and 96 filters with 5x5 size are generated. If you want to reach one of the filters, you can type, lets say the 6th filter
print(layer.get_weights()[0][:,:,:,6].squeeze()).
However, if you need the filters of the 2nd conv layer (see model image link attached below), then notice for each of 32 input images or matrices you will have 64 filters. If you want to get the weights of any of them for example weights of the 4th filter generated for the 8th input image, then you should type
print(layer.get_weights()[0][:,:,8,4].squeeze()).
enter image description here

Keras: LSTM with class weights

my question is quite closely related to this question but also goes beyond it.
I am trying to implement the following LSTM in Keras where
the number of timesteps be nb_tsteps=10
the number of input features is nb_feat=40
the number of LSTM cells at each time step is 120
the LSTM layer is followed by TimeDistributedDense layers
From the question referenced above I understand that I have to present the input data as
nb_samples, 10, 40
where I get nb_samples by rolling a window of length nb_tsteps=10 across the original timeseries of shape (5932720, 40). The code is hence
model = Sequential()
model.add(LSTM(120, input_shape=(X_train.shape[1], X_train.shape[2]),
return_sequences=True, consume_less='gpu'))
model.add(TimeDistributed(Dense(50, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(20, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(10, activation='relu')))
model.add(Dropout(0.2))
model.add(TimeDistributed(Dense(3, activation='relu')))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
Now to my question (assuming the above is correct so far):
The binary responses (0/1) are heavily imbalanced and I need to pass a class_weight dictionary like cw = {0: 1, 1: 25} to model.fit(). However I get an exception class_weight not supported for 3+ dimensional targets. This is because I present the response data as (nb_samples, 1, 1). If I reshape it into a 2D array (nb_samples, 1) I get the exception Error when checking model target: expected timedistributed_5 to have 3 dimensions, but got array with shape (5932720, 1).
Thanks a lot for any help!
I think you should use sample_weight with sample_weight_mode='temporal'.
From the Keras docs:
sample_weight: Numpy array of weights for the training samples, used
for scaling the loss function (during training only). You can either
pass a flat (1D) Numpy array with the same length as the input samples
(1:1 mapping between weights and samples), or in the case of temporal
data, you can pass a 2D array with shape (samples, sequence_length),
to apply a different weight to every timestep of every sample. In this
case you should make sure to specify sample_weight_mode="temporal" in
compile().
In your case you would need to supply a 2D array with the same shape as your labels.
If this is still an issue.. I think the TimeDistributed Layer expects and returns a 3D array (kind of similar to if you have return_sequences=True in the regular LSTM layer). Try adding a Flatten() layer or another LSTM layer at the end before the prediction layer.
d = TimeDistributed(Dense(10))(input_from_previous_layer)
lstm_out = Bidirectional(LSTM(10))(d)
output = Dense(1, activation='sigmoid')(lstm_out)
Using temporal is a workaround. Check out this stack. The issue is also documented on github.

Resources