Multiple Sequences RNN/LSTM in Keras

I have multiple sequences of varying length. Each has about 9 features. I want to predict the values of all the continuous features at time t+1. The data is in a list of length 2000 (so, 2000 total sequences). How could one do this in Keras?
model = Sequential()
model.add(LSTM(100, input_shape=(None,9)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=1, batch_size=1, verbose=1)
This is all I really have, but I'm getting some size mismatches. Any suggestions?
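A minimal sketch of one way to make the shapes line up, assuming the variable-length sequences are padded to a common length and the goal is to predict all 9 features at t+1 (the names sequences and y below are hypothetical placeholders for data the question doesn't show):
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Masking
from tensorflow.keras.preprocessing.sequence import pad_sequences

# `sequences` is a hypothetical list of 2000 arrays, each of shape (timesteps_i, 9);
# `y` holds the 9 feature values at time t+1 for each sequence, shape (2000, 9).
X = pad_sequences(sequences, padding='pre', dtype='float32')  # -> (2000, max_len, 9)
y = np.asarray(y, dtype='float32')                            # -> (2000, 9)

model = Sequential()
model.add(Masking(mask_value=0.0, input_shape=(None, 9)))  # skip the padded steps
model.add(LSTM(100))
model.add(Dense(9))  # one output per continuous feature, not Dense(1)
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X, y, epochs=1, batch_size=32, verbose=1)
The key change is Dense(9): with Dense(1) the model produces a single value per sequence, which cannot match a 9-dimensional target.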

Related

Keras LSTM input/output shape

I need outputs at every recurrent layer and my setup is as follows:
100 training examples, 3 time steps per example, and 20-d feature vector for each individual element.
x_train: (100,3,20)
y_train: (100,20)
LSTM architecture:
model.add(LSTM(20, input_shape=(3,20), return_sequences=True))
model.compile(loss='mean_absolute_error', optimizer='adam', metrics=['accuracy'])
model.summary()
Training:
history = model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test))
Error:
ValueError: Dimensions must be equal, but are 20 and 3 for '{{node Equal}} = Equal[T=DT_FLOAT, incompatible_shape_error=true](IteratorGetNext:1, Cast_1)' with input shapes: [?,20], [?,3].
Please help me with the correct input/output LSTM dimensions.
Thanks
LSTM(20, input_shape=(3,20), return_sequences=True) takes input of shape (100, 3, 20) and, because return_sequences=True emits the hidden state at every time step, returns output of shape (100, 3, 20). Your target, however, is encoded as (100, 20).
From the dimensions, I assume you want to map each sequence to a non-sequence, i.e. you can do:
LSTM(20, input_shape=(3,20), return_sequences=False)
This returns only the final hidden state, i.e. a shape of (100, 20), which matches your target output.
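For intuition, a minimal sketch of the corrected model assembled from the question's own code (dropping the accuracy metric is an editorial assumption here, since the targets are continuous):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM

model = Sequential()
# return_sequences=False: only the final hidden state, shape (batch, 20), matching y_train (100, 20).
model.add(LSTM(20, input_shape=(3, 20), return_sequences=False))
model.compile(loss='mean_absolute_error', optimizer='adam')
model.summary()

history = model.fit(x_train, y_train, epochs=50, validation_data=(x_test, y_test))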

When training samples increase, accuracy decreases

I am testing Keras's imdb dataset. The question is: when I split into train and test with a 2000-word vocabulary, I get close to 87% accuracy:
(X_train, train_labels), (X_test, test_labels) = imdb.load_data(num_words=2000)
but when I bump the vocabulary up to 5000 or 10000 words, the model performs poorly:
(X_train, train_labels), (X_test, test_labels) = imdb.load_data(num_words=10000)
Here is my model:
model = models.Sequential()
model.add(layers.Dense(256, activation='relu', input_shape=(10000,)))
model.add(layers.Dense(16, activation='relu' ))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_data=(x_val, y_val))
Can anyone explain why this is the case? I thought that with more samples (and less overfitting) I should get a very good model.
Thanks for any advice.
Increasing num_words doesn't increase the number of samples; it increases the vocabulary. Statistically that means more distinct words per sample and a higher-dimensional input, pushing you toward the curse of dimensionality, which is harmful for the model.
From the docs:
num_words: integer or None. Top most frequent words to consider. Any less frequent word will appear as oov_char value in the sequence data.
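The question doesn't show how the reviews are turned into fixed-length vectors, but a Dense model with input_shape=(10000,) typically relies on multi-hot encoding along the lines of the sketch below (an assumption, not code from the question); note how the input dimensionality grows with num_words:
import numpy as np
from tensorflow.keras.datasets import imdb

num_words = 10000
(X_train, train_labels), (X_test, test_labels) = imdb.load_data(num_words=num_words)

def vectorize(sequences, dimension):
    # Multi-hot encode: one column per vocabulary word.
    out = np.zeros((len(sequences), dimension), dtype='float32')
    for i, seq in enumerate(sequences):
        out[i, seq] = 1.0
    return out

x_train = vectorize(X_train, num_words)  # shape (25000, num_words)
x_test = vectorize(X_test, num_words)
y_train = np.asarray(train_labels, dtype='float32')
y_test = np.asarray(test_labels, dtype='float32')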

Why does CNN only predict one class

I have a model that needs to detect whether a plant is dead or alive. It only predicts one class. The data is imbalanced, but I have used class weights to counter the imbalance.
I have looked at loads of questions about this problem, but none of the suggestions seem to work. Apparently this problem occurs when overfitting, so I have used dropout, but the model still only predicts one class.
Here's the model:
model=Sequential()
# Convolutional layer / input layer
model.add(Conv2D(60, (5, 5), activation='relu', input_shape=np.shape(X[1])))
model.add(MaxPooling2D(pool_size=(3,3)))
model.add(Dropout(0.8))
model.add(Flatten())
model.add(Dropout(0.7))
model.add(Dense(130, activation='relu'))
model.add(Dropout(0.6))
# Output layer
model.add(Dense(2, activation='softmax'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X, y, epochs=6, batch_size=32, class_weight=class_weight, validation_data=(X_test, y_test))
It should predict both classes: 1 for a healthy plant and 0 for an unhealthy plant.
Your output layer has two units with softmax, but you compile with binary_crossentropy; the loss and the output layer need to match. If you keep the two-unit softmax output (with one-hot encoded labels), use categorical_crossentropy as the loss:
model.add(Dense(2, activation='softmax'))
However, if you want to keep binary_crossentropy, change the output layer to a single sigmoid unit, which outputs the probability that the input belongs to the positive class:
model.add(Dense(1, activation='sigmoid'))
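For intuition, a minimal sketch of the second option applied to the question's model (X, y, class_weight, X_test, and y_test are assumed to exist exactly as in the question):
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(60, (5, 5), activation='relu', input_shape=np.shape(X[1])))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.8))
model.add(Flatten())
model.add(Dropout(0.7))
model.add(Dense(130, activation='relu'))
model.add(Dropout(0.6))
model.add(Dense(1, activation='sigmoid'))  # single unit: probability of the "alive" class
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# y must be a flat array of 0/1 labels (not one-hot) when using a single sigmoid unit.
model.fit(X, y, epochs=6, batch_size=32, class_weight=class_weight,
          validation_data=(X_test, y_test))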

deep learning data preparation

I have a text dataset that contains 6 classes. For each sample I have a percent value per class, and the sum of the 6 percent values is 100% (the values are related to each other). For example:
{A:16, B:35, C:7, D:0, E:3, F:40}
How can I feed this dataset to a deep learning algorithm?
I want the predictions to come out in exactly the same shape as the training data.
Here is what you can do:
First of all, normalize your labels by scaling them to the range 0-1 (divide each percentage by 100).
Then use a softmax output layer, so the predicted values also sum to 1.
Here is some code in Keras for intuition:
model = Sequential()
model.add(Dense(100, input_dim = x.shape[1], activation='relu'))
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')
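As a usage sketch on top of that model (y_raw and x are hypothetical names for the raw percentage targets and the input features, which the question doesn't show):
import numpy as np

# y_raw: hypothetical array of shape (n_samples, 6) holding the percentages, e.g. [16, 35, 7, 0, 3, 40]
y = np.asarray(y_raw, dtype='float32') / 100.0  # scale to 0-1 so each row sums to 1

model.fit(x, y, epochs=20, batch_size=32)

# The softmax predictions already sum to 1; multiply by 100 to read them as percentages.
pred_percent = model.predict(x) * 100.0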

Keras LSTM, batch data structuring differences

I am trying to understand the differences and implications of structuring sequential data for the Keras LSTM model. I would like to forecast electricity demand, which has a natural daily/weekly shape driven by temperature, weekday, and hour of day, for example. Say I have one month's worth of demand and inputs, i.e. an array of shape (30 days * 24 hours of demand, 3 features), and I want to predict the next 30 days of demand based on expected future inputs. What are the implications of the following (particularly with respect to statefulness)?
#A. feed in 1 batch of 1 hour at a time. This seems the slowest to train.
model.add(LSTM(n_neurons, batch_input_shape=(1, 1, 3), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(n_epoch):
    model.fit(X, y, epochs=1, batch_size=1, verbose=1, shuffle=False)
    model.reset_states()
#B. feed in batches of 720 one-hour samples at a time
#is this the same as A, except I need to forecast 720 hours/timesteps at a time?
model.add(LSTM(n_neurons, batch_input_shape=(720, 1, 3), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(n_epoch):
    model.fit(X, y, epochs=1, batch_size=720, verbose=1, shuffle=False)
    model.reset_states()
#C. feed in 1 batch of 720 timesteps at a time
#is this the same as A, except I need to forecast 720 hours/timesteps at a time?
model.add(LSTM(n_neurons, batch_input_shape=(1, 720, 3), stateful=True))  # probably don't need stateful=True here (?)
model.add(Dense(720))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(n_epoch):
    model.fit(X, y, epochs=1, batch_size=1, verbose=1, shuffle=False)
    model.reset_states()
#D. some variation so that number_of_batches x timesteps = 720 (no overlap of sequences).
#Timesteps most likely in multiples of 24 (hours of the day) to capture the daily profile shape.
model.add(LSTM(n_neurons, batch_input_shape=(number_of_batches, timesteps, 3), stateful=True))
model.add(Dense(timesteps))
model.compile(loss='mean_squared_error', optimizer='adam')
for i in range(n_epoch):
    model.fit(X, y, epochs=1, batch_size=number_of_batches, verbose=1, shuffle=False)
    model.reset_states()
I've read around a lot but still don't quite have the hang of LSTMs, so any help is much appreciated! Any recommendations are welcome too.
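No answer is recorded above, so purely as an illustrative sketch rather than a recommendation: this is one way to reshape the 720 hourly rows into the non-overlapping daily windows described in option D, assuming data is an array of shape (720, 3) and demand is an array of shape (720,) (both names are hypothetical):
import numpy as np

timesteps = 24                          # one day per window, as in option D
n_windows = data.shape[0] // timesteps  # 30 non-overlapping daily windows

X = data[:n_windows * timesteps].reshape(n_windows, timesteps, 3)  # (30, 24, 3)
y = demand[:n_windows * timesteps].reshape(n_windows, timesteps)   # (30, 24)
# X and y then fit option D's model with batch_input_shape=(batch_size, timesteps, 3) and
# Dense(timesteps); with stateful=True the batch size must evenly divide n_windows.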
