Classify a sequence using LSTM in keras - keras

I am working on a binary classification problem, where the network takes two inputs and output the label of this input pair.
Basically, I use an encoder layer to do embedding first and concatenate the embedding results. Next, I am going to use RNN structure to classify the concatenated result. But I can't figure out a proper way to write the code. I attach my code below.
input_size = n_feature # the number of features
encoder_size = 2000 # output dim for each encoder
dropout_rate = 0.5
X1 = Input(shape=(input_size, ), name='input_1')
X2 = Input(shape=(input_size, ), name='input_2')
encoder = Sequential()
encoder.add(Dropout(dropout_rate, input_shape=(input_size, )))
encoder.add(Dense(encoder_size, activation='relu'))
encoded_1 = encoder(X1)
encoded_2 = encoder(X2)
merged = concatenate([encoded_1, encoded_2])
#----------Need Help---------------#
comparer = Sequential()
comparer.add(LSTM(512, input_shape=(encoder_size*2, ), return_sequences=True))
comparer.add(Dropout(dropout_rate))
comparer.add(TimeDistributed(Dense(1)))
comparer.add(Activation('sigmoid'))
#----------Need Help---------------#
Y = comparer(merged)
model = Model(inputs=[X1, X2], outputs=Y)
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
It seems for the LSTM layer, the input should be (None, encoder_size*2). I tried to use Y = comparer(K.transpose(merged)) to reshape the input for the LSTM layer but I failed. BTW, for this network, the input shape is (input_size,) and output shape is (1,).

If the idea is to transform the input vector in a time series, you can simply reshape it:
comparer = Sequential()
#reshape the vector into a time series form: (None, timeSteps, features)
comparer.add(Reshape((2 * encoder_size,1), input_shape=(2*encoder_size,))
#don't return sequences, you don't want a sequence as result:
comparer.add(LSTM(512, return_sequences=False))
comparer.add(Dropout(dropout_rate))
#Don't use a TimeDistributed, you're not dealing with a series anymore
comparer.add(Dense(1))
comparer.add(Activation('sigmoid'))

Related

How do I decode the output of my seq-to-seq model if I'm using an embedding layer?

I have a seq to seq model trained of some clever bot data:
justphrases_X is a list of sentences and justphrases_Y is a list of responses to those sentences.
maxlen = 62
#low is a list of all the unique words.
def Convert_To_Encoding(just_phrases):
encodings = []
for sentence in just_phrases:
onehotencoded = one_hot(sentence, len(low))
encodings.append(np.array(onehotencoded))
encodings_padded = pad_sequences(encodings, maxlen=maxlen, padding='post', value = 0.0)
return encodings_padded
encodings_X_padded = Convert_To_Encoding(just_phrases_X)
encodings_y_padded = Convert_To_Encoding(just_phrases_y)
model = Sequential()
embedding_layer = Embedding(len(low), output_dim=8, input_length=maxlen)
model.add(embedding_layer)
model.add(GRU(128)) # input_shape=(None, 496)
model.add(RepeatVector(numberofwordsoutput)) #number of characters?
model.add(GRU(128, return_sequences = True))
model.add(Flatten())
model.add(Dense(62, activation = 'softmax'))
model.compile(loss = 'categorical_crossentropy', optimizer= 'adam', metrics=['accuracy'])
model.summary()
model.fit(encodings_X_padded, encodings_y_padded, batch_size = 1, epochs=1) #, validation_data = (testX, testy)
model.save("cleverbottheseq-uel.h5")
When I use this model for prediction, the output will be between 0 and 1 because of my use of softmax. However as I have around 3000 unique words, each with a separate integer assigned to it, how do I essentially repeat what the model did during training and convert the output back to an integer which has a word assigned to it?
I dont think it is possible to create seq2seq with Sequential API. Try to create encoder and decoder separately with Functional API. You need two inputs - first for encoder, second - for decoder.

Change label format for training

Normally, if you train with keras, model.fit expects the train data to have a shape of (samples, timesteps, input) and a label of (samples, outputs). Is there a way to change the matching label to (samples*timesteps, output) or (samples, timesteps, input). So one sample matches len(sample)*label and not only one label?
Yes. You can have whatever shape you want as the output layer. For instance auto-encoders will have the same output shape as input shape.
A toy example:
sequence_length = 20
n_features = 4
def make_model():
inp = Input(shape=(sequence_length, n_features,))
encoder = LSTM(16, return_sequences=True)(inp)
vector = LSTM(32)(encoder)
decoder_in = RepeatVector(sequence_length)(vector)
decoder = LSTM(16, return_sequences=True)(decoder_in)
out = Dense(4)(decoder)
model = Model(inp, out)
model.compile('adam', 'mse')
return model
model = make_model()
model.summary()
In this case the vector layer has shape (32,) (i.e. there is a dimensionality reduction compared to the input) and the output layer has the same dimensions as the input.

How to merge new features at later stage of model?

I have training data in the form of numpy arrays, that I will use in ConvLSTM.
Following are dimensions of array.
trainX = (5000, 200, 5) where 5000 are number of samples. 200 is time steps per sample, and 8 is number of features per timestep. (samples, timesteps, features).
out of these 8 features, 3 features remains the same throghout all timesteps in a sample (In other words, these features are directly related to samples). for example, day of the week, month number, weekday (these changes from sample to sample). To reduce the complexity, I want to keep these three features separate from initial training set and merge them with the output of convlstm layer before applying dense layer for classication (softmax activiation). e,g
Intial training set dimension would be (7000, 200, 5) and auxiliary input dimensions to be merged would be (7000, 3) --> because these 3 features are directly related to sample. How can I implement this using keras?
Following is my code that I write using Functional API, but don't know how to merge these two inputs.
#trainX.shape=(7000,200,5)
#trainy.shape=(7000,4)
#testX.shape=(3000,200,5)
#testy.shape=(3000,4)
#trainMetadata.shape=(7000,3)
#testMetadata.shape=(3000,3)
verbose, epochs, batch_size = 1, 50, 256
samples, n_features, n_outputs = trainX.shape[0], trainX.shape[2], trainy.shape[1]
n_steps, n_length = 4, 50
input_shape = (n_steps, 1, n_length, n_features)
model_input = Input(shape=input_shape)
clstm1 = ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu',return_sequences = True)(model_input)
clstm1 = BatchNormalization()(clstm1)
clstm2 = ConvLSTM2D(filters=128, kernel_size=(1,3), activation='relu',return_sequences = False)(clstm1)
conv_output = BatchNormalization()(clstm2)
metadata_input = Input(shape=trainMetadata.shape)
merge_layer = np.concatenate([metadata_input, conv_output])
dense = Dense(100, activation='relu', kernel_regularizer=regularizers.l2(l=0.01))(merge_layer)
dense = Dropout(0.5)(dense)
output = Dense(n_outputs, activation='softmax')(dense)
model = Model(inputs=merge_layer, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit([trainX, trainMetadata], trainy, validation_data=([testX, testMetadata], testy), epochs=epochs, batch_size=batch_size, verbose=verbose)
_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
y = model.predict(testX)
but I am getting Value error at merge_layer statement. Following is the ValueError
ValueError: zero-dimensional arrays cannot be concatenated
What you are saying can not be done using the Sequential mode of Keras.
You need to use the Model class API Guide to Keras Model.
With this API you can build the complex model you are looking for
Here you have an example of how to use it: How to Use the Keras Functional API for Deep Learning

Is it possible to train using same model with two inputs?

Hello I have a some question for keras.
currently i want implement some network
using same cnn model, and use two images as input of cnn model
and use two result of cnn model, provide to Dense model
for example
def cnn_model():
input = Input(shape=(None, None, 3))
x = Conv2D(8, (3, 3), strides=(1, 1))(input)
x = GlobalAvgPool2D()(x)
model = Model(input, x)
return model
def fc_model(cnn1, cnn2):
input_1 = cnn1.output
input_2 = cnn2.output
input = concatenate([input_1, input_2])
x = Dense(1, input_shape=(None, 16))(input)
x = Activation('sigmoid')(x)
model = Model([cnn1.input, cnn2.input], x)
return model
def main():
cnn1 = cnn_model()
cnn2 = cnn_model()
model = fc_model(cnn1, cnn2)
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x=[image1, image2], y=[1.0, 1.0], batch_size=1, ecpochs=1)
i want to implement model something like this, and train models
but i got error message like below :
'All layer names should be unique'
Actually i want use only one CNN model as feature extractor and finally use two features to predict one float value as 0.0 ~ 1.0
so whole system -->>
using two images and extract features from same CNN model, and features are provided to Dense model to get one floating value
Please, help me implement this system and how to train..
Thank you
See the section of the Keras documentation on shared layers:
https://keras.io/getting-started/functional-api-guide/
A code snippet from the documentation above demonstrating this:
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)

Keras LSTM layers input shape

I am trying to feed a sequence with 20 featuresto an LSTM network as shown in the code. But I get an error that my Input0 is incompatible with LSTM input. Not sure how to change my layer structure to fit the data.
def build_model(features, aux1=None, aux2=None):
# create model
features[0] = np.asarray(features[0])
main_input = Input(shape=features[0].shape, dtype='float32', name='main_input')
main_out = LSTM(40, activation='relu')
aux1_input = Input(shape=(len(aux1[0]),), dtype='float32', name='aux1_input')
aux1_out = Dense(len(aux1[0]))(aux1_input)
aux2_input = Input(shape=(len(aux2[0]),), dtype='float32', name='aux2_input')
aux2_out = Dense(len(aux2[0]))(aux2_input)
x = concatenate([aux1_out, main_out, aux2_out])
x = Dense(64, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(1, activation='sigmoid', name='main_output')(x)
model = Model(inputs=[aux1_input, aux2_input, main_input], outputs= [output])
return model
Features variable is an array of shape (1456, 20) I have 1456 days and for each day I have 20 variables.
Your main_input should be of shape (samples, timesteps, features)
and then you should define main_input like this:
main_input = Input(shape=(timesteps,)) # for stateless RNN (your one)
or main_input = Input(batch_shape=(batch_size, timesteps,)) for stateful RNN (not the one you are using in your example)
if your features[0] is a 1-dimensional array of various features (1 timestep), then you also have to reshape features[0] like this:
features[0] = np.reshape(features[0], (1, features[0].shape))
and then do it to features[1], features[2] etc
or better reshape all your samples at once:
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
LSTM layers are designed to work with "sequences".
You say your sequence has 20 features, but how many time steps does it have?? Do you mean 20 time steps instead?
An LSTM layer requires input shapes such as (BatchSize, TimeSteps, Features).
If it's the case that you have 1 feature in each of the 20 time steps, you must shape your data as:
inputData = someData.reshape(NumberOfSequences, 20, 1)
And the Input tensor should take this shape:
main_input = Input((20,1), ...) #yes, it ignores the batch size

Resources