How to merge new features at later stage of model? - keras

I have training data in the form of numpy arrays, that I will use in ConvLSTM.
Following are dimensions of array.
trainX = (5000, 200, 5) where 5000 are number of samples. 200 is time steps per sample, and 8 is number of features per timestep. (samples, timesteps, features).
out of these 8 features, 3 features remains the same throghout all timesteps in a sample (In other words, these features are directly related to samples). for example, day of the week, month number, weekday (these changes from sample to sample). To reduce the complexity, I want to keep these three features separate from initial training set and merge them with the output of convlstm layer before applying dense layer for classication (softmax activiation). e,g
Intial training set dimension would be (7000, 200, 5) and auxiliary input dimensions to be merged would be (7000, 3) --> because these 3 features are directly related to sample. How can I implement this using keras?
Following is my code that I write using Functional API, but don't know how to merge these two inputs.
#trainX.shape=(7000,200,5)
#trainy.shape=(7000,4)
#testX.shape=(3000,200,5)
#testy.shape=(3000,4)
#trainMetadata.shape=(7000,3)
#testMetadata.shape=(3000,3)
verbose, epochs, batch_size = 1, 50, 256
samples, n_features, n_outputs = trainX.shape[0], trainX.shape[2], trainy.shape[1]
n_steps, n_length = 4, 50
input_shape = (n_steps, 1, n_length, n_features)
model_input = Input(shape=input_shape)
clstm1 = ConvLSTM2D(filters=64, kernel_size=(1,3), activation='relu',return_sequences = True)(model_input)
clstm1 = BatchNormalization()(clstm1)
clstm2 = ConvLSTM2D(filters=128, kernel_size=(1,3), activation='relu',return_sequences = False)(clstm1)
conv_output = BatchNormalization()(clstm2)
metadata_input = Input(shape=trainMetadata.shape)
merge_layer = np.concatenate([metadata_input, conv_output])
dense = Dense(100, activation='relu', kernel_regularizer=regularizers.l2(l=0.01))(merge_layer)
dense = Dropout(0.5)(dense)
output = Dense(n_outputs, activation='softmax')(dense)
model = Model(inputs=merge_layer, outputs=output)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit([trainX, trainMetadata], trainy, validation_data=([testX, testMetadata], testy), epochs=epochs, batch_size=batch_size, verbose=verbose)
_, accuracy = model.evaluate(testX, testy, batch_size=batch_size, verbose=0)
y = model.predict(testX)
but I am getting Value error at merge_layer statement. Following is the ValueError
ValueError: zero-dimensional arrays cannot be concatenated

What you are saying can not be done using the Sequential mode of Keras.
You need to use the Model class API Guide to Keras Model.
With this API you can build the complex model you are looking for
Here you have an example of how to use it: How to Use the Keras Functional API for Deep Learning

Related

Keras 3Dconvnet time-series issue

I have a time-series of data and am running some very basic tests to get a feel for TensorFlow, Keras, Python, etc.
To setup the problem, I have a large amount of images whereby 7 images of data (with Cartesian dimensions 33 x 33) when accumulated should yield a single value. Therefore, the amount of 'x' data should be y*7 where y is the 'truth' data being trained with.
All of the training data is in entitled 'alldatax' which is a large matrix: [420420 x 33 x 33 x 7 x 1] where the dimensions are the total number of single images, x-dimension, y-dimension, number of images to be accumulated for a single 'truth' value, and then a final dimension necessary for 3D convolving.
The 'truth' matrix, alldatay, is a 1D matrix which is simply 420420 / 7 = 60060.
When running a simple convnet:
model = models.Sequential()
model.add(layers.InputLayer(input_shape=(33,33,7,1)))
model.add(layers.Conv3D(16,(3,3,1), activation = 'relu', input_shape = (33,33,7,1)))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.MaxPooling3D((2,2,1)))
model.add(layers.Conv3D(32,(3,3,1), activation = 'relu'))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.MaxPooling3D((2,2,1)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation = 'relu'))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(32, activation = 'relu'))
model.add(layers.LeakyReLU(alpha=0.3))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(1, activation = 'relu'))
model.compile(optimizer = 'adam', loss = 'mse')
model.fit(x = alldatax, y = alldatay, batch_size = 1000, epochs = 50, verbose = 1, shuffle = False)
I get an error: ValueError: Input arrays should have the same number of samples as target arrays. Found 420420 input samples and 60060 target samples.
What needs to change to get the convnet to realize it needs 7*x for every y value?
Something seems to be wrong in your calculations.
You state that the neural net should take seven 33x33 images as one input example, so you set the input shape of the first layer to (33,33,7,1) which is right. This means for every 33x33x7x1 input there should be exactly one y value.
Since all of your data all your data comprises 420420 33x33x7x1 images there should be 420420 y values, not 60060.

Is it possible to train using same model with two inputs?

Hello I have a some question for keras.
currently i want implement some network
using same cnn model, and use two images as input of cnn model
and use two result of cnn model, provide to Dense model
for example
def cnn_model():
input = Input(shape=(None, None, 3))
x = Conv2D(8, (3, 3), strides=(1, 1))(input)
x = GlobalAvgPool2D()(x)
model = Model(input, x)
return model
def fc_model(cnn1, cnn2):
input_1 = cnn1.output
input_2 = cnn2.output
input = concatenate([input_1, input_2])
x = Dense(1, input_shape=(None, 16))(input)
x = Activation('sigmoid')(x)
model = Model([cnn1.input, cnn2.input], x)
return model
def main():
cnn1 = cnn_model()
cnn2 = cnn_model()
model = fc_model(cnn1, cnn2)
model.compile(optimizer='adam', loss='mean_squared_error')
model.fit(x=[image1, image2], y=[1.0, 1.0], batch_size=1, ecpochs=1)
i want to implement model something like this, and train models
but i got error message like below :
'All layer names should be unique'
Actually i want use only one CNN model as feature extractor and finally use two features to predict one float value as 0.0 ~ 1.0
so whole system -->>
using two images and extract features from same CNN model, and features are provided to Dense model to get one floating value
Please, help me implement this system and how to train..
Thank you
See the section of the Keras documentation on shared layers:
https://keras.io/getting-started/functional-api-guide/
A code snippet from the documentation above demonstrating this:
# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)

Keras LSTM layers input shape

I am trying to feed a sequence with 20 featuresto an LSTM network as shown in the code. But I get an error that my Input0 is incompatible with LSTM input. Not sure how to change my layer structure to fit the data.
def build_model(features, aux1=None, aux2=None):
# create model
features[0] = np.asarray(features[0])
main_input = Input(shape=features[0].shape, dtype='float32', name='main_input')
main_out = LSTM(40, activation='relu')
aux1_input = Input(shape=(len(aux1[0]),), dtype='float32', name='aux1_input')
aux1_out = Dense(len(aux1[0]))(aux1_input)
aux2_input = Input(shape=(len(aux2[0]),), dtype='float32', name='aux2_input')
aux2_out = Dense(len(aux2[0]))(aux2_input)
x = concatenate([aux1_out, main_out, aux2_out])
x = Dense(64, activation='relu')(x)
x = Dropout(0.5)(x)
output = Dense(1, activation='sigmoid', name='main_output')(x)
model = Model(inputs=[aux1_input, aux2_input, main_input], outputs= [output])
return model
Features variable is an array of shape (1456, 20) I have 1456 days and for each day I have 20 variables.
Your main_input should be of shape (samples, timesteps, features)
and then you should define main_input like this:
main_input = Input(shape=(timesteps,)) # for stateless RNN (your one)
or main_input = Input(batch_shape=(batch_size, timesteps,)) for stateful RNN (not the one you are using in your example)
if your features[0] is a 1-dimensional array of various features (1 timestep), then you also have to reshape features[0] like this:
features[0] = np.reshape(features[0], (1, features[0].shape))
and then do it to features[1], features[2] etc
or better reshape all your samples at once:
features = np.reshape(features, (features.shape[0], 1, features.shape[1]))
LSTM layers are designed to work with "sequences".
You say your sequence has 20 features, but how many time steps does it have?? Do you mean 20 time steps instead?
An LSTM layer requires input shapes such as (BatchSize, TimeSteps, Features).
If it's the case that you have 1 feature in each of the 20 time steps, you must shape your data as:
inputData = someData.reshape(NumberOfSequences, 20, 1)
And the Input tensor should take this shape:
main_input = Input((20,1), ...) #yes, it ignores the batch size

How to set up Keras LSTM for time series forecasting?

I have a single training batch of 600 sequential points (x(t), y(t)) with x(t) being a 25 dimensional vector and y(t) being my target (1 dim). I would like to train an LSTM to predict how the series would continue given a few additional x(t) [t> 600]. I tried the following model:
model = Sequential()
model.add(LSTM(128, input_shape = (600,25), batch_size = 1, activation= 'tanh', return_sequences = True))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, epochs=20 ,verbose=2) prediction
prediction = model.predict(testX, batch_size = 1)
Fitting works fine, but I keep getting the following error at the prediction step:
Error when checking : expected lstm_46_input to have shape (1, 600, 25) but got array with shape (1, 10, 25)
What am I missing?
Here are my shapes:
trainX.shape = (1,600,25)
trainY.shape = (1,600,1)
testX.shape = (1,10,25)
According to Keras documentation input of LSTM (or any RNN) layers should be of shape (batch_size, timesteps, input_dim) where your input shape is
trainX.shape = (1,600,25)
So it means for training you are passing only one data with 600 timesteps and 25 features per timestep. But I got a feeling that you actually have 600 training data each having 25 timesteps and 1 feature per timestep. I guess your input shape (trainX) should be 600 x 25 x 1. Train target (trainY) should be 600 x 1 If my assumption is right then your test data should be of shape 10 x 25 x 1. First LSTM layer should be written as
model.add(LSTM(128, input_shape = (25,1), batch_size = 1, activation= 'tanh', return_sequences = False))
If your training data is in fact (1,600,25) what this means is you are unrolling the LSTM feedback 600 times. The first input has an impact on the 600th input. If this is what you want, you can use the Keras function "pad_sequences" to add append zeros to the test matrix so it has the shape (1,600,25). The network should predict zeros and you will need to add 590 zeros to your testY.
If you only want say 10 previous timesteps to affect your current Y prediction, then you will want to turn your trainX into shape (590,10,25). The input line will be something like:
model.add(LSTM(n_hid, stateful=True, return_sequences=False, batch_input_shape=(1,nTS,x_train.shape[2])))
The processing to get it in the form you want could be something like this:
def formatTS(XX, yy, window_length):
x_train = np.zeros((XX.shape[0]-window_length,window_length,XX.shape[1]))
for i in range(x_train.shape[0]):
x_train[i] = XX[i:i+window_length,:]
y_train = yy[window_length:]
return x_train, y_train
Then your testing will work just fine since it is already in the shape (1,10,25).

LSTM with Keras for mini-batch training and online testing

I would like to implement an LSTM in Keras for streaming time-series prediction -- i.e., running online, getting one data point at a time. This is explained well here, but as one would assume, the training time for an online LSTM can be prohibitively slow. I would like to train my network on mini-batches, and test (run prediction) online. What is the best way to do this in Keras?
For example, a mini-batch could be a sequence of 1000 data values ([33, 34, 42, 33, 32, 33, 36, ... 24, 23]) that occur at consecutive time steps. To train the network I've specified an array X of shape (900, 100, 1), where there are 900 sequences of length 100, and an array y of shape (900, 1). E.g.,
X[0] = [[33], [34], [42], [33], ...]]
X[1] = [[34], [42], [33], [32], ...]]
...
X[999] = [..., [24]]
y[999] = [23]
So for each sequence X[i], there is a corresponding y[i] that represents the next value in the time-series -- what we want to predict.
In test I want to predict the next data values 1000 to 1999. I do this by feeding an array of shape (1, 100, 1) for each step from 1000 to 1999, where the model tries to predict the value at the next step.
Is this the recommended approach and setup for my problem? Enabling statefulness may be the way to go for a purely online implementation, but in Keras this requires a consistent batch_input_shape in training and testing, which would not work for my intent of training on mini-batches and then testing online. Or is there a way I can do this?
UPDATE: Trying to implement the network as #nemo recommended
I ran my own dataset on an example network from a blog post "Time Series Prediction with LSTM Recurrent Neural Networks in Python with Keras", and then tried implementing the prediction phase as a stateful network.
The model building and training is the same for both:
# Create and fit the LSTM network
numberOfEpochs = 10
look_back = 30
model = Sequential()
model.add(LSTM(4, input_dim=1, input_length=look_back))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(trainX, trainY, nb_epoch=numberOfEpochs, batch_size=1, verbose=2)
# trainX.shape = (6883, 30, 1)
# trainY.shape = (6883,)
# testX.shape = (3375, 30, 1)
# testY.shape = (3375,)
Batch prediction is done with:
trainPredict = model.predict(trainX, batch_size=batch_size)
testPredict = model.predict(testX, batch_size=batch_size)
To try a stateful prediction phase, I ran the same model setup and training as before, but then the following:
w = model.get_weights()
batch_size = 1
model = Sequential()
model.add(LSTM(4, batch_input_shape=(batch_size, look_back, 1), stateful=True))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
trainPredictions, testPredictions = [], []
for trainSample in trainX:
trainPredictions.append(model.predict(trainSample.reshape((1,look_back,1)), batch_size=batch_size))
trainPredict = numpy.concatenate(trainPredictions).ravel()
for testSample in testX:
testPredictions.append(model.predict(testSample.reshape((1,look_back,1)), batch_size=batch_size))
testPredict = numpy.concatenate(testPredictions).ravel()
To inspect the results, the plots below show the actual (normalized) data in blue, the predictions on the training set in green, and the predictions on the test set in red.
The first figure is from using batch prediction, and the second from stateful. Any ideas what I'm doing incorrectly?
If I understand you correctly you are asking if you can enable statefulness after training. This should be possible, yes. For example:
net = Dense(1)(SimpleRNN(stateful=False)(input))
model = Model(input=input, output=net)
model.fit(...)
w = model.get_weights()
net = Dense(1)(SimpleRNN(stateful=True)(input))
model = Model(input=input, output=net)
model.set_weights(w)
After that you can predict in a stateful way.

Resources