Shared dropout layer on input - keras

I want to apply the same dropout to two input tensors of the same shape. One way to do that is to join the inputs, apply dropout, and then split the tensors again, so that the same features get dropped from each input in each iteration.
The code below seems to work and the model trains. Can anyone confirm that it does what I expect? I don't know of a way to compare tensors, or else I could set the dropout rate to 0 and compare the output with the input.
import keras
from keras.layers import Dropout

# input1 and input2 each have shape (batch, 10, 6)
input_list = [input1, input2]
# Join the inputs into a (batch, 20, 6) tensor
input_concat = keras.layers.concatenate(input_list, axis=1)
input_dropout = Dropout(0.5)(input_concat)
reshaped_input = keras.layers.Reshape((10, 6, 2))(input_dropout)
input_1 = keras.layers.Lambda(lambda x: x[:, :, :, 0])(reshaped_input)
input_2 = keras.layers.Lambda(lambda x: x[:, :, :, 1])(reshaped_input)
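Since this is hard to eyeball, here is one way to test it concretely, a minimal sketch assuming input1 and input2 are Input layers of shape (10, 6): build a probe model that returns the two recovered slices and run it with predict(), which uses inference mode, where Dropout is the identity. If the reshape/split really inverts the concatenation, each recovered slice should match its original array.

import numpy as np
from keras.models import Model

# Hypothetical check: in predict(), Dropout passes values through
# unchanged, so any mismatch comes from the reshape/split logic alone.
probe = Model([input1, input2], [input_1, input_2])
x1 = np.random.rand(4, 10, 6).astype('float32')
x2 = np.random.rand(4, 10, 6).astype('float32')
out1, out2 = probe.predict([x1, x2])
print(np.allclose(out1, x1), np.allclose(out2, x2))  # both True means the split recovers the inputs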

Related

Input shape for 1D convolution network in keras

I am quite new to Keras and I have a problem understanding shapes.
I wanted to create the following 1D conv Keras model, but I don't know whether it is correct:
from keras.models import Sequential
from keras.layers import Conv1D, MaxPooling1D, GlobalAveragePooling1D, Dropout, Dense

TIME_PERIODS = 511
num_sensors = 2
num_classes = 4
BATCH_SIZE = 400
EPOCHS = 50

model_m = Sequential()
model_m.add(Conv1D(100, 10, activation='relu', input_shape=(TIME_PERIODS, num_sensors)))
model_m.add(Conv1D(100, 10, activation='relu'))
model_m.add(MaxPooling1D(3))
model_m.add(Conv1D(160, 10, activation='relu'))
model_m.add(Conv1D(160, 10, activation='relu'))
model_m.add(GlobalAveragePooling1D())
model_m.add(Dropout(0.5))
model_m.add(Dense(num_classes, activation='softmax'))
The input data I have is 888 different pandas data frames, each of shape (511, 3), where 511 is the number of signal points, column 0 is sensor1 values, column 1 is sensor2 values, and column 2 is the signal labels.
Now, how should I combine all 888 data frames so that I have x_train and y_train from X and Y using scikit-learn's train_test_split?
Also, I think the input shape I am defining for the model is wrong, and I don't think I actually have TIME_PERIODS, because for one time point I have 2 sensor input values (orange and blue lines) and 1 output label (green line).
The context of the problem I am trying to solve: the input is 2 time-based sensor values, say for the 1 AM-2 AM hour from a user; the output is the range of times during which the user was doing activity 1, activity 2, ..., activity X, e.g. 1:10-1:15, 1:15-1:30, 1:30-2:00. The above plot shows a sample training input and output.
The problem is inspired from here, but in my case I don't have any time period; each time point has one output label.
Update 1:
I am almost certain that my TIME_PERIODS = 1, since for prediction I will give 511 inputs and expect to get 511 output values.
Each dataframe is an independent sequence?
import os
import numpy as np
import pandas as pd

fileNames = os.listdir(data_dir)  # get a list of filenames; data_dir is wherever your files live
allFrames = [pd.read_csv(fname).values for fname in fileNames]  # ... other read_csv options as needed ...
allData = np.stack(allFrames, axis=0)       # shape: (888, 511, 3)
inputData = allData[:, :, :num_sensors]     # the two sensor columns
outputData = allData[:, :, -1:]             # the label column
You can now use train_test_split the way you want.
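For instance, a minimal sketch with a hypothetical 80/20 split, splitting whole sequences so each file stays intact:

from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(
    inputData, outputData, test_size=0.2)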
Your input shape is correct.
If you want to predict the whole sequence, then you have to remove the poolings, and every convolution should use padding='same'.
You could also add a Bidirectional(LSTM(units, return_sequences=True)) layer somewhere to make your model stronger.
A simple model as an example (notice that models are totally open to creativity):
from keras.layers import *
from keras.models import Model

inputs = Input((TIME_PERIODS, num_sensors))  # should be called "time_steps" to be precise
outputs = Conv1D(64, 3, padding='same', activation='tanh')(inputs)  # 64 filters is an arbitrary choice
outputs = Bidirectional(LSTM(64, return_sequences=True))(outputs)   # 64 units is an arbitrary choice
outputs = Conv1D(num_classes, 1, padding='same', activation='softmax')(outputs)  # kernel size 1: per-step classification
model = Model(inputs, outputs)
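With padding='same' and return_sequences=True, the time dimension stays at TIME_PERIODS all the way through, so the model emits one softmax per time step. Assuming one-hot labels of shape (n_samples, TIME_PERIODS, num_classes), compiling it is the usual one-liner:

model.compile(optimizer='adam', loss='categorical_crossentropy')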
To say the least, you're on the right path. The full solution would be something like:
df = pd.concat([pd.read_csv(fname, index_col=<int>, header=<int>) for fname in filenames], ignore_index=True, axis=0)
inputs = df.iloc[:, :-1]
labels = df.iloc[:, -1]
X_train, X_test, y_train, y_test = train_test_split(inputs, labels, test_size=<float>)
To add a bit more information, note how you are doing
model_m.add(Conv1D(100, 10, activation='relu', input_shape=(TIME_PERIODS, num_sensors)))
and not
model_m.add(Conv1D(100, 10, activation='relu', padding='same', input_shape=(TIME_PERIODS, num_sensors)))
Because you are not setting padding='same' for the convolution layers, the output becomes smaller and smaller as you go deeper into the model. If that's what you need, that's okay. Otherwise, set padding='same'.
For example, without same padding you'll be down to a width of 146 by the time you reach the GlobalAveragePooling1D layer, whereas with same padding it would be 170. That's not a major problem here, but it can easily lead to negative input sizes for deeper layers.
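A quick sanity check of those widths (a sketch; a 'valid' Conv1D shrinks the length by kernel_size - 1, and MaxPooling1D divides it by the pool size):

def conv_out(n, k):   # Conv1D with the default padding='valid'
    return n - k + 1

def pool_out(n, p):   # MaxPooling1D, stride defaults to the pool size
    return n // p

n = 511
n = conv_out(n, 10)   # 502
n = conv_out(n, 10)   # 493
n = pool_out(n, 3)    # 164
n = conv_out(n, 10)   # 155
n = conv_out(n, 10)   # 146
print(n)              # with padding='same', only the pooling shrinks: 511 // 3 == 170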

Debug output of keras layers during training

When fitting a model using Keras, I encounter NaNs, and I want to debug the output of each layer.
The code has an input in1 which goes through multiple layers; in the final layers I multiply elementwise with another input in2 and then make the prediction. The input in2 is sparse and is used for masking (a row looks something like [0 0 0 1 0 0 1 0 1 0 ... 0]). The label matrix contains one-hot-encoded rows. Input in1 is a vector of real values.
from time import time
from keras.layers import Input, Dense, Lambda, Multiply
from keras.models import Model
from keras.callbacks import TensorBoard
import keras.backend as K

in1 = Input(shape=(27,), name='in1')
in2 = Input(shape=(1000,), name='in2')

# Hidden layers
hidden_1 = Dense(1024, activation='relu')(in1)
hidden_2 = Dense(512, activation='relu')(hidden_1)
hidden_3 = Dense(256, activation='relu')(hidden_2)
hidden_4 = Dense(10, activation='linear')(hidden_3)
final = Dense(1000, activation='linear')(hidden_4)

# Ensure we do not overflow when we exponentiate
final2 = Lambda(lambda x: x - K.max(x))(final)

# Masked soft-max using Lambda and merge-multiplication
exponentiate = Lambda(lambda x: K.exp(x))(final2)
masked = Multiply()([exponentiate, in2])
predicted = Lambda(lambda x: x / K.sum(x))(masked)

# Compile with categorical crossentropy and adam
mdl = Model(inputs=[in1, in2], outputs=predicted)
mdl.compile(loss='categorical_crossentropy',
            optimizer='adam',
            metrics=['accuracy'])

tensorboard = TensorBoard(log_dir="/Users/somepath/tmp/{}".format(time()),
                          write_graph=True, write_grads=True)

mdl.fit({'in1': in1_matrix, 'in2': in2_matrix},
        label_matrix, epochs=1, batch_size=32, verbose=2, callbacks=[tensorboard])
I want to print the output of each layer and the gradients during training, and I need a way to feed the auxiliary input (in2) while debugging.
I have tried printing the output of each layer like below, which works up to layer 7:
get_layer_output = K.function([mdl.layers[0].input],[mdl.layers[7].output])
layer_output = get_layer_output([in1_matrix])
But for layer 8 I don't know how to add in2_matrix, and I get the following error when I use this code:
get_layer_output2 = K.function([mdl.layers[0].input],[mdl.layers[8].output])
layer_output2 = get_layer_output2([in1_matrix])
Error:
InvalidArgumentError: You must feed a value for placeholder tensor 'in2' with dtype float and shape [?,1000]
I don't know how to provide in2 in K.function, nor how to pass in2_matrix to get_layer_output2.
(I have checked in1_matrix, in2_matrix, and label_matrix. They all look fine, with no NaNs or infs, and the label array has no rows or columns that are all zeros.)
I'm new to Keras; any idea on how to debug NaNs, even with callbacks to print gradients, would be appreciated. Please also let me know if there is anything wrong with the way the layers are composed.
If you print out mdl.layers[8], you will find that it is an Input layer; I guess you want the output of mdl.layers[9], which is the Multiply layer. You can get it like this:
get_layer_output2 = K.function([mdl.layers[0].input, mdl.layers[8].input],[mdl.layers[9].output])
layer_output2 = get_layer_output2([in1_matrix, in2_matrix])
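To watch for NaNs during training rather than after the fact, one option, a sketch assuming the mdl, in1_matrix, and in2_matrix above, is to wrap the same kind of K.function in a LambdaCallback and scan every layer output at the end of each epoch:

import numpy as np
from keras.callbacks import LambdaCallback

# One function that evaluates every non-Input layer's output.
probe_layers = [l for l in mdl.layers if 'Input' not in type(l).__name__]
probe = K.function([mdl.layers[0].input, mdl.layers[8].input],
                   [l.output for l in probe_layers])

def check_for_nans(epoch, logs):
    outs = probe([in1_matrix[:32], in2_matrix[:32]])  # probe on a small batch
    for layer, out in zip(probe_layers, outs):
        if np.isnan(out).any():
            print('NaNs in layer {} at epoch {}'.format(layer.name, epoch))

nan_monitor = LambdaCallback(on_epoch_end=check_for_nans)
# then: mdl.fit(..., callbacks=[tensorboard, nan_monitor])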

create Dense layers in a loop

I need to create multiple dense layers in a for loop, where the number of iterations depends on the number of labels. I want to create one dense layer for each label. Each label has a different set of features, so I want to predict each label separately with its corresponding feature set in its own dense layer. Is that possible? The following code is my attempt.
from keras.layers import Dense, concatenate
from keras.models import Model

layers = []
for i in range(num_labels):
    h1 = Dense(num_genes_per + 10, kernel_initializer='normal', input_dim=num_genes_per, activation='relu')(inputs)
    h2 = Dense(int(num_genes_per / 2), kernel_initializer='normal', activation='relu')(h1)
    output = Dense(1, kernel_initializer='normal', activation='linear')(h2)
    layers.append(output)
merged_output = concatenate(layers, axis=1)
model = Model(inputs, merged_output)
The output of each branch will have shape [batch, 1], and merged_output will have shape [batch, num_labels]. Is there any error in the above code?
I know it is not efficient, but if I concatenated the different feature sets into one input tensor and used only one dense layer to predict all labels at the same time, would it harm the prediction accuracy?
It depends on how your features and labels are related. If features 1, 2, and 3 are used to predict label 1 and have no relation to label 2, it does not make sense to include them when inferring label 2.
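Since each label has its own feature set, one way to wire that up, a sketch with a hypothetical feature_sizes list giving each label's feature count, is to give each branch its own Input:

from keras.layers import Input, Dense, concatenate
from keras.models import Model

feature_sizes = [30, 25, 40]  # hypothetical per-label feature counts
branch_inputs, branch_outputs = [], []
for size in feature_sizes:
    inp = Input(shape=(size,))
    h1 = Dense(size + 10, kernel_initializer='normal', activation='relu')(inp)
    h2 = Dense(max(size // 2, 1), kernel_initializer='normal', activation='relu')(h1)
    branch_outputs.append(Dense(1, kernel_initializer='normal', activation='linear')(h2))
    branch_inputs.append(inp)

# Shape (batch, num_labels): one prediction per label, each from its own features
merged = concatenate(branch_outputs, axis=1)
model = Model(branch_inputs, merged)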

How to send multiple vectors to a SimpleRNN?

I have n vectors of size m each. I need to send them to a SimpleRNN in Keras. The vectors should be sent such that each step of the RNN takes one vector (e.g. vector1 at step 1, vector2 at step 2, etc.) along with the hidden state from the previous vector.
I have tried concatenating them, but this distorts the nature of the input.
input1 = Dense(20, activation='relu')(input1)
input2 = Dense(20, activation='relu')(input2)
I need to send these vectors (input1 and input2) to the RNN.
You can use tf.stack in TensorFlow or keras.backend.stack in Keras. This operator:
Stacks a list of rank-R tensors into one rank-(R+1) tensor
Based on your code, the Dense layers can be stacked in the following way:
import tensorflow as tf
inps1 = tf.keras.layers.Input(shape=(30,))
inps2 = tf.keras.layers.Input(shape=(30,))
dense1 = tf.keras.layers.Dense(20, activation='relu')(inps1)
dense2 = tf.keras.layers.Dense(20, activation='relu')(inps2)
dense = tf.keras.layers.Lambda(lambda x: tf.stack([x[0], x[1]], axis=1), output_shape=(2, 20))([dense1, dense2])  # output_shape excludes the batch dimension
rnn = tf.keras.layers.SimpleRNN(100)(dense)
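The stacked tensor has shape (batch, 2, 20), so the SimpleRNN sees a sequence of two 20-dimensional time steps. A minimal sketch of finishing and inspecting the model:

model = tf.keras.models.Model([inps1, inps2], rnn)
model.summary()  # Lambda output: (None, 2, 20); SimpleRNN output: (None, 100)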

Resnet with Custom Data

I am trying to modify ResNet50 with my custom data as follows:
X = [[1.85, 0.460,... -0.606] ... [0.229, 0.543,... 1.342]]
y = [2, 4, 0, ... 4, 2, 2]
X is a feature vector of length 2000 for each of 784 images. y is an array of size 784 containing the class labels.
Here is the code:
def __classifyRenet(self, X, y):
    image_input = Input(shape=(2000, 1))
    num_classes = 5
    model = ResNet50(weights='imagenet', include_top=False)
    model.summary()
    last_layer = model.output
    # add a global spatial average pooling layer
    x = GlobalAveragePooling2D()(last_layer)
    # add fully-connected & dropout layers
    x = Dense(512, activation='relu', name='fc-1')(x)
    x = Dropout(0.5)(x)
    x = Dense(256, activation='relu', name='fc-2')(x)
    x = Dropout(0.5)(x)
    # a softmax layer for 5 classes
    out = Dense(num_classes, activation='softmax', name='output_layer')(x)
    # this is the model we will train
    custom_resnet_model2 = Model(inputs=model.input, outputs=out)
    custom_resnet_model2.summary()
    for layer in custom_resnet_model2.layers[:-6]:
        layer.trainable = False
    custom_resnet_model2.layers[-1].trainable = True
    custom_resnet_model2.compile(loss='categorical_crossentropy',
                                 optimizer='adam', metrics=['accuracy'])
    clf = custom_resnet_model2.fit(X, y,
                                   batch_size=32, epochs=32, verbose=1,
                                   validation_data=(X, y))
    return clf
I am calling the function as:
clf = self.__classifyRenet(X_train, y_train)
It is giving an error:
ValueError: Error when checking input: expected input_24 to have 4 dimensions, but got array with shape (785, 2000)
Please help. Thank you!
1. First, understand the error.
Your input does not match what ResNet expects: ResNet50 takes input of shape (n_samples, 224, 224, 3), but you are passing (785, 2000). Per your question, you have 784 images, each represented by an array of size 2000, which cannot be reshaped into the original ResNet50 input of (224, 224, 3). That means you cannot use ResNet50 directly on your data; all your code does is take the last layer of ResNet50 and add your own output layer to match your number of classes.
2. Then, what you can do.
If you insist on using the ResNet architecture, you will need to change the input layer rather than the output layer. You will also need to reshape your data to make use of the convolution layers: it cannot stay a (2000,) array, but needs to be something like (height, width, channels), just like the inputs ResNet and other architectures expect. Of course you will also need to change the output layer, as you did, so that you are predicting your own classes. Try something like:
model = ResNet50(input_shape=image_input_shape, include_top=False, weights='imagenet')  # input_shape can only be customized with include_top=False
This way, you can specify a customized input image shape. You can check the GitHub code for more information (https://github.com/keras-team/keras/blob/master/keras/applications/resnet50.py). Here's part of the docstring:
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 224)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 197.
E.g. `(200, 200, 3)` would be one valid value.
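Putting that together, a minimal sketch using the docstring's own example shape (your 2000-length feature vectors would still need to be reshaped or regenerated as 3-channel images of at least 197x197 first):

from keras.applications.resnet50 import ResNet50

# (200, 200, 3) is the valid example shape quoted above; include_top
# must be False to use a custom shape with the ImageNet weights.
base = ResNet50(input_shape=(200, 200, 3), include_top=False, weights='imagenet')
base.summary()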
