Debug output of Keras layers during training

When fitting a model with Keras I encounter nans, and I want to debug the output of each layer.
The code has an input in1 which goes through multiple layers; in the final layer I multiply elementwise with another input in2 and then make the prediction. The input in2 is sparse and is used for masking (a row looks something like [0 0 0 1 0 0 1 0 1 0... 0]). The label matrix contains one-hot-encoded rows. Input in1 is a vector of real values.
from time import time
import keras.backend as K
from keras.layers import Input, Dense, Lambda, Multiply
from keras.models import Model
from keras.callbacks import TensorBoard

in1 = Input(shape=(27,), name='in1')
in2 = Input(shape=(1000,), name='in2')
# Hidden layers
hidden_1 = Dense(1024, activation='relu')(in1)
hidden_2 = Dense(512, activation='relu')(hidden_1)
hidden_3 = Dense(256, activation='relu')(hidden_2)
hidden_4 = Dense(10, activation='linear')(hidden_3)
final = Dense(1000, activation='linear')(hidden_4)
# Ensure we do not overflow when we exponentiate
final2 = Lambda(lambda x: x - K.max(x))(final)
# Masked soft-max using Lambda and merge-multiplication
exponentiate = Lambda(lambda x: K.exp(x))(final2)
masked = Multiply()([exponentiate, in2])
predicted = Lambda(lambda x: x / K.sum(x))(masked)
# Compile with categorical crossentropy and adam
mdl = Model(inputs=[in1, in2], outputs=predicted)
mdl.compile(loss='categorical_crossentropy',
            optimizer='adam',
            metrics=['accuracy'])
tensorboard = TensorBoard(log_dir="/Users/somepath/tmp/{}".format(time()),
                          write_graph=True, write_grads=True)
mdl.fit({'in1': in1_matrix, 'in2': in2_matrix}, label_matrix,
        epochs=1, batch_size=32, verbose=2, callbacks=[tensorboard])
I want to print the output of each layer and the gradients during training, and I need to know how to feed the auxiliary input (in2) while debugging.
I have tried to print the output of each layer as below, which works up to layer 7:
get_layer_output = K.function([mdl.layers[0].input],[mdl.layers[7].output])
layer_output = get_layer_output([in1_matrix])
But at layer 8 I am unable to add in2_matrix. I get the error below when I use the following code to print:
get_layer_output2 = K.function([mdl.layers[0].input],[mdl.layers[8].output])
layer_output2 = get_layer_output2([in1_matrix])
Error:
InvalidArgumentError: You must feed value for placeholder tensor 'in2' with dtype float and shape [?,1000]
I don't know how to provide in2 in K.function, nor how to pass in2_matrix to get_layer_output2.
(I have checked in1_matrix, in2_matrix, and label_matrix. They all look fine, with no nans or infs. The label array has no rows or columns that are all zeros.)
I'm new to Keras, so any idea on how to debug the nans would be appreciated, even callbacks to print gradients. Please also let me know if there is anything wrong with the way the layers are composed.

If you print out mdl.layers[8], you will find that it is the Input layer; I guess you want the output of mdl.layers[9], which is the Multiply layer. You can get it like this:
get_layer_output2 = K.function([mdl.layers[0].input, mdl.layers[8].input],[mdl.layers[9].output])
layer_output2 = get_layer_output2([in1_matrix, in2_matrix])
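To also inspect gradients while hunting the nans, one option is to build a backend function for the gradients of the loss with respect to the trainable weights. This is a minimal sketch, assuming the TF1-style Keras backend, where mdl.total_loss, mdl.targets, and mdl.sample_weights are available after compile:
import numpy as np

grads = K.gradients(mdl.total_loss, mdl.trainable_weights)
# the loss needs the inputs, the targets, and the per-sample weights
inputs = mdl.inputs + mdl.targets + mdl.sample_weights
get_grads = K.function(inputs, grads)
grad_values = get_grads([in1_matrix, in2_matrix, label_matrix,
                         np.ones(len(in1_matrix))])
Checking each returned array with np.isnan should show which layer's gradients blow up first.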

Related

How to combine two pretrained models in keras?

I would like to combine two pretrained models (DenseNet169 and InceptionV3), though it could be any two. I followed the steps from the link below, but it did not work. I tried both concatenate and Concatenate and am still getting an error. I might have made a mistake somewhere. This is my first Stack Overflow question and help would be greatly appreciated.
https://datascience.stackexchange.com/questions/39407/how-to-make-two-parallel-convolutional-neural-networks-in-keras
First case: I tried with NO pooling
model1 = DenseNet169(weights='imagenet', include_top=False, input_shape=(300,300,3))
out1 = model1.output
model2 = InceptionV3(weights='imagenet', include_top=False, input_shape=(300,300,3))
out2 = model2.output
from keras.layers import concatenate
from keras.layers import Concatenate
x = concatenate([out1, out2]) # merge the outputs of the two models
out = Dense(10, activation='softmax')(x) # final layer of the network
I got this error:
ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 9, 9, 1664), (None, 8, 8, 2048)]
Second case: I tried with average pooling; I was able to concatenate, but got an error during training:
model1 = DenseNet169(weights='imagenet', include_top=False, pooling='avg', input_shape=(300,300,3))
out1 = model1.output
model2 = InceptionV3(weights='imagenet', include_top=False, pooling='avg', input_shape=(300,300,3))
out2 = model2.output
x = concatenate([out1, out2]) # merge the outputs of the two models
out = Dense(10, activation='softmax')(x) # final layer of the network
model = Model(inputs=[model1.input, model2.input], outputs=[out])
model.compile(optimizer=Adam(), loss='categorical_crossentropy',metrics=['accuracy'])
history = model.fit_generator(generator=data_generator_train,
                              validation_data=data_generator_val,
                              epochs=20,
                              verbose=1)
Error in second case:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[[0.17074525, 0.10469133, 0.08226486],
[0.19852941, 0.13124999, 0.11642157],
[0.36528033, 0.3213197 , 0.3085095 ],
...,
[0.19082414, 0.17801011, 0.15840226...
Second case: since your model expects two inputs, your data_generator_train and data_generator_val should yield a list of two inputs (one per model) together with the output. You can achieve that by updating the return value of the __data_generation method:
def __data_generation(...):
    ...
    # consider X as the input image and y as the label of your model
    return [X, X], keras.utils.to_categorical(y, num_classes=self.n_classes)
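If editing the generator class is inconvenient, you could instead wrap the existing generator. This is a minimal sketch (two_input_generator is a hypothetical helper, not part of your code), assuming the generator yields (X, y) batches:
def two_input_generator(gen):
    # duplicate each image batch so both branches receive the same input
    for X, y in gen:
        yield [X, X], y

history = model.fit_generator(generator=two_input_generator(data_generator_train),
                              validation_data=two_input_generator(data_generator_val),
                              epochs=20, verbose=1)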
First case: since the spatial size of model2's output (8x8) is smaller than model1's output (9x9), you can apply zero padding to model2's output before concatenation:
from keras.layers import ZeroPadding2D

out1 = model1.output
out2 = model2.output
# pad one row at the bottom and one column on the right: 8x8 -> 9x9
out2 = ZeroPadding2D(((0, 1), (0, 1)))(out2)
x = concatenate([out1, out2])
For the first case you also need to modify your data generator, as in the second case.
The second case's structure is correct, but consider that you are concatenating two models, and each model has its own input. If the input is the same for both models, just fit the model by repeating the input, like this:
model.fit([X_train, X_train], y_train)
I implemented your setup myself and it works well:
model1 = DenseNet169(weights='imagenet', include_top=False)
model2 = InceptionV3(weights='imagenet', include_top=False)
model1_out = model1.output
model1_out = GlobalAveragePooling2D()(model1_out)
model2_out = model2.output
model2_out = GlobalAveragePooling2D()(model2_out)
x = concatenate([model1_out, model2_out])
x = Dense(10, activation='softmax')(x)
model = Model(inputs=[model1.input, model2.input], outputs=x)
model.fit([X_train, X_train], y_train)

Shared dropout layer on input

I want to apply the same dropout to two input tensors of the same shape. One way to do that is to join the inputs, apply dropout, and then split the tensors again. This way the same features would get dropped from each input in each iteration.
The code seems to work fine and the model is training. Can anyone confirm that what I have done is doing what I am expecting? I don't know of a way to compare tensors; otherwise I could just set the dropout to 0 and compare the output with the input.
# input1 and input2 each have shape (10, 6)
input_list = [input1, input2]
# join the inputs to form a (20, 6) tensor
input_concat = keras.layers.concatenate(input_list, axis=1)
input_dropout = Dropout(0.5)(input_concat)
reshaped_input = keras.layers.Reshape((10, 6, 2))(input_dropout)
input_1 = keras.layers.Lambda(lambda x: x[:, :, :, 0])(reshaped_input)
input_2 = keras.layers.Lambda(lambda x: x[:, :, :, 1])(reshaped_input)
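One way to check this empirically is to build a backend function that returns the two recovered tensors and compare them against the raw inputs with dropout disabled. This is a minimal sketch, assuming the TF1-style Keras backend, where the learning-phase flag switches dropout on (1) or off (0):
import numpy as np
import keras.backend as K

probe = K.function([input1, input2, K.learning_phase()], [input_1, input_2])

x1 = np.random.random((4, 10, 6))
x2 = np.random.random((4, 10, 6))
out1, out2 = probe([x1, x2, 0])  # learning phase 0: dropout off

# if the reshape/split really inverts the concatenation,
# each recovered tensor should equal its input
print(np.allclose(out1, x1), np.allclose(out2, x2))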

Resnet with Custom Data

I am trying to modify ResNet50 with my custom data as follows:
X = [[1.85, 0.460,... -0.606] ... [0.229, 0.543,... 1.342]]
y = [2, 4, 0, ... 4, 2, 2]
X contains a feature vector of length 2000 for each of the 784 images. y is an array of size 784 containing the class labels.
Here is the code:
def __classifyRenet(self, X, y):
    image_input = Input(shape=(2000, 1))
    num_classes = 5
    model = ResNet50(weights='imagenet', include_top=False)
    model.summary()
    last_layer = model.output
    # add a global spatial average pooling layer
    x = GlobalAveragePooling2D()(last_layer)
    # add fully-connected & dropout layers
    x = Dense(512, activation='relu', name='fc-1')(x)
    x = Dropout(0.5)(x)
    x = Dense(256, activation='relu', name='fc-2')(x)
    x = Dropout(0.5)(x)
    # a softmax layer for 5 classes
    out = Dense(num_classes, activation='softmax', name='output_layer')(x)
    # this is the model we will train
    custom_resnet_model2 = Model(inputs=model.input, outputs=out)
    custom_resnet_model2.summary()
    for layer in custom_resnet_model2.layers[:-6]:
        layer.trainable = False
    custom_resnet_model2.layers[-1].trainable = True
    custom_resnet_model2.compile(loss='categorical_crossentropy',
                                 optimizer='adam', metrics=['accuracy'])
    clf = custom_resnet_model2.fit(X, y,
                                   batch_size=32, epochs=32, verbose=1,
                                   validation_data=(X, y))
    return clf
I am calling the function as:
clf = self.__classifyRenet(X_train, y_train)
It is giving an error:
ValueError: Error when checking input: expected input_24 to have 4 dimensions, but got array with shape (785, 2000)
Please help. Thank you!
1. First, understand the error.
Your input does not match what ResNet expects: for ResNet50 the input should be (n_samples, 224, 224, 3), but you are passing (785, 2000). From your question, you have 784 images, each with a feature array of size 2000, which does not align with the original ResNet50 input shape of (224 x 224 x 3) no matter how you reshape it. That means you cannot use ResNet50 directly on your data. The only thing you did in your code is take the last layer of ResNet50 and add your own output layer to match your number of classes.
2. Then, what you can do.
If you insist on using the ResNet architecture, you will need to change the input layer rather than the output layer. You will also need to reshape your image data to make use of the convolution layers: it cannot stay a (2000,) array, but needs to be something like (height, width, channels), just like what ResNet and other architectures expect. Of course you will also need to change the output layer, as you did, so that you are predicting your own classes. Try something like:
model = ResNet50(input_shape=(height, width, 3), include_top=False, weights='imagenet')
This way, you can specify a custom input image shape. You can check the GitHub code for more information (https://github.com/keras-team/keras/blob/master/keras/applications/resnet50.py). Here's part of the docstring:
input_shape: optional shape tuple, only to be specified
if `include_top` is False (otherwise the input shape
has to be `(224, 224, 3)` (with `channels_last` data format)
or `(3, 224, 224)` (with `channels_first` data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 197.
E.g. `(200, 200, 3)` would be one valid value.
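Putting that together, here is a minimal sketch. The (200, 200, 3) shape is taken from the docstring's example; since 2000 features cannot be reshaped into an image of at least 197 x 197 x 3, this assumes you re-extract your data as images of that size:
from keras.applications.resnet50 import ResNet50
from keras.layers import Dense, Dropout, GlobalAveragePooling2D
from keras.models import Model

base = ResNet50(weights='imagenet', include_top=False, input_shape=(200, 200, 3))
x = GlobalAveragePooling2D()(base.output)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(5, activation='softmax')(x)  # 5 classes, as in the question
model = Model(inputs=base.input, outputs=out)
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
# X must then have shape (n_samples, 200, 200, 3)
# and y must be one-hot encoded, e.g. with keras.utils.to_categorical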

Keras: feed output as input at next timestep

The goal is to predict a time series Y of 87,601 timesteps (10 years) with 9 targets. The input features X (exogenous inputs) are 11 time series of 87,600 timesteps. The output has one extra timestep because it includes the initial value.
The output Yt at timestep t depends on the input Xt and on the previous output Yt-1.
Hence, the model should look like this: [model layout figure]
The only related thread I could find is this one: LSTM: How to feed the output back to the input? #4068.
I tried to implement this with Keras as follows:
def build_model():
    # Input layers
    input_x = layers.Input(shape=(features,), name='input_x')
    input_y = layers.Input(shape=(targets,), name='input_y-1')
    # Merge the two inputs
    merge = layers.concatenate([input_x, input_y], name='merge')
    # Normalise the input
    norm = layers.Lambda(normalise, name='scale')(merge)
    # Hidden layers
    x = layers.Dense(128, input_shape=(features,))(norm)
    # Output layer
    output = layers.Dense(targets, activation='relu', name='output')(x)
    model = Model(inputs=[input_x, input_y], outputs=output)
    model.compile(loss='mean_squared_error', optimizer=Adam())
    return model

def make_prediction(model, X, y):
    # feed each prediction back in as the "previous output" of the next step
    y_pred = [y[0, None, :]]
    for i in range(len(X)):
        y_pred.append(model.predict([X[i, None, :], y_pred[i]]))
    y_pred = np.asarray(y_pred)
    y_pred = y_pred.reshape(y_pred.shape[0], y_pred.shape[2])
    return y_pred

# Fit
model = build_model()
model.fit([X_train, y_train[:-1]], y_train[1:], epochs=200,
          batch_size=24, shuffle=False)
# Predict
y_hat = make_prediction(model, X_train, y_train)
This works, but it is not what I want to achieve, as there is no connection between output and input during training. Hence, the model does not learn how to correct for an error in the fed-back output, which results in poor accuracy when predicting, as the error in the output accumulates at every timestep.
Is there a way in Keras to implement the output-input feed-back during training stage?
Also, as the initial value of Y is always known, I want to feed this to the network as well.

Restricting the output values of layers in Keras

I have defined my MLP in the code below. I want to extract the values of layer_2.
def gater(self):
    dim_inputs_data = Input(shape=(self.train_dim[1],))
    dim_svm_yhat = Input(shape=(3,))
    layer_1 = Dense(20, activation='sigmoid')(dim_inputs_data)
    layer_2 = Dense(3, name='layer_op_2',
                    activation='sigmoid', use_bias=False)(layer_1)
    layer_3 = Dot(1)([layer_2, dim_svm_yhat])
    out_layer = Dense(1, activation='tanh')(layer_3)
    model = Model(inputs=[dim_inputs_data, dim_svm_yhat], outputs=out_layer)
    adam = optimizers.Adam(lr=0.01)
    model.compile(loss='mse', optimizer=adam, metrics=['accuracy'])
    return model
Suppose the output of layer_2, in matrix form, is:
0.1 0.7 0.8
0.1 0.8 0.2
0.1 0.5 0.5
....
I would like the following to be fed into layer_3 instead:
0 0 1
0 1 0
0 1 0
Basically, I want the maximum value in each row converted to 1 and the others to 0.
How can this be achieved in Keras?
Who decides the range of output values?
The output range of any layer in a neural network is determined by the activation function used for that layer. For example, if you use tanh as your activation function, your output values will be restricted to [-1, 1]. The values are continuous: check how the values get mapped from [-inf, +inf] (input on the x-axis) to [-1, +1] (output on the y-axis); understanding this step is very important.
What you should do is add a custom activation function that restricts your values to a step function, i.e. either 1 or 0 over [-inf, +inf], and apply it to that layer.
How do I know which function to use?
You need to create a y = some_function that satisfies your needs (the input-to-output mapping) and convert it to Python code, just like this one:
from keras import backend as K

def binaryActivationFromTanh(x, threshold=0.0):
    # squash [-inf, +inf] to [-1, 1]
    # (you can skip this step if your threshold already lies in the raw range)
    activated_x = K.tanh(x)
    # threshold, then cast the boolean tensor back to Keras' default float type
    binary_activated_x = K.cast(K.greater(activated_x, threshold), K.floatx())
    return binary_activated_x
After making your custom activation function, you can use it like this:
x = Input(shape=(1000,))
y = Dense(10, activation=binaryActivationFromTanh)(x)
Now test it and see whether you are getting the values you expect. You can then drop this piece into a bigger neural network.
I strongly discourage adding new layers just to restrict your outputs, unless the layer is purely an activation (like keras.layers.LeakyReLU).
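As a side note, if the goal is strictly to turn the row maximum into a one-hot vector, a Lambda with backend ops is another minimal sketch (argmax has no useful gradient, so layer_2 cannot be trained through this path):
from keras import backend as K
from keras.layers import Lambda

# 3 matches the width of layer_2 in the question
onehot_max = Lambda(lambda x: K.one_hot(K.argmax(x, axis=-1), 3))(layer_2)
layer_3 = Dot(1)([onehot_max, dim_svm_yhat])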
Use Numpy in between. Here is an example with a random matrix:
a = np.random.random((5, 5)) # simulate random value output of your layer
result = (a == a.max(axis=1)[:,None]).astype(int)
See also this thread: Numpy: change max in each row to 1, all other numbers to 0
You then feed result in as input to your next layer.
For wrapping the Numpy calculation you could use the Lambda layer. See examples here: https://keras.io/layers/core/#lambda
Edit:
This suggestion doesn't work. I am keeping the answer only to preserve the related comments.
