How to get the output of intermediate layers which are not connected via Sequential() function? - keras

I am new to Keras, but I have worked with pure TensorFlow before. I am trying to debug part of the following network (I will just copy a fragment; the loss function, optimizer, etc. are unimportant for this code):
#Block 1 (Conv,relu,batch) starts with 800 x 400
main_input = LNN.Input(shape=((800,400,5)),name='main_input')
enc_conv1 = LNN.Convolution2D(8,3,padding='same',activation='relu')(main_input)
enc_bn1 = LNN.BatchNormalization(axis=1)(enc_conv1)
#Block 2 (Conv,relu,batch) starts with 400 x 200
maxp1_4 = LNN.MaxPooling2D(strides=2)(enc_bn1)
enc_conv2 = LNN.Convolution2D(16,3,padding='same',activation='relu')(maxp1_4)
enc_bn2 = LNN.BatchNormalization(axis=1)(enc_conv2)
enc_conv3 = LNN.Convolution2D(16,3,padding='same',activation='relu')(enc_bn2)
enc_bn3 = LNN.BatchNormalization(axis=1)(enc_conv3)
concat1_5 = LNN.concatenate(axis=3,inputs=[enc_bn3,maxp1_4])
I have seen some examples of how to do this by adding each operation to a Sequential() model (for example, the one explained here), using the add() function. Is there a way to check the output of each layer without adding them to a model (as can also be done in TensorFlow by running a session)?

The best approach is to make a model that outputs those layers:
modelToOutputAll = Model(main_input, [enc_conv1, enc_bn1, maxp1_4, enc_conv2, enc_bn2, enc_conv3, enc_bn3, concat1_5])
For training, keep a model with only the final output:
modelForTraining = Model(main_input,concat1_5)
Both models use the exact same weights, so training one changes the other. Use each one for what you need at the moment.
Train with modelForTraining.fit(xTrain,yTrain, ...)
See intermediate layers with modelToOutputAll.predict(xInput)
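For example, a minimal usage sketch (the compile settings and the xTrain / yTrain / xInput arrays are placeholders, not part of the question):
modelForTraining.compile(optimizer='rmsprop', loss='mse')   #placeholder settings
modelForTraining.fit(xTrain, yTrain, epochs=1)

#predict() on the multi-output model returns one array per requested layer
layer_names = ['enc_conv1', 'enc_bn1', 'maxp1_4', 'enc_conv2',
               'enc_bn2', 'enc_conv3', 'enc_bn3', 'concat1_5']
for name, activation in zip(layer_names, modelToOutputAll.predict(xInput)):
    print(name, activation.shape)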

Related

Keras fitting setting in TensorFlow Extended (TFX)

I am trying to construct a TFX pipeline with a Trainer component whose Keras model is defined like this:
def run_fn(fn_args: components.FnArgs):
    transform_output = TFTransformOutput(fn_args.transform_output)
    train_dataset = input_fn(fn_args.train_files,
                             fn_args.data_accessor,
                             transform_output,
                             num_batch)
    eval_dataset = input_fn(fn_args.eval_files,
                            fn_args.data_accessor,
                            transform_output,
                            num_batch)
    history = model.fit(train_dataset,
                        epochs=num_epochs,
                        steps_per_epoch=fn_args.train_steps,
                        validation_data=eval_dataset,
                        validation_steps=fn_args.eval_steps)
This works. However, if I change fitting to the following, this doesn't work:
history = model.fit(train_dataset,
                    epochs=num_epochs,
                    batch_size=num_batch,
                    validation_split=0.1)
Now, I have two questions:
Why does fitting work only with steps_per_epoch? I couldn't find any explicit statement supporting this, but it seems to be the only way. I conclude that it must be something TFX-specific (does TFX handle input data only in a generator-like way?).
Let's say my train_dataset contains 100 instances and steps_per_epoch=1000 (with epochs=1). Does that mean my 100 input instances are fed 10x each in order to reach the defined 1000 steps? Isn't that counter-productive from a training perspective?
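For what it's worth, the generator-like hypothesis can be illustrated outside of TFX. This is a hedged, self-contained sketch (the toy model and random data are made up, not TFX code): a repeating tf.data.Dataset has no defined length, so Keras needs steps_per_epoch, while batch_size and validation_split are not accepted for dataset inputs at all.
import tensorflow as tf

features = tf.random.uniform((100, 4))
labels = tf.random.uniform((100, 1))

#batched and repeated forever, like a generator
train_dataset = (tf.data.Dataset.from_tensor_slices((features, labels))
                 .batch(10)
                 .repeat())

model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(optimizer='adam', loss='mse')

#works: steps_per_epoch defines where an "epoch" ends
model.fit(train_dataset, epochs=1, steps_per_epoch=1000)

#raises an error: batch_size / validation_split are not supported for dataset inputs
#model.fit(train_dataset, epochs=1, batch_size=10, validation_split=0.1)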

Keras. How to concatenate intermediate layers of two different models into a third model

I have two sequential models that both do a pretty good job of classifying audio. One uses MFCCs and the other waveforms. I am now trying to combine them into a third, functional-API model using one of the later Dense layers from each of the MFCC and waveform models. The example in the Keras FAQ on how to obtain the output of an intermediate layer is not working for me (https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer).
Here is my code:
mfcc_model = load_model(S01_model_local_loc)
waveform_model = load_model(T01_model_local_loc)

mfcc_input = Input(shape=(79,30,1))
mfcc_model_as_layer = Model(inputs=mfcc_model.input,
                            outputs=mfcc_model.get_layer(name='dense_9').output)

waveform_input = Input(shape=(40000,1))
waveform_model_as_layer = Model(inputs=waveform_model.input,
                                outputs=waveform_model.get_layer(name='dense_2').output)

concatenated_1024 = concatenate([mfcc_model_as_layer, waveform_model_as_layer])
model_pred = layers.Dense(2, activation='sigmoid')(concatenated_1024)
uber_model = Model(inputs=[mfcc_input, waveform_input], outputs=model_pred)
This throws the error:
AttributeError: Layer sequential_5 has multiple inbound nodes, hence the notion of "layer input" is ill-defined. Use get_input_at(node_index) instead.
Changing the inputs to the first two Model statements to inputs=mfcc_model.get_input_at(1) and inputs=waveform_model.get_input_at(1) solves that error message, but I then get this error message:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("dropout_21_input:0", shape=(?, 79, 30, 1), dtype=float32) at layer "dropout_21_input". The following previous layers were accessed without issue: []
If I remove the .get_layer statements and just take the final output of each model, the graph connects nicely.
What do I need to do to just get the output of the Dense layers that I want?
Update: I found a really hacky way of getting what I want. I popped off the layers of the MFCC and waveform models until the output layers were the ones I wanted. Then the code below seems to work. I'd love to know the right way to do this!
mfcc_input = Input(shape=(79,30,1))
waveform_input = Input(shape=(40000,1))
mfcc_model_as_layer = mfcc_model(mfcc_input)
waveform_model_as_layer = waveform_model(waveform_input)
concatenated_1024 = concatenate([mfcc_model_as_layer, waveform_model_as_layer])
model_pred = layers.Dense(2, activation='sigmoid')(concatenated_1024)
test_model = Model(inputs=[mfcc_input,waveform_input], outputs=model_pred)

keras model not showing all layers after wrapping

I have a CNN model (called cnn_model). I wrap the model with TimeDistributed to work on sequences. The new model is called lstm_model. Why can't I see the CNN layers inside lstm_model?
The code:
cnn_model = getModel(input_shape=(imageH, imageW, CHANNELS))
image_frames = Input(batch_shape=(BATCH_SIZE, TIME_STEPS, imageH, imageW, CHANNELS))
encoded_images = TimeDistributed(cnn_model)(image_frames)
x = LSTM(output_dim=256, return_sequences=True)(encoded_images)
outputs = TimeDistributed(Dense(NUM_EVENTS, activation="sigmoid"))(x)
lstm_model = Model([image_frames], outputs)
lstm_model.summary() shows only 5 layers, without any of the cnn_model layers.
On the other hand, the number of parameters indicates that the layers are indeed inside the new model (500k parameters in the LSTM layers, 2.5 million parameters from the CNN model, for a total of 3 million parameters in lstm_model).
Can anyone help?
Found them.
They are inside model.layers[1].layer.layers.
After changing the layers (in my case, unfreezing them), I had to recompile the model.
Now it works.
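A short sketch of what that looks like (the layer index and compile arguments are assumptions; check lstm_model.summary() to locate the TimeDistributed wrapper):
wrapped_cnn = lstm_model.layers[1].layer              #the cnn_model inside TimeDistributed
print([layer.name for layer in wrapped_cnn.layers])   #the "hidden" CNN layers live here

#unfreeze them, then recompile so the change takes effect
for layer in wrapped_cnn.layers:
    layer.trainable = True
lstm_model.compile(optimizer='adam', loss='binary_crossentropy')   #placeholder settings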

Using Keras like TensorFlow for gpu computing

I would like to know if Keras can be used as an interface to TensorFlow for only doing computation on my GPU.
I tested TF directly on my GPU. But for ML purposes I started using Keras, including the backend. I would find it 'comfortable' to do all my stuff in Keras instead of using two tools.
This is also a matter of curiosity.
I found some examples like this one:
http://christopher5106.github.io/deep/learning/2018/10/28/understand-batch-matrix-multiplication.html
However, this example does not actually do the calculation, nor does it take any input data.
I duplicate the snippet here:
from keras import backend as K

a = K.ones((3,4))
b = K.ones((4,5))
c = K.dot(a, b)
print(c.shape)
I would simply like to know if I can get the result numbers from this snippet above, and how?
Thanks,
Michel
Keras doesn't have an eager mode like TensorFlow; it depends on models or functions with "placeholders" to receive and output data.
So it's a little more complicated than TensorFlow for basic calculations like this.
The most user-friendly solution is to create a dummy model with one Lambda layer. (And be careful with the first dimension, which Keras will insist on treating as a batch dimension, requiring that input and output have the same batch size.)
def your_function_here(inputs):
    #if you have more than one tensor for the inputs, it's a list:
    input1, input2, input3 = inputs

    #if you don't have a batch, you should probably have a first dimension = 1 and get rid of it:
    input1 = input1[0]

    #do your calculations here

    #if you used the batch_size=1 workaround as above, add this dimension again:
    output = K.expand_dims(output, 0)
    return output
Create your model:
inputs = Input(input_shape)
#maybe inputs2 ....
outputs = Lambda(your_function_here)(list_of_inputs)
#maybe outputs2
model = Model(inputs, outputs)
And use it to predict the result:
print(model.predict(input_data))
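Putting that together for the matrix product from the question, a minimal sketch could look like this (the batch-size-1 workaround and the shapes are assumptions carried over from the snippet above):
import numpy as np
from keras import backend as K
from keras.layers import Input, Lambda
from keras.models import Model

#the inputs carry a batch dimension of 1 that the Lambda strips and restores
a_in = Input(shape=(3, 4))
b_in = Input(shape=(4, 5))

def matmul(tensors):
    a, b = tensors
    return K.expand_dims(K.dot(a[0], b[0]), 0)   #drop the batch dim, multiply, add it back

out = Lambda(matmul)([a_in, b_in])
calc_model = Model([a_in, b_in], out)

#feed actual numbers (note the leading batch dimension of 1)
a = np.ones((1, 3, 4))
b = np.ones((1, 4, 5))
print(calc_model.predict([a, b]))   #a (1, 3, 5) array filled with 4.0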

How to use hidden layer activations to construct loss function and provide y_true during fitting in Keras?

Assume I have a model like this. M1 and M2 are two layers linking left and right sides of the model.
The example model: Red lines indicate backprop directions
During training, I hope M1 can learn a mapping from L2_left activation to L2_right activation. Similarly, M2 can learn a mapping from L3_right activation to L3_left activation.
The model also needs to learn the relationship between two inputs and the output.
Therefore, I should have three loss functions for M1, M2, and L3_left respectively.
I probably can use:
model.compile(optimizer='rmsprop',
              loss={'M1': 'mean_squared_error',
                    'M2': 'mean_squared_error',
                    'L3_left': 'mean_squared_error'})
But during training, we need to provide y_true, for example:
model.fit([input_1,input_2], y_true)
In this case, y_true would be hidden-layer activations, not values from a dataset.
Is it possible to build this model and train it using its hidden layer activations?
If you have only one output, you must have only one loss function.
If you want three loss functions, you must have three outputs, and, of course, three Y vectors for training.
If you want loss functions in the middle of the model, you must take outputs from those layers.
Creating the graph of your model: (if the model is already defined, see the end of this answer)
#Here, all "SomeLayer(blabla)" could be replaced by a "SomeModel" if necessary
#Example of using a layer or a model:
#M1 = SomeLayer(blablabla)(L12)
#M1 = SomeModel(L12)
from keras.models import Model
from keras.layers import *
inLef = Input((shape1))
inRig = Input((shape2))
L1Lef = SomeLayer(blabla)(inLef)
L2Lef = SomeLayer(blabla)(L1Lef)
M1 = SomeLayer(blablaa)(L2Lef) #this is an output
L1Rig = SomeLayer(balbla)(inRig)
conc2Rig = Concatenate(axis=?)([L1Rig,M1]) #Or Add, or Multiply, however you're joining the models
L2Rig = SomeLayer(nlanlab)(conc2Rig)
L3Rig = SomeLayer(najaljd)(L2Rig)
M2 = SomeLayer(babkaa)(L3Rig) #this is an output
conc3Lef = Concatenate(axis=?)([L2Lef,M2])
L3Lef = SomeLayer(blabla)(conc3Lef) #this is an output
Creating your model with three outputs:
Now that your graph is ready and you know what the outputs are, you create the model:
model = Model([inLef,inRig], [M1,M2,L3Lef])
model.compile(loss='mse', optimizer='rmsprop')
If you want different losses for each output, then you create a list:
#example of custom loss function, if necessary
def lossM1(yTrue, yPred):
    return keras.backend.sum(keras.backend.abs(yTrue - yPred))
#compiling with three different loss functions
model.compile(loss = [lossM1, 'mse','binary_crossentropy'], optimizer =??)
But you've got to have three different yTraining too, for training with:
model.fit([input_1,input_2], [yTrainM1,yTrainM2,y_true], ....)
If your model is already defined and you didn't create its graph like I did:
Then, you have to find in yourModel.layers[i] which ones are M1 and M2, so you create a new model like this:
M1 = yourModel.layers[indexForM1].output
M2 = yourModel.layers[indexForM2].output
newModel = Model([inLef,inRig], [M1,M2,yourModel.output])
If you want two outputs to be equal:
In this case, just subtract the two outputs in a Lambda layer, and make that Lambda layer an output of your model, with expected values = 0.
Using the exact same vars as before, we'll just create two additional layers to subtract the outputs:
diffM1L1Rig = Lambda(lambda x: x[0] - x[1])([L1Rig,M1])
diffM2L2Lef = Lambda(lambda x: x[0] - x[1])([L2Lef,M2])
Now your model should be:
newModel = Model([inLef,inRig],[diffM1L1Rig,diffM2L2Lef,L3Lef])
And training will expect those two differences to be zero:
yM1 = np.zeros((shapeOfM1Output))
yM2 = np.zeros((shapeOfM2Output))
newModel.fit([input_1,input_2], [yM1,yM2,y_true], ...)
Trying to answer the last part: how to make gradients affect only one side of the model.
...Well, at first that sounds unfeasible to me. But if it is similar to "train only a part of the model", then it's totally OK: define models that only go up to a certain point and make part of the layers untrainable.
By doing that, nothing will affect those layers. If that's what you want, then you can do it:
#using the previous vars to define other models
modelM1 = Model([inLef,inRig],diffM1L1Rig)
The model above ends in diffM1L1Rig. Before compiling, you must set L2Rig untrainable:
modelM1.layers[??].trainable = False
#to find which layer is the right one, you may define them using the "name" parameter, or look at the shapes, types, etc. in modelM1.summary()
modelM1.compile(.....)
modelM1.fit([input_1, input_2], yM1)
This suggestion makes you train only a single part of the model. You can repeat the procedure for M2, locking the layers you need before compiling.
You can also define a full model taking all layers, and lock only the ones you want. But you won't be able (I think) to make half the gradients pass through one side and half through the other.
So I suggest you keep three models, the fullModel, the modelM1, and the modelM2, and cycle through them in training. One epoch each, maybe, as sketched below....
That should be tested....
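A rough sketch of that cycling idea, reusing the variables defined above (modelM2 is assumed to be built the same way as modelM1, and the epoch counts are arbitrary):
#the full model with all three outputs, as defined earlier
fullModel = Model([inLef, inRig], [M1, M2, L3Lef])
fullModel.compile(loss='mse', optimizer='rmsprop')

for cycle in range(10):
    modelM1.fit([input_1, input_2], yM1, epochs=1)      #updates only modelM1's unfrozen layers
    modelM2.fit([input_1, input_2], yM2, epochs=1)      #updates only modelM2's unfrozen layers
    fullModel.fit([input_1, input_2], [yTrainM1, yTrainM2, y_true], epochs=1)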
