I did a lot of searching and still cannot figure out how to write a custom loss function for a model with multiple outputs that interact with each other.
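I have a neural network defined as:
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model
def NeuralNetwork():
    # numNeuronsPerLayer is assumed to be defined elsewhere
    inLayer = Input((2,))
    layers = [Dense(numNeuronsPerLayer, activation='relu')(inLayer)]
    for i in range(10):
        hiddenLyr = Dense(5, activation='tanh', name="layer" + str(i + 1))(layers[i])
        layers.append(hiddenLyr)
    # Both heads read from the last hidden layer (layers[i] would skip "layer10")
    out_u = Dense(1, activation='linear', name="out_u")(layers[-1])
    out_k = Dense(1, activation='linear', name="out_k")(layers[-1])
    # Stack the two scalar outputs into one tensor of shape (batch, 2)
    outLayer = Concatenate(axis=-1)([out_u, out_k])
    model = Model(inputs=[inLayer], outputs=outLayer)
    return model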
I am now trying to define a custom loss function as follows:
def computeLoss(true, prediction):
    u_pred = prediction[:, 0]
    k_pred = prediction[:, 1]
    loss = f(u_pred) * k_pred
    return loss
Here f(u_pred) is some manipulation of u_pred. The code seems to work correctly and produces correct results when I use only u_pred (i.e., a single output from the neural network). However, the moment I include a second output for k_pred and slice the prediction tensor in the loss function, I start getting wrong results. I suspect I am mishandling multiple outputs in Keras, but I am not sure where my mistake lies. Any help on how to proceed is welcome.
I figured out that plain indexing (i.e., [:,0] or [:,1]) did not give me what I needed here. Indexing a single column drops that axis, so each slice has shape (batch,) rather than (batch, 1), which may be what broke the later operations. Instead, use the built-in TensorFlow function tf.split, which keeps the axis, as detailed in https://www.tensorflow.org/api_docs/python/tf/split?version=stable
So the code that worked was:
(u_pred, k_pred) = tf.split(prediction, num_or_size_splits=2, axis=1)
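To make the shape difference between indexing and splitting concrete, here is a minimal sketch. It uses NumPy as a stand-in, on the assumption that only the shape rules matter here (integer indexing drops an axis, splitting and range slices keep it); the array values are made up for illustration.

```python
import numpy as np

# Hypothetical batch of 4 predictions with two columns: [u, k].
prediction = np.arange(8, dtype=np.float32).reshape(4, 2)

# Integer indexing drops the last axis: shape (4,)
u_indexed = prediction[:, 0]

# Splitting keeps the last axis: each piece has shape (4, 1)
u_split, k_split = np.split(prediction, 2, axis=1)

# A range slice also keeps the axis and matches the split result
u_sliced = prediction[:, 0:1]

# Mixing the two shapes broadcasts silently to (4, 4) instead of failing
mixed = u_indexed * k_split

print(u_indexed.shape, u_split.shape, u_sliced.shape, mixed.shape)
```

If a (batch,) tensor meets a (batch, 1) tensor anywhere in the loss, the result is a (batch, batch) matrix rather than an error, which is one plausible source of silently wrong values.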
Related
For my classification problem I want to use a loss function normally used for regression, such as Mean Absolute Error.
Assume "y_pred" and "y_true" are one-hot encoded, but for MAE I need them as real numbers.
In this first case I get the error: ValueError: No gradients provided for any variable
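import tensorflow as tf
from tensorflow.keras import backend as K
# Note: Keras calls a loss as fn(y_true, y_pred); MAE is symmetric,
# so the swapped argument names do not change the value.
def AgeAccuracyRegularity(y_pred, y_true):
    mae_func = tf.keras.losses.MeanAbsoluteError()
    # Convert one-hot vectors to 1-based age indices
    y_pred_ages = K.cast(K.argmax(y_pred, axis=-1) + 1, dtype='float32')
    y_true_ages = K.cast(K.argmax(y_true, axis=-1) + 1, dtype='float32')
    res = mae_func(y_pred_ages, y_true_ages)
    return res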
But if I manipulate the result in this seemingly pointless way
def AgeAccuracyRegularity(y_pred, y_true):
    mae_func = tf.keras.losses.MeanAbsoluteError()
    y_pred_ages = K.cast(K.argmax(y_pred, axis=-1) + 1, dtype='float32')
    y_true_ages = K.cast(K.argmax(y_true, axis=-1) + 1, dtype='float32')
    res = mae_func(y_pred_ages, y_true_ages)
    mae = mae_func(y_pred, y_true)
    return res - mae + mae
it works. I checked the classifier output and both "mae" and "res" inside the custom loss function, and they have the same size and type.
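For a classification problem you should use a cross-entropy loss: "categorical_crossentropy" when the labels are one-hot encoded, as here, or "sparse_categorical_crossentropy" when they are integer class indices. MAE is intended for regression problems.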
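Some background on the error: argmax has no gradient, so a loss built only from argmax outputs gives the optimizer nothing to work with. In the second version the two mae terms cancel, so it most likely only silences the error rather than training on the age error. If an MAE-style loss on ages is still wanted, one differentiable alternative is to replace argmax with a probability-weighted average age. This is not from the posts above, just a minimal NumPy sketch of the arithmetic; the function name and data are hypothetical, and it assumes the model outputs softmax probabilities.

```python
import numpy as np

def expected_age_mae(y_true_onehot, y_pred_probs):
    """MAE between true ages and probability-weighted predicted ages."""
    num_classes = y_pred_probs.shape[-1]
    # Ages are 1-based, matching the "+1" in the question's code.
    ages = np.arange(1, num_classes + 1, dtype=np.float32)
    true_ages = (y_true_onehot * ages).sum(axis=-1)
    # Weighted average instead of argmax: every probability contributes,
    # so the same expression written with tensor ops has a gradient.
    pred_ages = (y_pred_probs * ages).sum(axis=-1)
    return np.abs(true_ages - pred_ages).mean()

# Hypothetical batch of two samples with three age classes.
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype=np.float32)  # ages 3 and 1
y_pred = np.array([[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]], dtype=np.float32)
loss = expected_age_mae(y_true, y_pred)
print(loss)
```

The same computation can be written with backend tensor operations inside a Keras loss; whether it trains better than cross-entropy depends on the data.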
The current Keras Captcha OCR model returns a CTC encoded output, which requires decoding after inference.
To decode this, one needs to run a decoding utility function after inference as a separate step.
preds = prediction_model.predict(batch_images)
pred_texts = decode_batch_predictions(preds)
The decoding utility function uses keras.backend.ctc_decode, which in turn uses either a greedy or beam search decoder.
# A utility function to decode the output of the network
def decode_batch_predictions(pred):
    input_len = np.ones(pred.shape[0]) * pred.shape[1]
    # Use greedy search. For complex tasks, you can use beam search
    results = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)[0][0][
        :, :max_length
    ]
    # Iterate over the results and get back the text
    output_text = []
    for res in results:
        res = tf.strings.reduce_join(num_to_char(res)).numpy().decode("utf-8")
        output_text.append(res)
    return output_text
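For readers unfamiliar with what the greedy decoder does, here is a minimal NumPy sketch of the idea: take the most likely class at each time step, collapse consecutive repeats, then drop the blank symbol. It is an illustration of the technique, not the Keras implementation; the function name, the tiny probability table, and the choice of blank index are assumptions made for the example.

```python
import numpy as np

def greedy_ctc_decode(probs, blank_index):
    """Greedy CTC decoding for one sequence of shape (time_steps, num_classes)."""
    best_path = np.argmax(probs, axis=-1)  # most likely class per time step
    decoded = []
    previous = None
    for idx in best_path:
        # Keep a label only if it differs from the previous step and is not blank.
        if idx != previous and idx != blank_index:
            decoded.append(int(idx))
        previous = idx
    return decoded

# Hypothetical 3-class example: classes 0 and 1 are characters, 2 is the blank.
probs = np.array([
    [0.8, 0.1, 0.1],  # 0
    [0.7, 0.2, 0.1],  # 0 again (repeat, collapsed)
    [0.1, 0.1, 0.8],  # blank
    [0.6, 0.3, 0.1],  # 0 (kept: a blank separates it from the earlier 0)
    [0.1, 0.8, 0.1],  # 1
])
result = greedy_ctc_decode(probs, blank_index=2)
print(result)
```

The blank between the two runs of class 0 is what allows a doubled character to survive the collapse step.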
I would like to train a Captcha OCR model using Keras that returns the CTC decoded as an output, without requiring an additional decoding step after inference.
How would I achieve this?
The most robust way to achieve this is by adding a method which is called as part of the model definition:
def CTCDecoder():
    def decoder(y_pred):
        input_shape = tf.keras.backend.shape(y_pred)
        input_length = tf.ones(shape=input_shape[0]) * tf.keras.backend.cast(
            input_shape[1], 'float32')
        unpadded = tf.keras.backend.ctc_decode(y_pred, input_length)[0][0]
        unpadded_shape = tf.keras.backend.shape(unpadded)
        padded = tf.pad(unpadded,
                        paddings=[[0, 0], [0, input_shape[1] - unpadded_shape[1]]],
                        constant_values=-1)
        return padded
    return tf.keras.layers.Lambda(decoder, name='decode')
Then defining the model as follows:
prediction_model = keras.models.Model(inputs=inputs, outputs=CTCDecoder()(model.output))
Credit goes to tulasiram58827.
This implementation supports exporting to TFLite, but only in float32. Quantized (int8) TFLite export still throws an error, and there is an open ticket for it with the TF team.
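Note that the decode layer above returns integer label indices padded with -1, not strings, so a small lookup is still needed to get text. A minimal pure-Python sketch of that last step follows; the helper name, the vocabulary, and the sample rows are hypothetical, and the real index-to-character mapping depends on how the labels were encoded during training.

```python
def indices_to_texts(padded, vocabulary, pad_value=-1):
    """Map rows of label indices to strings, skipping the padding value."""
    texts = []
    for row in padded:
        chars = [vocabulary[i] for i in row if i != pad_value]
        texts.append("".join(chars))
    return texts

# Hypothetical vocabulary and model output (two sequences, padded with -1).
vocabulary = ["a", "b", "c", "d"]
padded = [[0, 1, 2, -1, -1], [3, 3, -1, -1, -1]]
texts = indices_to_texts(padded, vocabulary)
print(texts)
```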
Your question can be interpreted in two ways. One is: I want a neural network that solves the problem with the CTC decoding step already part of what the network learned. The other is that you want a Model class that does the CTC decoding internally, without relying on an external helper function.
I don't know the answer to the first question, and I cannot even tell whether it is feasible. In any case, it sounds like a difficult theoretical problem, and if you don't have luck here, you might want to try posting it on datascience.stackexchange.com, which is a more theory-oriented community.
Now, if what you are trying to solve is the second, engineering version of the problem, that's something I can help you with. The solution for that problem is the following:
You need to subclass keras.models.Model with a class that has the method you want. I went over the tutorial in the link you posted and came up with the following class:
class ModifiedModel(keras.models.Model):
    # A utility function to decode the output of the network
    def decode_batch_predictions(self, pred):
        input_len = np.ones(pred.shape[0]) * pred.shape[1]
        # Use greedy search. For complex tasks, you can use beam search
        results = keras.backend.ctc_decode(pred, input_length=input_len, greedy=True)[0][0][
            :, :max_length
        ]
        # Iterate over the results and get back the text
        output_text = []
        for res in results:
            res = tf.strings.reduce_join(num_to_char(res)).numpy().decode("utf-8")
            output_text.append(res)
        return output_text
    def predict_texts(self, batch_images):
        preds = self.predict(batch_images)
        return self.decode_batch_predictions(preds)
You can give it the name you want, it's just for illustration purposes.
With this class defined, you would replace the lines
# Get the prediction model by extracting layers till the output layer
prediction_model = keras.models.Model(
    model.get_layer(name="image").input, model.get_layer(name="dense2").output
)
with
prediction_model = ModifiedModel(
    model.get_layer(name="image").input, model.get_layer(name="dense2").output
)
And then you can replace the lines
preds = prediction_model.predict(batch_images)
pred_texts = decode_batch_predictions(preds)
with
pred_texts = prediction_model.predict_texts(batch_images)
I have two sequential models that both do a pretty good job of classifying audio. One uses mfccs and the other wave forms. I am now trying to combine them into a third functional API model using one of the later Dense layers from each of the mfcc and wave form models. The example about how to get the intermediate layers in the Keras FAQ is not working for me (https://keras.io/getting-started/faq/#how-can-i-obtain-the-output-of-an-intermediate-layer).
Here is my code:
mfcc_model = load_model(S01_model_local_loc)
waveform_model = load_model(T01_model_local_loc)
mfcc_input = Input(shape=(79,30,1))
mfcc_model_as_layer = Model(inputs=mfcc_model.input,
                            outputs=mfcc_model.get_layer(name='dense_9').output)
waveform_input = Input(shape=(40000,1))
waveform_model_as_layer = Model(inputs=waveform_model.input,
                                outputs=waveform_model.get_layer(name='dense_2').output)
concatenated_1024 = concatenate([mfcc_model_as_layer, waveform_model_as_layer])
model_pred = layers.Dense(2, activation='sigmoid')(concatenated_1024)
uber_model = Model(inputs=[mfcc_input,waveform_input], outputs=model_pred)
This throws the error:
AttributeError: Layer sequential_5 has multiple inbound nodes, hence the notion of "layer input" is ill-defined. Use get_input_at(node_index) instead.
Changing the inputs to the first two Model statements to inputs=mfcc_model.get_input_at(1) and inputs=waveform_model.get_input_at(1) solves that error message, but I then get this error message:
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("dropout_21_input:0", shape=(?, 79, 30, 1), dtype=float32) at layer "dropout_21_input". The following previous layers were accessed without issue: []
If I remove the .get_layer statements and just take the final output of the model the graph connects nicely.
What do I need to do to just get the output of the Dense layers that I want?
Update: I found a really hacky way of getting what I want. I pop'ed off the layers of the mfcc and wave form models until the output layers were what I wanted. Then the code below seems to work. I'd love to know the right way to do this!
mfcc_input = Input(shape=(79,30,1))
waveform_input = Input(shape=(40000,1))
mfcc_model_as_layer = mfcc_model(mfcc_input)
waveform_model_as_layer = waveform_model(waveform_input)
concatenated_1024 = concatenate([mfcc_model_as_layer, waveform_model_as_layer])
model_pred = layers.Dense(2, activation='sigmoid')(concatenated_1024)
test_model = Model(inputs=[mfcc_input,waveform_input], outputs=model_pred)
I would like to know if Keras can be used as an interface to TensorFlow purely for doing computation on my GPU.
I tested TF directly on my GPU. But for ML purposes, I started using Keras, including the backend. I would find it more comfortable to do everything in Keras instead of using two tools.
This is also a matter of curiosity.
I found some examples like this one:
http://christopher5106.github.io/deep/learning/2018/10/28/understand-batch-matrix-multiplication.html
However this example does not actually do the calculation.
It also does not get input data.
I duplicate the snippet here:
from keras import backend as K
a = K.ones((3,4))
b = K.ones((4,5))
c = K.dot(a, b)
print(c.shape)
I would simply like to know whether I can get the resulting numbers from the snippet above, and how.
Thanks,
Michel
Keras doesn't have an eager mode like TensorFlow, and it depends on models or functions with "placeholders" to receive and output data.
So, it's a little more complicated than TensorFlow for basic calculations like this.
The most user-friendly solution would be to create a dummy model with one Lambda layer. (Be careful with the first dimension: Keras will insist on treating it as a batch dimension and will require that input and output have the same batch size.)
from keras import backend as K
def your_function_here(inputs):
    # if you have more than one tensor for the inputs, it's a list:
    input1, input2, input3 = inputs
    # if you don't have a batch, you should probably have a first dimension = 1 and take it:
    input1 = input1[0]
    # do your calculations here; this placeholder just passes input1 through
    output = input1
    # if you used the batch_size=1 workaround as above, add this dimension again:
    output = K.expand_dims(output, 0)
    return output
Create your model:
inputs = Input(input_shape)
#maybe inputs2 ....
outputs = Lambda(your_function_here)(list_of_inputs)
#maybe outputs2
model = Model(inputs, outputs)
And use it to predict the result:
print(model.predict(input_data))
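As a sanity check on what numbers the snippet from the question should produce, the same matrix product can be computed with NumPy. This is only a reference for the expected values, not a Keras solution. Within Keras itself, K.eval(c) is the backend call intended for turning a backend tensor into a NumPy array; it is not run here.

```python
import numpy as np

# Same shapes as the question's K.ones((3,4)) and K.ones((4,5)).
a = np.ones((3, 4))
b = np.ones((4, 5))

# Matrix product: each entry sums four products of 1 * 1.
c = a.dot(b)

print(c.shape)
print(c)
```

Every entry of the result is 4.0 because each one is a dot product of two length-4 vectors of ones.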
I am new to Keras, but I have worked with pure TensorFlow before. I am trying to debug the following network (I will just copy a fragment; the loss function, optimizer, etc. are not important for this question).
#Block 1 (Conv,relu,batch) starts with 800 x 400
main_input = LNN.Input(shape=((800,400,5)),name='main_input')
enc_conv1 = LNN.Convolution2D(8,3,padding='same',activation='relu')(main_input)
enc_bn1 = LNN.BatchNormalization(axis=1)(enc_conv1)
#Block 2 (Conv,relu,batch) starts with 400 x 200
maxp1_4 = LNN.MaxPooling2D(strides=2)(enc_bn1)
enc_conv2 = LNN.Convolution2D(16,3,padding='same',activation='relu')(maxp1_4)
enc_bn2 = LNN.BatchNormalization(axis=1)(enc_conv2)
enc_conv3 = LNN.Convolution2D(16,3,padding='same',activation='relu')(enc_bn2)
enc_bn3 = LNN.BatchNormalization(axis=1)(enc_conv3)
concat1_5 = LNN.concatenate(axis=3,inputs=[enc_bn3,maxp1_4])
I have seen some examples of how to do this by adding each operation to a Sequential() model (for example, the one explained here, but with the add() function). Is there a way to check the output of each layer without adding them to a model (as can be done in TensorFlow by creating a session)?
The best approach is to make a model that outputs those layers:
modelToOutputAll = Model(main_input, [enc_conv1, enc_bn1, maxp1_4, enc_conv2, enc_bn2, enc_conv3, enc_bn3, concat1_5])
For training, keep a model with only the final output:
modelForTraining = Model(main_input,concat1_5)
Both models use the exact same weights, so training one changes the other. Use whichever one fits what you need at the moment.
Train with modelForTraining.fit(xTrain,yTrain, ...)
See intermediate layers with modelToOutputAll.predict(xInput)