Why does this code work fine for the loss function but the metrics fail after one iteration with "ValueError: operands could not be broadcast together with shapes (32,) (24,) (32,)"?
If I use "categorical_crossentropy" in quotes then it works. And my custom metric looks identical to the one in keras.losses.
import keras.backend as K

def categorical_crossentropy(y_true, y_pred):
    return K.categorical_crossentropy(y_pred, y_true)

fc.compile(optimizer=Adam(.01), loss=categorical_crossentropy, metrics=[categorical_crossentropy])
fc.fit(xtrain, ytrain, validation_data=(xvalid, yvalid), verbose=0,
       callbacks=[TQDMNotebookCallback(leave_inner=True, leave_outer=True)],
       nb_epoch=2)
It works if I import categorical_crossentropy from keras.metrics rather than importing K. I still have no idea why the above doesn't work, but at least this is a solution.
Also, it looks like the loss function does not need to be listed in the metrics parameter anyway, as it is automatically calculated and shown for training and validation.
I have a highly imbalanced data set from which I want to get both classification (binary) as well as probabilities. I have managed to use logistic regression as well as random forest to obtain results from cross_val_predict using class weights.
I am aware that RandomForestClassifier and LogisticRegression can take class weight as an argument while KNeighborsRegressor and GaussianNB do not. However, the documentation for KNN and NB says that I can use fit, which accepts sample weights:
fit(self, X, y, sample_weight=None)
So I was thinking of working around it by calculating class weights and using these to create an array of sample weights depending on the classification of the sample. Here is the code for that:
c_w = class_weight.compute_class_weight('balanced', classes=np.unique(y), y=y)
sw = []
for i in range(len(y)):
    if y[i] == False:
        sw.append(c_w[0])
    else:
        sw.append(c_w[1])
I am not sure whether this workaround makes sense, but I managed to fit the model using this method, and I seem to get better results for my smaller class.
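As a rough illustration of the workaround described above, here is a minimal sketch in plain Python. It assumes the 'balanced' heuristic is n_samples / (n_classes * count of that class), which is how the scikit-learn documentation describes it. The function names balanced_class_weights and sample_weights_from_classes are made up for this example.

```python
from collections import Counter

def balanced_class_weights(y):
    """Weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}

def sample_weights_from_classes(y):
    """Give every sample the weight of the class it belongs to."""
    weights = balanced_class_weights(y)
    return [weights[label] for label in y]

y = [False, False, False, True]   # toy labels with a 3:1 imbalance
sw = sample_weights_from_classes(y)
print(sw)                         # the single True sample gets the largest weight
```

Looking the weight up by label avoids the explicit if/else and also works for more than two classes.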
The issue now is that I want to use this method in sklearn's cross_val_predict(), however I am not managing to pass sample weights through cross validation.
I have 2 questions:
Does my workaround to use sample weights to substitute class weights make sense?
Is there a way to pass sample weights through cross_val_predict just like you would when you use fit without cross validation?
Please see the response to this post for a description of the difference between sample and class weights. In general, if you use class weights, you "make your model aware" of class imbalance. If you use sample weights, you make your model aware that some samples must be "considered more carefully" or not taken into account at all.
The fit_params argument should do the job, see here:
fit_params : dict, default=None - parameters to pass to the fit method of the estimator.
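To make explicit what has to happen to the weights during cross validation, here is a hand-rolled sketch in plain Python. It does not call scikit-learn: WeightedPrior, kfold_indices and cross_val_predict_weighted are hypothetical names invented for this example. The only point is that each fold must be fitted with the weights of its own training rows.

```python
def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) for contiguous, unshuffled folds."""
    sizes = [n_samples // n_splits + (1 if i < n_samples % n_splits else 0)
             for i in range(n_splits)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

class WeightedPrior:
    """Toy estimator: predicts the weighted frequency of the positive class."""
    def fit(self, X, y, sample_weight=None):
        if sample_weight is None:
            sample_weight = [1.0] * len(y)
        positive = sum(w for w, label in zip(sample_weight, y) if label)
        self.p_ = positive / sum(sample_weight)
        return self

    def predict_proba(self, X):
        return [self.p_ for _ in X]

def cross_val_predict_weighted(make_estimator, X, y, sample_weight, n_splits=2):
    """Out-of-fold predictions; weights are sliced with the training indices."""
    preds = [None] * len(y)
    for train_idx, test_idx in kfold_indices(len(y), n_splits):
        est = make_estimator().fit(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            sample_weight=[sample_weight[i] for i in train_idx],
        )
        for i, p in zip(test_idx, est.predict_proba([X[i] for i in test_idx])):
            preds[i] = p
    return preds

X = [[0], [1], [2], [3]]
y = [False, True, False, False]
preds = cross_val_predict_weighted(WeightedPrior, X, y, sample_weight=[1, 3, 1, 1])
print(preds)
```

With scikit-learn itself, the equivalent would be handing the full-length weight array to cross_val_predict through fit_params, as quoted above.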
I use a custom loss function in Keras. Now I want to use sample weights with it.
I've searched on Google, and some articles suggest model.fit(X, y, sample_weight=custom_weights)
But I want to use the sample weights directly in the custom loss function. My custom loss function is quite complex, and for some reason I need to process the sample weights directly.
for example:
custom_weights = np.array([1,2,3,4,5,6,7,8,9,10])

# my failed attempt
def custom_loss_function(y_true, y_pred, custom_weights):
    return K.mean(K.abs(y_pred - y_true) * custom_weights, axis=-1)
Note: my real custom_loss_function is very complex. In this question I use "MAE" as an example to simplify the problem, so we can focus on answering "how to use sample weights in custom_loss_function".
How do I do this correctly?
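One workaround that is sometimes suggested, shown here only as a sketch of the arithmetic in plain Python: since a Keras loss function receives just y_true and y_pred, each sample's weight can be packed next to its target and unpacked again inside the loss. The names pack_targets and weighted_mae are made up for this example, and in a real model the same steps would have to be written with backend ops on tensors.

```python
def pack_targets(y, weights):
    """Pair every target with its sample weight: [[target, weight], ...]."""
    return [[t, w] for t, w in zip(y, weights)]

def weighted_mae(y_true_packed, y_pred):
    """MAE in which each absolute error is scaled by the packed weight."""
    total = 0.0
    for (target, weight), pred in zip(y_true_packed, y_pred):
        total += abs(pred - target) * weight
    return total / len(y_pred)

packed = pack_targets([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
loss = weighted_mae(packed, [1.0, 4.0, 0.0])
print(loss)
```

The design choice here is that the weights travel with the targets, so they stay aligned with the samples when the data is shuffled or split into batches.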
In Keras I often see people compile a model with mean square error function and "acc" as metrics.
model.compile(optimizer=opt, loss='mse', metrics=['acc'])
I have been reading about acc, but I cannot find the algorithm behind it.
What if I changed my loss function to binary crossentropy, for example, and used 'acc' as the metric? Would this be the same metric as in the first case, or does Keras change acc based on the loss function (binary crossentropy in this case)?
Check the source code from line 375. The metric_fn changes depending on the loss function, so it is handled automatically by Keras.
If you want to compare models that use different loss functions, it may in some cases be necessary to specify which accuracy method you want to grade your models with, so that the models are actually evaluated with the same test.
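To show why the choice matters, here is a plain-Python sketch of two accuracy definitions that 'acc' can stand for. It assumes binary accuracy thresholds the prediction at 0.5 and categorical accuracy compares argmax positions; the exact Keras implementations may differ between versions, and these helper functions are written for illustration only.

```python
def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold."""
    hits = [int((p > threshold) == bool(t)) for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)

def categorical_accuracy(y_true, y_pred):
    """Fraction of rows whose largest prediction matches the one-hot target."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)
    hits = [int(argmax(t) == argmax(p)) for t, p in zip(y_true, y_pred)]
    return sum(hits) / len(hits)

bin_acc = binary_accuracy([1, 0, 1, 1], [0.9, 0.2, 0.4, 0.7])
cat_acc = categorical_accuracy([[0, 1, 0], [1, 0, 0]],
                               [[0.1, 0.7, 0.2], [0.3, 0.4, 0.3]])
print(bin_acc, cat_acc)
```

Naming the metric explicitly instead of relying on 'acc' removes the ambiguity when comparing models.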
My tremendously stripped-down code looks like:
#!/usr/bin/python3
from keras.layers import Input
from keras.layers.core import Dense
from keras.models import Model
import numpy as np
inp = Input(shape=[1])
out = Dense(units=1, activation='linear')(inp)
model = Model(inputs=inp, outputs=out)
model.compile(loss='mean_absolute_error',
              optimizer='rmsprop')
x=np.array([[0]])
y=np.array([[42]])
model.fit(x,y,epochs=1000, verbose=False)
prediction = model.predict(x)
print(prediction)
It outputs [[1.0091327]]
The model has exactly two parameters: a weight and bias for its 1-dimensional output. And the weight doesn't matter because x is always 0. This should be pretty easy to train.
If instead of 42 I use 0.42 or -0.42 for y it works fine (4.2 and -42 do not). So I figure there must be some sort of normalization somewhere softly compressing either outputs or biases toward [-1,1].
Does anyone know what this normalization is and how to turn it off?
(Before anyone tells me I shouldn't use neural nets for something this silly, my real code does a lot more. I wrote this stripped version for clarity and debugging.)
No, there is no built-in normalization; that is the user's job.
What you are seeing is the reason why we use normalization: without it the optimization problem is a lot harder. When I run the example, the loss does not get anywhere close to zero and stays around 41.
If you make some changes, like using a mean squared error loss and running this example for 50K epochs, then it converges to zero loss and outputs 42 as expected.
A common beginner's mistake is to look at the prediction without first looking at the training loss. If the loss is high, the predictions will be wrong.
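A small simulation can make the slow progress concrete. The sketch below is plain Python, not Keras: it assumes a textbook RMSprop update without momentum, a learning rate of 0.001 and rho of 0.9 (which I believe are the Keras defaults), a bias that starts at zero, and one update per epoch because there is a single sample. Under those assumptions each step moves the bias by roughly the learning rate, so 1000 epochs cover a distance of only about 1.

```python
import math

def rmsprop_mae_bias(target, steps, lr=0.001, rho=0.9, eps=1e-7):
    """Train a single bias with MAE loss and a plain RMSprop update."""
    bias, avg_sq = 0.0, 0.0
    for _ in range(steps):
        error = bias - target
        grad = (error > 0) - (error < 0)        # d|error|/d bias = sign(error)
        avg_sq = rho * avg_sq + (1 - rho) * grad * grad
        bias -= lr * grad / (math.sqrt(avg_sq) + eps)
    return bias

final_bias = rmsprop_mae_bias(target=42.0, steps=1000)
small_target_bias = rmsprop_mae_bias(target=0.42, steps=1000)
print(final_bias, small_target_bias)
```

This is consistent with the observation in the question that 0.42 is reached within 1000 epochs while 4.2 and 42 are not: the targets differ only in how far the bias has to travel.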
I am playing around with Keras and trying to predict a word from its context, e.g. from the sentence "I have to say the food was tasty!" I hope to get something like this:
[say the ? was tasty] -> food, meals, spaghetti, drinks
However, my problem currently is that the network I am training appears to learn just the probabilities of the single words, and not the probabilities they have in a particular context.
Since the frequency of words is not balanced I thought I might/could/should apply weights to my loss function - which is currently the binary-cross entropy function.
I simply multiply the error by the complementary probability (1 minus the probability) of each word:
def weighted_binary_crossentropy(y_true, y_pred):
    return K.mean(K.binary_crossentropy(y_pred, y_true) * (1-word_weights), axis=1)
This function is being used by the model as loss function:
model.compile(optimizer='adam', loss=weighted_binary_crossentropy)
However, my results are exactly the same, and I am not sure whether my model is simply broken or whether I am using the loss parameter/function wrong.
Is my weighted_binary_crossentropy() function doing what I just described? I ask because for some reason this works similarly:
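For reference, here is what the weighting described above computes for a single sample, written as a plain-Python sketch rather than with Keras backend ops. The argument word_freq stands in for word_weights and is assumed to hold each word's relative frequency; the clipping constant eps is also an assumption of this example.

```python
import math

def weighted_binary_crossentropy(y_true, y_pred, word_freq, eps=1e-7):
    """Mean over words of binary cross-entropy scaled by (1 - frequency)."""
    total = 0.0
    for t, p, f in zip(y_true, y_pred, word_freq):
        p = min(max(p, eps), 1 - eps)          # clip to avoid log(0)
        bce = -(t * math.log(p) + (1 - t) * math.log(1 - p))
        total += bce * (1 - f)                 # rare words (small f) count more
    return total / len(y_true)

loss = weighted_binary_crossentropy([1, 0], [0.5, 0.5], [0.9, 0.1])
print(loss)
```

Note that when most frequencies are close to zero, the factor (1 - f) is close to 1 for nearly every word, so the weighted loss barely differs from the unweighted one.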
word_weights), axis=1)
Actually, as one can read in the documentation of the fit function, you can provide sample_weight, which seems to be exactly what you want to use.