Custom Metrics Keras [duplicate] - keras

I need help creating a custom metric in Keras. I need to count how many times my error is equal to zero (y_pred - y_true = 0).
I tried this:
n_train = 1147  # Number of samples on training set
c = 0  # Variable to count
def our_metric(y_true, y_pred):
    if y_true-y_pred == 0:
        c += 1
    return c/n_train
But I'm getting this error:
OperatorNotAllowedInGraphError: using a tf.Tensor as a Python bool
is not allowed in Graph execution. Use Eager execution or decorate
this function with @tf.function.
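EDIT: Using the solution proposed here:
Creating custom conditional metric with Keras
I solved my problem like this:
import tensorflow as tf
from tensorflow.keras import backend as K  # assumed import; the original snippet omits it
c = tf.constant(0)  # left over from the first attempt; not used below
def our_metric(y_true, y_pred):
    mask = K.equal(y_pred, y_true)  # True where y_pred == y_true
    mask = K.cast(mask, K.floatx())
    s = K.sum(mask)
    return s/n_train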

You can't use a Python comparison on tensors in plain TensorFlow (using static graphs): an if statement needs a Python bool, but in graph mode y_true - y_pred == 0 is a symbolic tensor that has no concrete value yet.
You have to enable eager mode, which lets you use Python control statements (such as if or loops) on tensor values. Decorate your function with @tf.function as the error suggests, or call tf.enable_eager_execution() at the beginning of your script.
You may also want to update your code to TF 2.0; it's more intuitive and has eager mode on by default.

There are numerous ways to use Keras backend functions to count the number of times a value is equal to zero. You just have to think a bit outside of the box. Here is an example:
diff = y_true - y_pred
count = K.sum(K.cast(K.equal(diff, K.zeros_like(diff)), 'int8'))
There's also a tf.count_nonzero operation that could be used, but mixing keras and explicit tensorflow can cause issues.
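The mask, cast, and sum steps used in the answers above can be tried out without a model. The following is a minimal NumPy sketch of the same logic, not Keras code: the function name exact_match_fraction and the sample arrays are made up for illustration, and np.equal / astype / sum stand in for K.equal / K.cast / K.sum.

```python
import numpy as np

def exact_match_fraction(y_true, y_pred):
    """Fraction of predictions that are exactly equal to the targets."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    mask = np.equal(y_pred, y_true)        # True where the error is zero
    hits = mask.astype(np.float64).sum()   # cast to float, then count
    return hits / mask.size                # divide by the number of elements seen

# Illustrative data (hypothetical values)
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.0, 2.5, 3.0, 0.0])
result = exact_match_fraction(y_true, y_pred)
print(result)  # 0.5
```

The sketch divides by the number of elements it was given rather than by a fixed n_train. Since Keras evaluates a metric function per batch, a per-batch fraction may be closer to what you want, but that is a design choice to check against your own setup.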

Related

How to use Keras Conv2D layers with OpenAI gym?

Using OpenAI's gym environment, I've created my own environment in which the observation space is of Box type with shape (21,21,1).
The intention is to use a Keras Conv2D layer as the model's input. Ideally, the shape going into this model would be (None,21,21,1), with None representing the batch size. Keras's documentation is here: https://keras.io/api/layers/convolution_layers/convolution2d/
The issue I'm having is that an extra dimension is required when the input shape is checked, so the shape it expects is (None,1,21,21,1). This prevents me from using MaxPooling layers in the model. After investigating the keras-rl library, I found that two functions add this dimensionality.
The first function is found in memory.py, where the current observation is put into a list and returned as such:
def get_recent_state(self, current_observation):
    """Return list of last observations
    # Argument
        current_observation (object): Last observation
    # Returns
        A list of the last observations
    """
    # This code is slightly complicated by the fact that subsequent observations might be
    # from different episodes. We ensure that an experience never spans multiple episodes.
    # This is probably not that important in practice but it seems cleaner.
    state = [current_observation]
    idx = len(self.recent_observations) - 1
    for offset in range(0, self.window_length - 1):
        current_idx = idx - offset
        current_terminal = self.recent_terminals[current_idx - 1] if current_idx - 1 >= 0 else False
        if current_idx < 0 or (not self.ignore_episode_boundaries and current_terminal):
            # The previously handled observation was terminal, don't add the current one.
            # Otherwise we would leak into a different episode.
            break
        state.insert(0, self.recent_observations[current_idx])
    while len(state) < self.window_length:
        state.insert(0, zeroed_observation(state[0]))
    return state
The second function is called just after and computes the Q values based on the recent observation. It creates a list of the state when passing onto "compute_batch_q_values".
def compute_q_values(self, state):
    q_values = self.compute_batch_q_values([state]).flatten()
    assert q_values.shape == (self.nb_actions,)
    return q_values
I understand that one extra dimension should be added to represent the batch size, but why is a dimension added twice? Can anyone explain why this is, or how to use Conv2D layers with OpenAI gym?
Thanks.
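Reading the two quoted functions together suggests where the two leading axes come from: get_recent_state returns a list of window_length observations (the window axis), and compute_q_values wraps that list once more (the batch axis). The NumPy sketch below only traces the shapes; window_length = 1 is an assumption chosen because it matches the extra axis of size 1 in the question.

```python
import numpy as np

window_length = 1                      # assumed value, matches the size-1 extra axis
observation = np.zeros((21, 21, 1))    # one gym observation

# get_recent_state: a list holding the last `window_length` observations
state = [observation] * window_length

# compute_q_values: the state is wrapped in another list to form a batch of one
batch = np.array([state])
print(batch.shape)       # (1, 1, 21, 21, 1)

# Dropping the window axis is only valid when window_length == 1
conv_input = np.squeeze(batch, axis=1)
print(conv_input.shape)  # (1, 21, 21, 1)
```

If this reading is right, one option would be to give the model an input shape of (1, 21, 21, 1) and remove the window axis before the first Conv2D layer, for example with a Keras Reshape layer. That is a suggestion under the stated assumption, not something the quoted code confirms.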

Getting a Scoring Function by Name in scikit-learn

In scikit-learn, there is the notion of a scoring function. If we have some predicted labels and the true labels, we can get the score by calling scoring(y_true, y_predict). An example of such a scoring function is sklearn.metrics.accuracy_score.
A scoring function is not to be confused with a scorer, which is an object that can be called as scorer(estimator, X, y_true).
There are many built-in scorers in scikit-learn, and it is possible to get them by their string names. For example, we can get the scorer corresponding to the name 'accuracy' by calling sklearn.metrics.get_scorer("accuracy").
But it turns out that there is no obvious mechanism to access the built-in scoring functions by name at run time, by passing in the name as a string. For example, there is no way to access sklearn.metrics.accuracy_score by the name 'accuracy'.
For example, if at run time the program holds the name of the scoring function in a variable name, I am looking for a mechanism get_scoring_function() such that get_scoring_function(name) returns the scoring function. Note that the value of name is not known when the script is written.
Is there any way to access the built-in scoring functions by their names at run time, by passing in the names as strings?
You can use the get_scorer() function, which accepts a string as an argument, and then get the _score_func attribute of the returned object.
So for example
from sklearn.metrics import get_scorer
get_scorer('accuracy')._score_func(y_true, y_pred)
is equivalent to
from sklearn.metrics import accuracy_score
accuracy_score(y_true, y_pred)
I faced this task myself, and I haven't found a better way to access metrics by name than the sklearn.metrics.get_scorer function. Its drawback is that you have to pass an estimator to the scorer, not predictions. I tried @collinb9's recommendation, but it requires accessing a protected attribute, and in my case that led to unpleasant consequences, namely incorrectly calculated metrics.
Here is a short example showing the problem.
from sklearn import datasets, model_selection, linear_model, metrics
features, labels = datasets.make_regression(1000, random_state=123)
train_features, test_features, train_labels, test_labels = model_selection.train_test_split(features, labels, test_size=0.1, random_state=567)
model = linear_model.LinearRegression()
model.fit(train_features, train_labels)
print(f'variant 1 neg_mse = {metrics.get_scorer("neg_mean_squared_error")(model, test_features, test_labels)}')
print(f'variant 1 neg_rmse = {metrics.get_scorer("neg_root_mean_squared_error")(model, test_features, test_labels)}\n')
preds = model.predict(test_features)
print(f'variant 2 mse = {metrics.mean_squared_error(test_labels, preds)}')
print(f'variant 2 rmse = {metrics.mean_squared_error(test_labels, preds, squared=False)}\n')
print(f'protected neg_mse = {metrics.get_scorer("neg_mean_squared_error")._score_func(test_labels, preds)}')
print(f'protected neg_rmse = {metrics.get_scorer("neg_root_mean_squared_error")._score_func(test_labels, preds)}')
The output of this program will be:
variant 1 neg_mse = -2.142587870436064e-25
variant 1 neg_rmse = -4.628809642268803e-13
variant 2 mse = 2.142587870436064e-25
variant 2 rmse = 4.628809642268803e-13
protected neg_mse = 2.142587870436064e-25
protected neg_rmse = 2.142587870436064e-25
As you can see, the metrics calculated through the protected attribute differ. First, we asked for negative values but got positive ones (for the variant 2 metrics we did not expect negative values). Second, the neg_mse and neg_rmse values are equal but should be different.
The source code of sklearn.metrics explains both effects:
Where _score_func is called, its result is multiplied by a sign, so calling _score_func directly loses the negation.
Where the scorers are created, neg_root_mean_squared_error_scorer is given the extra parameter squared=False. metrics.mean_squared_error documents this as an optional parameter, so calling that function directly is safe. We can pass the same parameter as a keyword argument to _score_func and at least get the correct absolute value:
print(f'protected neg_rmse = {metrics.get_scorer("neg_root_mean_squared_error")._score_func(test_labels, preds, squared=False)}')
protected neg_rmse = 4.628809642268803e-13
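To make the two effects concrete, here is a toy model in plain Python of a scorer that bundles a scoring function with a sign and fixed keyword arguments. ToyScorer and this mean_squared_error are hypothetical stand-ins written for illustration; they are not the scikit-learn classes, and only the attribute names _score_func, _sign and _kwargs are borrowed from the discussion here.

```python
import math

def mean_squared_error(y_true, y_pred, squared=True):
    """Toy stand-in for a scoring function (not the scikit-learn one)."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return mse if squared else math.sqrt(mse)

class ToyScorer:
    """Hypothetical scorer: a function plus a sign plus fixed keyword arguments."""
    def __init__(self, score_func, sign, **kwargs):
        self._score_func = score_func
        self._sign = sign
        self._kwargs = kwargs

    def score(self, y_true, y_pred):
        # The full computation applies both the sign and the stored kwargs
        return self._sign * self._score_func(y_true, y_pred, **self._kwargs)

neg_rmse = ToyScorer(mean_squared_error, sign=-1, squared=False)
y_true, y_pred = [0.0, 0.0], [3.0, 4.0]

raw = neg_rmse._score_func(y_true, y_pred)  # 12.5: plain MSE, positive
full = neg_rmse.score(y_true, y_pred)       # negated RMSE, about -3.5355
print(raw, full)
```

Calling the bare function drops both the sign and squared=False, which is the same pattern as the "protected" results in the output above.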
To sum up, I've shown what is, to my knowledge, the only way to get sklearn metrics by name (by the way, you can find the full list of names here), and that it's not safe to use protected members that you're not supposed to use. I was using sklearn version 0.24.2.
Since the documentation is incomplete, you'll have to go directly to the source code here for the complete list of metric names:
Metric Names
Search for __all__.
The answer by @collinb9 should not be accepted, as it leads to incorrect calculations.
You need the other arguments (such as squared=False for RMSE) to compute the correct value. They can be accessed via the _kwargs attribute of the _BaseScorer class. Combining _score_func and _kwargs gives the corresponding scoring function.
The full answer to the question should be:
import sklearn.metrics  # importing the submodule explicitly makes sklearn.metrics available

def score(scoring_name, y_true, y_pred):
    # Look up the scorer by name, then apply its sign and stored keyword arguments
    sklearn_scorer = sklearn.metrics.get_scorer(scoring_name)
    return sklearn_scorer._sign * sklearn_scorer._score_func(
        y_true=y_true, y_pred=y_pred, **sklearn_scorer._kwargs
    )

score("neg_root_mean_squared_error", y_true, y_pred)

How to resolve KeyError: 'val_mean_absolute_error' Keras 2.3.1 and TensorFlow 2.0 From Chollet Deep Learning with Python

I am on section 3.7 of Chollet's book Deep Learning with Python.
The project is to predict the median price of homes in a given Boston suburb in the 1970s.
https://github.com/fchollet/deep-learning-with-python-notebooks/blob/master/3.7-predicting-house-prices.ipynb
At section "Validating our approach using K-fold validation" I try to run this block of code:
num_epochs = 500
all_mae_histories = []
for i in range(k):
    print('processing fold #', i)
    # Prepare the validation data: data from partition # k
    val_data = train_data[i * num_val_samples: (i + 1) * num_val_samples]
    val_targets = train_targets[i * num_val_samples: (i + 1) * num_val_samples]
    # Prepare the training data: data from all other partitions
    partial_train_data = np.concatenate(
        [train_data[:i * num_val_samples],
         train_data[(i + 1) * num_val_samples:]],
        axis=0)
    partial_train_targets = np.concatenate(
        [train_targets[:i * num_val_samples],
         train_targets[(i + 1) * num_val_samples:]],
        axis=0)
    # Build the Keras model (already compiled)
    model = build_model()
    # Train the model (in silent mode, verbose=0)
    history = model.fit(partial_train_data, partial_train_targets,
                        validation_data=(val_data, val_targets),
                        epochs=num_epochs, batch_size=1, verbose=0)
    mae_history = history.history['val_mean_absolute_error']
    all_mae_histories.append(mae_history)
I get the error KeyError: 'val_mean_absolute_error' on this line:
mae_history = history.history['val_mean_absolute_error']
I am guessing the solution is to figure out the correct key to replace val_mean_absolute_error. I've tried looking through the Keras documentation for the correct key. Does anyone know what it is?
The problem in your code is that, when you compile your model, you do not add the specific 'mae' metric.
If you want to add the 'mae' metric in your code, you can do it in one of these two ways:
# Option 1
model.compile('sgd', metrics=[tf.keras.metrics.MeanAbsoluteError()])
# Option 2
model.compile('sgd', metrics=['mean_absolute_error'])
After this step, you can check whether the correct name is val_mean_absolute_error or val_mae. Most likely, if you compile your model as in option 2, your code will work with "val_mean_absolute_error".
Also, you should include the code snippet where you compile your model (i.e. the build_model() function); it is missing from the question text above.
I replaced 'val_mean_absolute_error' with 'val_mae' and it worked for me.
FYI, I had the same problem, and it persisted even after changing the line to history.history['val_mae'] as described in the answer.
In my case, for the val_mae entry to be present in the history.history object, I needed to ensure that the model.fit() call included the 'validation_data = (val_data, val_targets)' argument. I neglected to do this initially.
I updated it with the line below (note that this reads the training MAE; the validation value is stored under "val_mae"):
mae_history = history.history["mae"]
The History object should contain the same names as the metrics you compile with.
For example:
mean_absolute_error gives val_mean_absolute_error
mae gives val_mae
accuracy gives val_accuracy
acc gives val_acc
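Because the key depends on how the metric was named at compile time, one defensive option is to look up whichever key is actually present. This is a minimal sketch: get_val_mae is a hypothetical helper, and fake_history is made-up data standing in for history.history.

```python
def get_val_mae(history_dict):
    """Return the validation-MAE series under whichever key name is present."""
    for key in ("val_mae", "val_mean_absolute_error"):
        if key in history_dict:
            return history_dict[key]
    raise KeyError(
        f"no validation MAE key found; available keys: {sorted(history_dict)}"
    )

# Stand-in for history.history (made-up numbers, for illustration only)
fake_history = {
    "loss": [9.0, 4.0], "mae": [2.5, 1.8],
    "val_loss": [8.0, 5.0], "val_mae": [2.4, 2.0],
}
mae_history = get_val_mae(fake_history)
print(mae_history)  # [2.4, 2.0]
```

Printing sorted(history.history) after a short training run is also a quick way to see which names your installed version uses.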

How to set initial weights in MLPClassifier?

I cannot find a way to set the initial weights of the neural network. Could someone tell me how, please?
I am using the Python package sklearn.neural_network.MLPClassifier.
Here is the code for reference:
from sklearn.neural_network import MLPClassifier
classifier = MLPClassifier(solver="sgd")
classifier.fit(X_train, y_train)
Solution:
A working solution is to inherit from MLPClassifier and override the _init_coef method. In _init_coef, write the code that sets the initial weights.
Then use the new class "MLPClassifierOverride", as in the example below, instead of "MLPClassifier".
import numpy as np
from sklearn.neural_network import MLPClassifier

# new class
class MLPClassifierOverride(MLPClassifier):
    # Overriding _init_coef method.
    # Note: some scikit-learn versions pass an extra `dtype` argument to this
    # private method; check the source of your installed version.
    def _init_coef(self, fan_in, fan_out):
        if self.activation == 'logistic':
            init_bound = np.sqrt(2. / (fan_in + fan_out))
        elif self.activation in ('identity', 'tanh', 'relu'):
            init_bound = np.sqrt(6. / (fan_in + fan_out))
        else:
            raise ValueError("Unknown activation function %s" %
                             self.activation)
        coef_init = ...  # place your initial values for coef_init here, shape (fan_in, fan_out)
        intercept_init = ...  # place your initial values for intercept_init here, shape (fan_out,)
        return coef_init, intercept_init
The docs show you the attributes in use.
Attributes:
...
coefs_ : list, length n_layers - 1
    The ith element in the list represents the weight matrix corresponding to layer i.
intercepts_ : list, length n_layers - 1
    The ith element in the list represents the bias vector corresponding to layer i + 1.
Just build your classifier clf=MLPClassifier(solver="sgd") and set coefs_ and intercepts_ before calling clf.fit().
The only remaining question is: does sklearn overwrite your inits?
The code looks like:
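if not hasattr(self, 'coefs_') or (not self.warm_start and not
                                   incremental):
    # First time training the model
    self._initialize(y, layer_units)
Read carefully, though, this only keeps your given coefs_ when warm_start=True (or when training incrementally, e.g. via partial_fit): with the default warm_start=False the second clause is true for a plain fit(), so _initialize runs and replaces them. Check intercepts_ too, and note that _initialize may also set other attributes that fit() relies on.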
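The quoted condition is easy to misread, so it helps to evaluate it as a plain boolean expression. The sketch below mirrors only that one condition; will_reinitialize is a hypothetical helper, and it assumes that a plain fit() with warm_start=False passes incremental=False while partial_fit() passes incremental=True.

```python
def will_reinitialize(has_coefs, warm_start, incremental):
    """Mirror of the quoted condition: True means _initialize() would run."""
    return (not has_coefs) or (not warm_start and not incremental)

# coefs_ set by hand, default warm_start=False, plain fit()
default_fit = will_reinitialize(has_coefs=True, warm_start=False, incremental=False)
# coefs_ set by hand, warm_start=True
warm_fit = will_reinitialize(has_coefs=True, warm_start=True, incremental=False)
# coefs_ set by hand, incremental training (assumed to correspond to partial_fit)
partial = will_reinitialize(has_coefs=True, warm_start=False, incremental=True)

print(default_fit, warm_fit, partial)  # True False False
```

Under these assumptions, hand-set coefs_ would survive only in the last two cases, so it is worth verifying against the source of your installed scikit-learn version before relying on it.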
The packing and unpacking functions further indicate that this should be possible. These are probably used internally for serialization through pickle.
multilayer_perceptron.py initializes the weights based on the nonlinear function used for hidden layers. If you want to try a different initialization, you can take a look at the function _init_coef here and modify as you desire.

New to theano. Trying to add a term to a loss function to penalize negative weights

To be clear, by weights I mean the entries in the matrices (Ws) of the affine transformation in a node of a neural net.
I start with categorical_crossentropy as my loss function, and I want to add a term that penalizes negative weights.
To this end I want to introduce a term of the form
theano.tensor.sum(theano.tensor.exp(-10 * ws))
Where "ws" are the weights.
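To see how this term behaves, here is a small NumPy sketch of the same expression, sum(exp(-10 * ws)). NumPy is used instead of Theano only to keep the example self-contained; the function name negative_weight_penalty and the sample matrices are made up for illustration.

```python
import numpy as np

def negative_weight_penalty(ws, scale=10.0):
    """Compute sum(exp(-scale * w)) over all entries of ws."""
    return float(np.sum(np.exp(-scale * np.asarray(ws, dtype=float))))

positive = negative_weight_penalty([[0.5, 1.0], [0.2, 0.8]])  # all weights positive
mixed = negative_weight_penalty([[0.5, -1.0], [0.2, 0.8]])    # one negative weight
zeros = negative_weight_penalty(np.zeros((2, 2)))             # exp(0) summed 4 times

print(positive, mixed, zeros)
```

A single weight of -1.0 contributes exp(10), so the penalty is dominated by negative entries, while positive weights contribute almost nothing. Note that all-zero weights still contribute 1 per entry.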
If I follow the source code of categorical_crossentropy:
if true_dist.ndim == coding_dist.ndim:
    return -tensor.sum(true_dist * tensor.log(coding_dist), axis=coding_dist.ndim - 1)
elif true_dist.ndim == coding_dist.ndim - 1:
    return crossentropy_categorical_1hot(coding_dist, true_dist)
else:
    raise TypeError('rank mismatch between coding and true distributions')
It seems I should update the third line from the bottom to read
crossentropy_categorical_1hot(coding_dist, true_dist) + theano.tensor.sum(theano.tensor.exp(- 10 * ws))
and change the declaration of the function to
my_categorical_crossentropy(coding_dist, true_dist, ws)
When calling my_categorical_crossentropy, I write
loss = my_categorical_crossentropy(net_output, true_output, l_layers[1].W)
with, for a start, l_layers[1].W being the weights coming from the first layer of my neural net.
With those updates, I go on writing:
loss = aggregate(loss, mode = 'mean')
updates = sgd(loss, all_params, learning_rate = 0.005)
train = theano.function([l_input.input_var, true_output], loss, updates = updates)
[...]
This compiles and everything runs smoothly; the training of the network completes. However, for some reason the additional term theano.tensor.sum(theano.tensor.exp(- 10 * ws)) is ignored: it does not seem to affect the loss value.
I looked into the Theano documentation, but so far I could not figure out what might be wrong. The weights l_layers[1].W are shared variables, so I could not pass them as an input:
train = theano.function([l_input.input_var, true_output, l_layers[1].W], loss, updates = updates)
Any comments are welcome. Thanks!
Solution
Although I didn't find out why what I did didn't work, adding the penalty term outside categorical_crossentropy, as suggested in the comments, did solve the problem:
loss = aggregate(categorical_crossentropy(net_output, true_output) + theano.tensor.sum(theano.tensor.exp(- 10 * l_layers[1].W)))
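One possible explanation, which the source does not confirm, is that the penalty was added only to the elif branch of the function. If the targets had the same number of dimensions as the network output (for example one-hot rows), the first branch would return before the modified line is reached. The toy function below mirrors just that control flow; my_loss, base and penalty are hypothetical names with made-up values.

```python
def my_loss(true_ndim, coding_ndim, base=1.0, penalty=5.0):
    """Toy control flow mirroring the quoted function, with the penalty
    added only in the second branch (as described in the question)."""
    if true_ndim == coding_ndim:
        return base              # same rank: returns here, penalty never added
    elif true_ndim == coding_ndim - 1:
        return base + penalty    # integer class targets: penalty applied
    raise TypeError("rank mismatch between coding and true distributions")

same_rank_targets = my_loss(true_ndim=2, coding_ndim=2)
integer_targets = my_loss(true_ndim=1, coding_ndim=2)
print(same_rank_targets, integer_targets)  # 1.0 6.0
```

Adding the term outside the function, as in the solution above, applies it regardless of which branch runs, which would be consistent with this reading.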
