I need help creating a custom metric callback that Keras can track during training. I'm running:
Windows 10
Python 3.6
scikit-learn==0.23.2
pandas==0.25.3
numpy==1.18.5
tensorflow==2.3.0
keras==2.4.3
The formula I want to use looks like this:
step_1 = (True_Positives - False_Positives) / Sum_of_y_true
result = (step_1 - (-1)) / (1 - (-1))  # rescales the (-1, 1) range to (0, 1), i.e. (step_1 + 1) / 2
I know Keras offers the TruePositives() and FalsePositives() classes, so I'd like to take advantage of those in a custom function that can be used as a metric callback. The pseudo-code I imagine looks something like:
def custom_metric():
    Get True_Positives
    Get False_Positives
    Get Sum_of_y_true
    Perform the above formula
    Return that result in a "tensor"-friendly form that can be used for the callback
Or maybe this could be a one-line return, I don't know. I'm unclear about how to make a custom metric "Keras friendly", since Keras doesn't appear to accept numpy arrays or plain Python floats.
Thanks!
UPDATE
What I've attempted so far looks like this. Not sure if it's correct but would like to know if I'm on the right track:
def custom_metric(y_true, y_pred):
    TP = np.logical_and(backend.eval(y_true) == 1, backend.eval(y_pred) == 1)
    FP = np.logical_and(backend.eval(y_true) == 0, backend.eval(y_pred) == 1)
    TP = backend.sum(backend.variable(TP))
    FP = backend.sum(backend.variable(FP))
    SUM_TRUES = backend.sum(backend.eval(y_true) == 1)
    # Need help with this part?
    result = (TP - FP) / SUM_TRUES
    result = (result - (-1)) / (1 - (-1))
    return result
Figured it out!
from tensorflow.keras import backend

def custom_m(y_true, y_pred):
    # Round/clip so predictions are treated as 0/1 before counting
    true_positives = backend.sum(backend.round(backend.clip(y_true * y_pred, 0, 1)))
    predicted_positives = backend.sum(backend.round(backend.clip(y_pred, 0, 1)))
    false_positives = predicted_positives - true_positives
    possible_positives = backend.sum(backend.round(backend.clip(y_true, 0, 1)))
    step_1 = (true_positives - false_positives) / possible_positives
    result = (step_1 - (-1)) / (1 - (-1))  # rescale (-1, 1) to (0, 1)
    return result
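For reference, here is a minimal sketch of how the metric can be passed to model.compile(); the model architecture and training data names are placeholders, not part of the original question:

from tensorflow import keras

# Hypothetical binary classifier; only the metrics argument matters here.
model = keras.Sequential([keras.layers.Dense(1, activation='sigmoid', input_shape=(10,))])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=[custom_m])          # custom_m is tracked like any built-in metric
# model.fit(X_train, y_train, epochs=5)   # X_train / y_train are placeholders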
Related
I am trying to convert a nested loop over a numpy array into a numpy-optimized implementation.
The function being called inside the loop takes a 4-element vector and a separate parameter, and outputs a 4-element vector which replaces the old vector based on operations with the new value. If relevant, the function performs a Welford online update, which updates the mean and standard deviation based on a new value, with the 4-element vector being [old_mean, old_std, old_s, num_values]. For each pixel channel, I am saving these values in history_array so the distribution can be updated from future pixel values.
My present code looks like this:
def welford_next(arr: np.ndarray, new_point: np.float32) -> np.ndarray:
    old_mean, _, old_s, num_points = arr
    num_points += 1
    new_mean = old_mean + (new_point - old_mean) / num_points
    new_s = old_s + (new_point - old_mean) * (new_point - new_mean)
    return [new_mean, np.sqrt(new_s / num_points) if num_points > 1 else new_s, new_s, num_points]
updates = [10., 20., 30., 40., 90., 80.]
history_array = np.zeros(shape=b.shape + (4,))  # shape: [6, 3, 3, 4]; b is an existing (6, 3, 3) array
print(f'History Shape: {history_array.shape}')
history_array_2 = np.zeros_like(history_array)
for update in updates:
    image = np.empty(shape=b.shape)  # shape: [6, 3, 3] (h x w x c)
    image.fill(update)
    for i, row in enumerate(image):  # Prohibitively expensive
        for j, col in enumerate(row):
            for k, channel in enumerate(col):
                history_array[i][j][k] = welford_next(history_array[i][j][k], channel)
    history_array_2 = np.apply_along_axis(welford_next, axis=2, arr=history_array_2)
print(history_array == history_array_2)
However, np.apply_along_axis() does not seem to be viable because it does not allow additional parameters to be passed alongside the array itself. I also came across np.ufunc, which welford_next() can be converted into using np.frompyfunc(), but it is unclear how that would help me reach the desired result.
How do I achieve this looped operation using numpy?
The numpy-optimized way to do this is to change how the welford_next() function is used. As mentioned in the comments, repeated calls to a Python function cannot be vectorized, so the function should be called once per frame and the vectorization should happen inside the function itself. The following implementation runs roughly 50x faster.
def welford(history: np.ndarray, frame: np.ndarray) -> np.ndarray:
    # history has shape (h, w, c, 4); bring the stats axis to the front to unpack it
    old_mean, _, old_s, num_points = np.transpose(history, [3, 0, 1, 2])
    num_points += 1.
    new_mean = old_mean + (frame - old_mean) / num_points
    new_s = old_s + (frame - old_mean) * (frame - new_mean)
    # num_points is uniform across pixels, so a single scalar check suffices
    new_std = np.sqrt(new_s / num_points) if num_points[0][0][0] > 1 else new_s
    return np.transpose(np.array([new_mean, new_std, new_s, num_points]), [1, 2, 3, 0])

updates = [10., 20., 30., 40., 90., 80.]
history_array = np.zeros(shape=b.shape + (4,))  # shape: [6, 3, 3, 4]
for update in updates:
    image = np.empty(shape=b.shape)  # shape: [6, 3, 3] (h x w x c)
    image.fill(update)
    history_array = welford(history_array, image)
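After the loop, the running statistics can be read back directly from the last axis; a quick sanity check might look like this (the index order follows the [mean, std, s, n] layout used above):

running_mean = history_array[..., 0]  # per-channel running mean, shape (6, 3, 3)
running_std = history_array[..., 1]   # per-channel running standard deviation
print(running_mean[0, 0, 0], running_std[0, 0, 0])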
I have an optimization problem and I'm solving it with scipy and its minimize module. I use SLSQP as the method because it is the only one that fits my problem. The function to optimize is a cost function with 'x' as a list of percentages. I have some constraints which have to be respected:
First, the sum of the percentages should be 100 (PercentSum(x)). This constraint is added as 'eq' (equality), as you can see in the code.
The second constraint is about a physical value which must be less than 'proberty1Max'. This constraint is added as 'ineq' (inequality). So if 'proberty1 < proberty1Max' the function should be greater than 0; otherwise the function should be 0. The functions are differentiable.
Below you can see a model of my attempt. The problem is the 'constraint' function: I get solutions where the sum 'prop' is bigger than 'proberty1Max'.
import numpy as np
from scipy.optimize import minimize

class objects:
    def __init__(self, percentOfInput, min, max, cost, proberty1, proberty2):
        self.percentOfInput = percentOfInput
        self.min = min
        self.max = max
        self.cost = cost
        self.proberty1 = proberty1
        self.proberty2 = proberty2

class data:
    def __init__(self):
        self.objectList = list()
        self.objectList.append(objects(10, 0, 20, 200, 2, 7))
        self.objectList.append(objects(20, 5, 30, 230, 4, 2))
        self.objectList.append(objects(30, 10, 40, 270, 5, 9))
        self.objectList.append(objects(15, 0, 30, 120, 2, 2))
        self.objectList.append(objects(25, 10, 40, 160, 3, 5))
        self.proberty1Max = 1
        self.proberty2Max = 6

D = data()
def optiFunction(x):
    for index, obj in enumerate(D.objectList):
        obj.percentOfInput = x[1]
    costSum = 0
    for obj in D.objectList:
        costSum += obj.cost * obj.percentOfInput
    return costSum

def PercentSum(x):
    y = np.sum(x) - 100
    return y

def constraint(x, val):
    for index, obj in enumerate(D.objectList):
        obj.percentOfInput = x[1]
    prop = 0
    if val == 1:
        for obj in D.objectList:
            prop += obj.proberty1 * obj.percentOfInput
        return D.proberty1Max - prop
    else:
        for obj in D.objectList:
            prop += obj.proberty2 * obj.percentOfInput
        return D.proberty2Max - prop
def checkConstrainOK(cons, x):
    for con in cons:
        y = con['fun'](x)
        if con['type'] == 'eq' and y != 0:
            print("eq constrain not respected y= ", y)
            return False
        elif con['type'] == 'ineq' and y < 0:
            print("ineq constrain not respected y= ", y)
            return False
    return True

initialGuess = []
b = []
for obj in D.objectList:
    initialGuess.append(obj.percentOfInput)
    b.append((obj.min, obj.max))
bnds = tuple(b)

cons = list()
cons.append({'type': 'eq', 'fun': PercentSum})
cons.append({'type': 'ineq', 'fun': lambda x, val=1: constraint(x, val)})
cons.append({'type': 'ineq', 'fun': lambda x, val=2: constraint(x, val)})

solution = minimize(optiFunction, initialGuess, method='SLSQP',
                    bounds=bnds, constraints=cons, options={'eps': 0.001, 'disp': True})

print('status ' + str(solution.status))
print('message ' + str(solution.message))
checkConstrainOK(cons, solution.x)
There is no way to find a solution, but the output is this:
Positive directional derivative for linesearch (Exit mode 8)
Current function value: 4900.000012746761
Iterations: 7
Function evaluations: 21
Gradient evaluations: 3
status 8
message Positive directional derivative for linesearch
Where is my fault? In this case it ends with exit mode 8 because the example is very small. With bigger data the algorithm ends with exit mode 0, but I think it should end with a hint that a constraint could not be satisfied.
It doesn't make a difference whether proberty1Max is set to 4 or to 1, but in the case where it is 1 there cannot be a valid solution.
PS: I changed a lot in this question... Now the code is executable.
EDIT:
1. Okay, I learned that an inequality constraint is accepted if the output is positive (> 0). Previously I thought < 0 would also be accepted. Because of this, the constraint function is now a little bit shorter.
2. What about the constraint: in my real solution I add several constraints using a loop. In that case it is convenient to feed the constraint function an index from the loop, and inside the function this index is used to choose an element of an array. In my example here, "val" decides whether the constraint is for proberty1 or proberty2. What the constraint means is how much of a property ends up in the whole mix, so I'm summing the property multiplied by percentOfInput; "prop" is this sum over all objects. (A sketch of the index-passing pattern is shown below.)
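As a side note, scipy's constraint dictionaries also accept an 'args' entry, so the loop index can be passed without a lambda default; a minimal sketch of that pattern, reusing the functions from the code above (the index values are just illustrative):

cons = []
cons.append({'type': 'eq', 'fun': PercentSum})
for val in (1, 2):
    # 'args' is forwarded by scipy.optimize.minimize, so this calls constraint(x, val)
    cons.append({'type': 'ineq', 'fun': constraint, 'args': (val,)})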
I think there might be a connection to the issue tux007 mentioned in the comments. link to the issue
I think the optimizer doesn't work correctly if the initial guess is not a valid solution.
Linear programming is not good for overdetermined systems; my problem doesn't have a unique solution, it's an approximation.
As mentioned in the comments, I think this is the problem:
Misleading output from....
If you have a look at the latest changes, the constraint is not satisfied, but the algorithm says: "Positive directional derivative for linesearch".
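If that is the cause, one way to catch it early (purely as a sketch, reusing the checkConstrainOK helper and names defined in the code above) is to test the initial guess before calling minimize:

# Warn if the starting point already violates a constraint,
# which is the suspected failure mode described above.
if not checkConstrainOK(cons, initialGuess):
    print("Initial guess is infeasible; SLSQP may stop with a misleading exit mode.")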
I have a Set of Rotation Matrices Rs:
Rs.shape = [62x3x3]
And a Set of Translation Components Js:
Js.shape = [62x3]
I have been trying to find an efficient way to combine them into a [62x4x4] matrix, i.e. 62 homogeneous transform matrices. Currently I am doing it with a naive for loop:
def make_A(R, t):
    R_homo = torch.cat([R, torch.zeros(1, 3).cuda()], dim=0)
    t_homo = torch.cat([t.view(3, 1), torch.ones(1, 1).cuda()], dim=0)
    return torch.cat([R_homo, t_homo], dim=1)

transforms = self.NUM_JOINTS * [None]
for idj in range(0, self.NUM_JOINTS):
    transforms[idj] = make_A(Rs[idj, :], Js[idj, :])
FinalMatrix = torch.stack(transforms, dim=0)
This is highly inefficient, and takes almost 10ms to form. How can I tensorize this?
Not sure if it helps efficiency, but this should vectorize your code:
def make_A(Rs, Js):
    # Append a row of zeros below each 3x3 rotation: (N, 3, 3) -> (N, 4, 3)
    R_homo = torch.cat((Rs, torch.zeros(Rs.shape[0], 1, 3)), dim=1)
    # Append a 1 to each translation: (N, 3) -> (N, 4)
    t_homo = torch.cat((Js, torch.ones(Js.shape[0], 1)), dim=1)
    # Attach the translation as the last column: (N, 4, 4)
    return torch.cat((R_homo, t_homo.unsqueeze(2)), dim=2)
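A quick shape check of the vectorized version, with random placeholder tensors standing in for the real Rs and Js (move them to .cuda() if the originals live on the GPU):

import torch

Rs = torch.rand(62, 3, 3)   # placeholder rotation matrices
Js = torch.rand(62, 3)      # placeholder translations
A = make_A(Rs, Js)
print(A.shape)              # torch.Size([62, 4, 4])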
I'm trying to write a hook that will allow me to compute some global metrics (rather than batch-wise metrics). To prototype, I thought I'd get a simple hook up and running that would capture and remember true positives. It looks like this:
class TPHook(tf.train.SessionRunHook):
    def after_create_session(self, session, coord):
        print("Starting Hook")
        tp_name = 'metrics/f1_macro/TP'
        self.tp = []
        self.args = session.graph.get_operation_by_name(tp_name)
        print(f"Got Args: {self.args}")

    def before_run(self, run_context):
        print("Starting Before Run")
        return tf.train.SessionRunArgs(self.args)

    def after_run(self, run_context, run_values):
        print("After Run")
        print(f"Got Values: {run_values.results}")
However, the values returned in the "after_run" part of the hook are always None. I tested this in both the train and evaluation phase. Am I misunderstanding something about how the SessionRunHooks are supposed to work?
Maybe relevant information:
The model was built in keras and converted to an estimator with the keras.estimator.model_to_estimator() function. The model has been tested and works fine, and the op that I'm trying to retrieve in the hook is defined in this code block:
def _f1_macro_vector(y_true, y_pred):
    """Computes the F1-score with macro averaging.

    Arguments:
        y_true {tf.Tensor} -- Ground-truth labels
        y_pred {tf.Tensor} -- Predicted labels

    Returns:
        tf.Tensor -- The computed F1-score
    """
    y_true = K.cast(y_true, tf.float64)
    y_pred = K.cast(y_pred, tf.float64)
    TP = tf.reduce_sum(y_true * K.round(y_pred), axis=0, name='TP')
    FN = tf.reduce_sum(y_true * (1 - K.round(y_pred)), axis=0, name='FN')
    FP = tf.reduce_sum((1 - y_true) * K.round(y_pred), axis=0, name='FP')
    prec = TP / (TP + FP)
    rec = TP / (TP + FN)
    # Convert NaNs to zero
    prec = tf.where(tf.is_nan(prec), tf.zeros_like(prec), prec)
    rec = tf.where(tf.is_nan(rec), tf.zeros_like(rec), rec)
    f1 = 2 * (prec * rec) / (prec + rec)
    # Convert NaNs to zero
    f1 = tf.where(tf.is_nan(f1), tf.zeros_like(f1), f1)
    return f1
In case anyone runs into the same problem, I found out how to restructure the program so that it worked. Although the documentation makes it sound like I can pass raw ops into the SessionRunArgs, it seems like it requires actual tensors (maybe this is a misreading on my part).
This is pretty easy to accomplish - I just changed the after_create_session code to what's shown below.
def after_create_session(self, session, coord):
    tp_name = 'metrics/f1_macro/TP'
    self.tp = []
    # Fetch the op's output tensor (the ":0" suffix) instead of the op itself
    tp_tensor = session.graph.get_tensor_by_name(tp_name + ':0')
    self.args = [tp_tensor]
And this successfully runs.
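For completeness, a rough sketch of how such a hook gets attached; estimator and train_input_fn are placeholders standing in for the surrounding training code, not part of the original answer:

# estimator = tf.keras.estimator.model_to_estimator(keras_model=model)  # as described above
estimator.train(input_fn=train_input_fn, hooks=[TPHook()])
# The same hooks argument also works for estimator.evaluate()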
I have a function that I define as follows:
def NewLoss(y_true, y_pred):
    p = 0
    for i in range(3074):
        if (y_pred[i+1] - y_pred[i]) < 0:
            p += (y_true[i] - y_pred[i])**2
        elif (y_pred[i+1] - y_pred[i]) > 0:
            p += (y_true[i] - y_pred[i])**2 + (y_true[i] - y_pred[i]) * (y_pred[i+1] - y_pred[i])**2
        else:
            p += (y_true[i] - y_pred[i])**2 + 0.5 * (y_true[i] - y_pred[i]) * (y_pred[i+1] - y_pred[i])**2
    return p
My y_true and y_pred are vectors. When I try to run code that calls this function, I get the following error:
"Using a tf.Tensor as a Python bool is not allowed".
I would like to know how to check the sign of (y_true[i] - y_pred[i]) and avoid this error; I am actually using Keras.
Thank you very much for your help.
from keras import backend as K

def NewLoss(y_true, y_pred):
    true = y_true[:3074]
    pred = y_pred[:3074]
    predShifted = y_pred[1:3075]

    diff = true - pred
    diffShifted = predShifted - pred

    pLeftPart = K.square(diff)
    pRightPart = diff * K.square(diffShifted)

    # Build a mask instead of an if/elif: 1 where diffShifted > 0, 0.5 where it is exactly 0
    greater = K.cast(K.greater(diffShifted, 0), K.floatx())
    equal = 0.5 * K.cast(K.equal(diffShifted, 0), K.floatx())
    mask = greater + equal

    return K.sum(pLeftPart + (mask * pRightPart))
Remarks:
1 - The first axis is the samples axis, perhaps you're trying to do this with the timesteps axis? If so, use:
true = y_true[:,:3074]
pred = y_pred[:,:3074]
predShifted = y_pred[:,1:3075]
2 - Having differences exactly equal to zero is so rare that maybe you don't need the last part of the if statement.
3 - If the max length of your tensors is 3075, you can simplify the selections:
true = y_true[:-1]
pred = y_pred[:-1]
predShifted = y_pred[1:]
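If it helps, here is a minimal sketch of how a custom loss like this is passed to Keras at compile time; the model and training arrays are placeholders, not part of the original answer:

model.compile(optimizer='adam', loss=NewLoss)
# model.fit(X_train, y_train, epochs=5)  # X_train / y_train are placeholders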