Keras aggregated objective function - keras

How to add aggregated error to keras model?
Given this table:
   g  x  y
0  1  1  1
1  1  2  2
2  1  3  3
3  2  1  2
4  2  2  1
I would like to minimize the per-sample error sum((y - y_pred) ** 2) together with the per-group error
sum((sum(y) - sum(y_pred)) ** 2).
Larger individual sample errors are acceptable, but it is crucial for me to get the totals right.
SciPy example:
import pandas as pd
from scipy.optimize import differential_evolution

df = pd.DataFrame({'g': [1, 1, 1, 2, 2], 'x': [1, 2, 3, 1, 2], 'y': [1, 2, 3, 2, 1]})
g = df.groupby('g')

def linear(pars, fit=False):
    a, b = pars
    df['y_pred'] = a + b * df['x']
    if fit:
        # squared errors per sample, per group total, and for the grand total
        sample_errors = sum((df['y'] - df['y_pred']) ** 2)
        group_errors = sum((g['y'].sum() - g['y_pred'].sum()) ** 2)
        total_error = sum(df['y'] - df['y_pred']) ** 2
        return sample_errors + group_errors + total_error
    else:
        return df['y_pred']

# args is a tuple of extra positional arguments; (True,) sets fit=True
pars = differential_evolution(linear, [[0, 10]] * 2, args=(True,))['x']
linear(pars)  # recompute y_pred with the fitted parameters before printing
print('SAMPLES:\n', df, '\nGROUPS:\n', g.sum(), '\nTOTALS:\n', df.sum())
Output:
SAMPLES:
g x y y_pred
0 1 1 1 1.232
1 1 2 2 1.947
2 1 3 3 2.662
3 2 1 2 1.232
4 2 2 1 1.947
GROUPS:
x y y_pred
g
1 6 6 5.841
2 3 3 3.179
TOTALS:
g 7.000
x 9.000
y 9.000
y_pred 9.020

For grouping, as long as you keep the same groups throughout training, your loss function will stay differentiable.
As a naive form of grouping, you can simply put each group in its own batch.
I suggest a generator for that.
# suppose you have these three numpy arrays:
# gTrain, xTrain, yTrain

# create this generator
def grouper(g, x, y):
    while True:
        for gr in range(1, g.max() + 1):
            indices = g == gr
            yield (x[indices], y[indices])
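For the loss function, you can make your own:
import keras.backend as K

def customLoss(yTrue, yPred):
    # per-sample squared error plus the squared difference of the batch (group) totals
    return K.sum(K.square(yTrue - yPred)) + K.square(K.sum(yTrue) - K.sum(yPred))

model.compile(loss=customLoss, ....)
Note that the second term is squared. Without the square, the difference of the totals can be negative, and the loss could be driven arbitrarily low instead of towards matching totals.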
Now you train using the method fit_generator (in recent Keras versions, model.fit accepts generators directly and fit_generator is deprecated):
model.fit_generator(grouper(gTrain,xTrain, yTrain), steps_per_epoch=gTrain.max(), epochs=...)
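As a sanity check of what the objective from the question computes for one group, the two terms can be written in plain Python. This is only a sketch: group_loss is a hypothetical helper (not part of Keras), and the predictions are the rounded y_pred values for group 1 from the SciPy output in the question.

```python
def group_loss(y_true, y_pred):
    """Squared sample errors plus the squared difference of the group totals."""
    sample_errors = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    group_error = (sum(y_true) - sum(y_pred)) ** 2
    return sample_errors + group_error

# group 1 from the question, with the rounded predictions from the SciPy fit
y_true = [1, 2, 3]
y_pred = [1.232, 1.947, 2.662]
loss = group_loss(y_true, y_pred)
print(loss)

# swapped predictions: every sample is wrong, but the total is right,
# so only the sample term contributes
swapped = group_loss([1, 2], [2, 1])
print(swapped)
```

The second call shows the trade-off the question asks for: the group term vanishes whenever the totals match, regardless of the individual errors.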

Related

Why is the function returning value of stack as None

I can't figure out why the path variable on the last line of the code is printed as None. The second-to-last line calls the DFS function to find the path between two nodes in a tree (I'm giving a tree as input). I also print the stack inside DFS just before returning it, and there it is not None. So why is it None once it has been returned and stored in the path variable? I gave this as input:
1
6 1
4 2 1 3 5 2
1 2
2 3
2 4
1 5
5 6
And the output was
{0: [1, 4], 1: [0, 2, 3], 2: [1], 3: [1], 4: [0, 5], 5: [4]}
[0, 1, 3]
None
Here is the code for reference
def DFS(adj,x, y,stack,vis):
    stack.append(x)
    if (x == y):
        print(stack)
        return stack
    vis[x] = 1
    if (len(adj[x])>0):
        for j in adj[x]:
            if (vis[j]==0):
                DFS(adj,j,y,stack,vis)
    del stack[-1]

T = int(input())
for a in range(T):
    N,Q = input().split()
    N = int(N)
    Q = int(Q)
    wt = [int(num) for num in input().split(" ")]
    adj = {}
    for i in range(N):
        adj[i] = []
    for b in range(N-1):
        u,v = input().split()
        u = int(u) - 1
        v = int(v) - 1
        adj[u].append(v)
        adj[v].append(u)
    print(adj)
    vis = [0]*N
    stack = []
    path = DFS(adj,0,3,stack,vis)
    print(path)
Simple equivalent of your code:
def recursive_func(x):
    if x > 0:
        return x
    else:
        x += 1
        recursive_func(x)

x = 5
x = recursive_func(x)
print(x)
x = 0
x = recursive_func(x)
print(x)
Output:
5
None
What's happening here?
x, with a value of 5, is sent to recursive_func.
x is greater than 0, so 5 is returned. This is seen in the output.
x, with a value of 0, is sent to recursive_func.
x is not greater than 0, so 1 is added to x.
x, with a value of 1, is then sent to a second call of recursive_func.
This second call returns 1 because 1 > 0.
The returned value is passed back to the first call, where the expression recursive_func(x) evaluates to 1, but we don't do anything with it.
The first recursive_func call reaches the end of its code without returning a value, so by default None is returned to our main body.
x = recursive_func(x) has become x = None.
None is output.
Given this information, why does the following code perform differently?
Simple modification of your code:
def recursive_func_v2(x):
    if x > 0:
        return x
    else:
        x += 1
        return recursive_func_v2(x)

x = 5
x = recursive_func_v2(x)
print(x)
x = 0
x = recursive_func_v2(x)
print(x)
Output:
5
1
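Applied to the DFS from the question, the same idea means returning the result of the recursive call whenever it found the target. The following is a sketch rather than the only possible fix; dfs_path is a hypothetical name, and the adjacency dict is the one printed in the question's output.

```python
def dfs_path(adj, x, y, stack, vis):
    """Return the path from x to y as a list, or None if y is unreachable."""
    stack.append(x)
    if x == y:
        return stack
    vis[x] = 1
    for j in adj[x]:
        if vis[j] == 0:
            found = dfs_path(adj, j, y, stack, vis)
            if found is not None:
                # pass the result of the deeper call up to our own caller
                return found
    # dead end: remove x from the path and report failure explicitly
    stack.pop()
    return None

adj = {0: [1, 4], 1: [0, 2, 3], 2: [1], 3: [1], 4: [0, 5], 5: [4]}
path = dfs_path(adj, 0, 3, [], [0] * len(adj))
print(path)  # [0, 1, 3]
```

Checking for None before returning matters here: returning the first recursive result unconditionally would stop the search at the first dead-end branch.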

Group by mean for element with value >0

df=pd.DataFrame({"x":[1,2,3,0],"y":[1,1,1,1]})
df.groupby("y").agg(x_sum=("x",np.mean))
This code gives the average of x; the output is 1.5 ((1+2+3+0)/4 = 1.5),
but I want the average of x only over the values larger than 0, so the output should be (1+2+3)/3 = 2.
How should I address it?
Replace the values in column x that are not greater than 0 with NaN (missing values are skipped when the mean is computed):
df.x = df.x.where(df.x.gt(0))
# alternative
# df.x = df.x.mask(df.x.le(0))
print(df)
     x  y
0  1.0  1
1  2.0  1
2  3.0  1
3  NaN  1
df1 = df.groupby("y").agg(x_sum=("x", np.mean))
print(df1)
   x_sum
y
1    2.0
You can also write and use your own custom function. Example:
import pandas as pd
import numpy as np

def mean_without_zero_values(values):
    # keep only the strictly positive values before averaging
    vals = [v for v in values if v > 0]
    return np.mean(vals)

df = pd.DataFrame({"x": [1, 2, 3, 0], "y": [1, 1, 1, 1]})
result = df.groupby("y").agg(x_sum=("x", mean_without_zero_values))
print(result)
# output
#    x_sum
# y
# 1    2.0
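A third option, sketched here under the assumption that groups without any positive value may simply be absent from the result, is to filter the rows before grouping. The column name x_mean is chosen for this illustration only.

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3, 0], "y": [1, 1, 1, 1]})

# drop the rows where x is not positive, then average what is left per group
result = df[df["x"] > 0].groupby("y").agg(x_mean=("x", "mean"))
print(result)
```

Unlike the NaN approach, this leaves the original df unchanged, but a group whose x values are all 0 or negative disappears from the output instead of showing NaN.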

Printing fibonacci series using lists in python

I am relatively a noobie in programming and trying to learn python. I was trying to implement a Fibonacci series into a list within 10.
fibo= [0,1]
for k in range(11):
    i= fibo[-1]
    j = fibo[-2]
    k= fibo[i]+fibo[j]
    fibo.append(k)
    k=+1
print(fibo)
Not sure what I did wrong? Any help is really appreciated!
Output:
[0, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
You can use this code to print the Fibonacci series up to N terms.
N = int(input())  # Length of fibonacci series
fibo = [0, 1]
a = fibo[0]
b = fibo[1]
for each in range(2, N):
    c = a + b
    a, b = b, c
    fibo.append(c)
print(fibo[:N])
OUTPUT
N = 10
[0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
I can see a few issues here:
fibo = [0,1]
# Your loop should start at index 2, otherwise
# you will try to access fibo[-2] and fibo[-1] with k=0
# and fibo[-1] and fibo[0] with k=1
# Also, if you only want 10 values, you should change 11
# to 10 since fibo already has two values.
# for k in range(11):
for k in range(2, 10):
    # You don't want the content of fibo, but the indexes!
    # i = fibo[-1]
    # j = fibo[-2]
    i = k - 1
    j = k - 2
    # You are already using k, so you need to use another variable
    # k = fibo[i] + fibo[j]
    v = fibo[i] + fibo[j]
    fibo.append(v)
    # You don't need k+=1 because it will be incremented
    # according to range(2, 10). The loop will start with
    # k = 2 and it will stop when k = 9
    # k += 1
print(fibo)
Your code did not crash because you technically can access fibo[-1] and fibo[-2]. They return, respectively, the last value of your list and the value before the last one.
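Since fibo[-1] and fibo[-2] already are the last two values, they can be added directly instead of being used as indexes. A minimal sketch of that idea, with fibonacci_list as a hypothetical helper name:

```python
def fibonacci_list(n):
    """Return the first n Fibonacci numbers, starting from 0 and 1."""
    fibo = [0, 1]
    while len(fibo) < n:
        # the next value is the sum of the last two values in the list
        fibo.append(fibo[-1] + fibo[-2])
    return fibo[:n]

print(fibonacci_list(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

The final slice only matters for n below 2, where the two seed values would otherwise be returned unchanged.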

How can we multiply the elements of a list by increasing the index?

b = [3,2,6]
hid = []
#print(b[0] * b[1])
#print(b[1] * b[2])
#print(b[2] * b[3])
for n in range(len(b)):
    print(b[n-1] * b[n])
The result I am expecting is
6, 12
But I am getting
18, 6, 12
Where is the mistake?
range(len(b)) evaluates to 0, 1, 2,
so on the first iteration b[n-1] is b[-1], which is 6, and 6 * 3 = 18. That is why the result is
18, 6, 12
Try this one:
for n in range(len(b)-1):
    print(b[n] * b[n+1])
Your list has only three elements, so there is no b[3]. If what you're trying to achieve is multiplying all combinations of size 2, it can be done using two nested loops:
b = [3, 2, 6]
for i in range(len(b) - 1):
    for j in range(i + 1, len(b)):
        print(b[i] * b[j])
Output:
6
18
12
However, a more Pythonic way would be to use itertools.combinations:
from itertools import combinations

b = [3, 2, 6]
for x, y in combinations(b, 2):
    print(x * y)
Output:
6
18
12
for n in range(len(b)):
    print(b[n-1] * b[n])
In the first iteration, when n = 0, you have:
print(b[n-1] * b[n])
which means:
print(b[0-1] * b[0])
b[0-1] gives you the last element in the list, and because of that you get
6*3 = 18
You should do this instead:
b = [3,2,6]
hid = []
for n in range(1,len(b)):
    print(b[n-1] * b[n])
range(3) generates the numbers from 0 to 2 (3 numbers). You can also
define the start, stop and step size as range(start, stop, step size).
The step size defaults to 1 if not provided.
or
b = [3,2,6]
hid = []
for n in range(len(b)-1):
    print(b[n] * b[n+1])
output:
6
12
In a for loop over range(len(b)), the iteration starts at zero:
for n in range(len(b)):
    print(b[n-1] * b[n])
With n = 0, "n-1" means b[-1], which is the last element of your list (6), and multiplying it by b[n], which is b[0], gives 18.
So, your code translates to:
print(b[-1]*b[0])
print(b[0]*b[1])
print(b[1]*b[2])
A correct way to code this would be:
#!/usr/bin/python
b = [3,2,6]
i = 0
while i < len(b)-1:
    print(b[i]*b[i+1])
    i+=1
Output:
6
12
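The index arithmetic can also be avoided altogether by pairing each element with its successor. This is a sketch of one common idiom, not the only way to do it; products is a name chosen for the example.

```python
b = [3, 2, 6]

# zip stops at the shorter sequence, so the pairs are (3, 2) and (2, 6)
products = [x * y for x, y in zip(b, b[1:])]
print(products)  # [6, 12]
```

Because no negative or out-of-range index is ever formed, neither the wrap-around to b[-1] nor the missing b[3] can occur.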

Iterate over list in random sequential multiples of n Python

I have the following example code:
from random import randint

l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

def batch_gen(data, batch_size):
    for i in range(0, len(data), batch_size):
        yield data[i:i+batch_size]

batch_size = randint(1,4)
n = 0
print("batch_size is {}".format(batch_size))
for i in batch_gen(l, batch_size):
    for x in i:
        print(x)
    print("loop {}".format(n))
    n += 1
    batch_size = randint(1,4)
Which gives me the output of:
batch_size is 3
1
2
3
loop 0
4
5
6
loop 1
7
8
9
loop 2
10
loop 3
This is close to the output I'm looking for; however, batch_size is only ever taken from the batch_size = randint(1,4) outside of the for loop. Reassigning it inside the loop has no effect on the generator.
I'm looking at having a random batch_size for each iteration of the loop, with each iteration picking up where the last one left off, until a list of arbitrary length is exhausted.
Any help would be greatly appreciated!
Example code taken from Iterate over a python sequence in multiples of n?
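One way to get the behaviour described above is to draw the size inside the generator and advance a position counter by that size. This is a sketch under stated assumptions: random_batch_gen, min_size and max_size are hypothetical names, and the 1 to 4 range mirrors the randint(1,4) call in the question.

```python
from random import randint

def random_batch_gen(data, min_size=1, max_size=4):
    """Yield consecutive slices of data, each with a freshly drawn random length."""
    i = 0
    while i < len(data):
        size = randint(min_size, max_size)  # new batch size for every batch
        yield data[i:i + size]
        i += size  # continue where the previous batch ended

l = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
batches = list(random_batch_gen(l))
for n, batch in enumerate(batches):
    print("loop {}: {}".format(n, batch))
```

The last batch may be shorter than the drawn size, because slicing past the end of a list simply stops at the end.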
