I have an array that has Nx5 dimensions. The last 3 dimensions are the x, y, and z of different volumes. These are then packed into the other two dimensions. For example:
[[0 0 x0 y0 z0],
[0 0 x1 y1 z1],
[0 0 x2 y2 z2],
[0 1 x3 y3 z3],
[0 1 x4 y4 z4],
[1 0 x5 y5 z5],
[1 0 x6 y6 z6],
[1 1 x7 y7 z7],
[1 1 x8 y8 z8],
[2 0 x9 y9 z9],
[2 0 x10 y10 z10],
[2 0 x11 y11 z11],
[2 1 x12 y12 z12]]
The number of volumes for each of the first two dimensions varies every time. I wanto to calculate the mean of x, y, and z for each volume in each dimension. This should result in something like this:
[[0 0 xmean0 ymean0 zmean0],
[0 1 xmean1 ymean1 zmean1],
[1 0 xmean2 ymean2 zmean2],
[1 1 xmean3 ymean3 zmean3]]
[2 0 xmean4 ymean4 zmean4],
[2 1 xmean5 ymean5 zmean5]]
In other words, it should have the mean for each combination of the first to elements. I cannot use loops for this, only numpy and/or tensorflow.
We will assume the input array is a.
Approach #1 : With bincount -
unq_comb,ids, w = np.unique(a[:,:2], axis=0, return_inverse=1, return_counts=1)
out = np.empty((len(unq_comb),5))
out[:,:2] = unq_comb
for i in [2,3,4]:
out[:,i] = np.bincount(ids, a[:,i])/w
Approach #2 : With sorting -
sidx = np.lexsort(a[:,:2].T)
b = a[sidx]
idx = np.flatnonzero(np.r_[True,(b[:-1,:2] != b[1:,:2]).any(1),True])
w = np.diff(idx)[:,None].astype(float)
out = np.empty((len(unq_comb),5))
out[:,:2] = b[idx[:-1],:2]
out[:,2:] = np.add.reduceat(b[:,2:], idx[:-1], axis=0)/w
Related
I try to use Kalman filter in order to estimate the position. The input in the system is the velocity and this is also what I measure. The velocity is not stable, the system movement is like a cosine in general. So the equation is:
xnew = Ax + Bu + w, where:
x= [x y]'
A = [1 0; 0 1]
B= [dt 0; 0 dt]
u=[ux uy]
w noise
As I mentioned, what I measure is the velocity. My question is how would the matrix C look like in the equation:
y= Cx + v
Should I involve the velocity in the estimated states (matrix A)? Or should I change the equations to involve also the acceleration? I can't measure the acceleration.
One way would be to drop the velocities as inputs and put them in your state. This way, your state is both the position and velocity and your filter uses as observation both the measured speed of your vehicle and a noisy estimate of your position.
With this system your problem becomes:
x = [x_e y_e vx_e vy_e]'
A = [1 0 dt 0; 0 1 0 dt; 0 0 1 0; 0 0 0 1]
w noise
with x_e, y_e, vx_e, and vy_e the estimated values of the state
B is removed because u is 0. And then you have
y = Cx + v
with C = [1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1]
With y = [x + dt*vx ; y + dt*vy ; vx ; vy] and x, y, vx, and vy the measured values of the velocities and x and y the position calculated with the measured velocities.
It is very similar to the example you will find here on Wikipedia
In a dataframe I have 4 variables that are the X, Y, Z and W orientations of a robot. Each line represents a measurement with these four values.
x = [-0.75853, -0.75853, -0.75853, -0.75852]
y = [-0.63435, -0.63434, -0.63435, -0.63436]
z = [-0.10488, -0.10490, -0.10492, -0.10495]
w = [-0.10597, -0.10597, -0.10597, -0.10596]
df = pd.DataFrame([x, y, z, w], columns=['x', 'y', 'z', 'w'])
I wrote the function below that returns three differences between two quaternions:
from pyquaternion import Quaternion
def quaternion_distances(w1, x1, y1, z1, w2, x2, y2, z2):
""" Create two Quaternions objects and calculate 3 distances between them """
q1 = Quaternion(w1, x1, y1, z1)
q2 = Quaternion(w2, x2, y2, z2)
dist_by_signal = Quaternion.absolute_distance(q1, q2)
dist_geodesic = Quaternion.distance(q1, q2)
dist_sim_geodec = Quaternion.sym_distance(q1, q2)
return dist_by_signal, dist_geodesic, dist_sim_geodec
This difference is calculated based on the values of the second line by the values of the first line. Thus, I cannot use the Pandas apply function.
I have already added three columns to the dataframe, so that I receive each of the values returned by the function:
df['dist_by_signal'] = 0
df['dist_geodesic'] = 0
df['dist_sim_geodec'] = 0
The problem is: how to apply the above function to each row and include the result in these new columns? Can you give me a suggestion?
Consider shift to create adjacent columns, w2, x2, y2, z2, of next row values then run rowwise apply which does require axis='columns' (not index):
df[[col+'2' for col in list('wxyz')]] = df[['x', 'y', 'z', 'w']].shift(-1)
def quaternion_distances(row):
""" Create two Quaternions objects and calculate 3 distances between them """
q1 = Quaternion(row['w'], row['x'], row['y'], row['z'])
q2 = Quaternion(row['w2'], row['x2'], row['y2'], row['z2'])
row['dist_by_signal'] = Quaternion.absolute_distance(q1, q2)
row['dist_geodesic'] = Quaternion.distance(q1, q2)
row['dist_sim_geodec'] = Quaternion.sym_distance(q1, q2)
return row
df = df.apply(quaternion_distances, axis='columns')
print(df)
You can use.
Quaternions=df.apply(lambda x: Quaternion(x), axis=1)
df['dist_by_signal'] = 0
df['dist_geodesic'] = 0
df['dist_sim_geodec'] = 0
df.reset_index(drop=True)
for i in df.index:
q1=Quaternions[i]
if i+1<len(df.index):
q2=Quaternions[i+1]
df.loc[i,['dist_by_signal','dist_geodesic','dist_sim_geodec']]=[Quaternion.absolute_distance(q1, q2), Quaternion.distance(q1, q2),Quaternion.sym_distance(q1, q2)]
print(df)
x y z w dist_by_signal dist_geodesic \
0 -0.75853 -0.75853 -0.75853 -0.75852 0.248355 0.178778
1 -0.63435 -0.63434 -0.63435 -0.63436 1.058875 1.799474
2 -0.10488 -0.10490 -0.10492 -0.10495 0.002111 0.010010
3 -0.10597 -0.10597 -0.10597 -0.10596 0.000000 0.000000
dist_sim_geodec
0 0.178778
1 1.799474
2 0.010010
3 0.000000
I have x2, x3, y2, y3, d1, d2, d3 values which is,
x2 = 0
x3 = 100
y2 = 0
y3 = 0
d1 = 100
d2 = 100
d3 = 87
When I use the below script,
from sympy import symbols, Eq, solve
x, y = symbols('x y')
eq1 = Eq((x - x2) ** 2 + (y - y2) ** 2 - d2 ** 2)
eq2 = Eq((x - x3) ** 2 + (y - y3) ** 2 - d3 ** 2)
sol_dict = solve((eq1, eq2), (x, y))
I got the ans as,
sol_dict = [(12431/200, -87*sqrt(32431)/200), (12431/200, 87*sqrt(32431)/200)]
How can I achieve the simplified solution like
sol_dict = [(62.155, -78.33), (62.155, 78.33)]
in python?
You can numerically evaluate the solution to get floats:
In [40]: [[x.evalf(3) for x in s] for s in sol_dict]
Out[40]: [[62.2, -78.3], [62.2, 78.3]]
I would only recommend doing that for display though. If you want to use the values in sol_dict for further calculations it's best to keep them as exact rational numbers.
How to add aggregated error to keras model?
Having table:
g x y
0 1 1 1
1 1 2 2
2 1 3 3
3 2 1 2
4 2 2 1
I would like to be able to minimize sum((y - y_pred) ** 2) error along with
sum((sum(y) - sum(y_pred)) ** 2) per group.
I'm fine to have bigger individual sample errors, but it is crucial for me to have right totals.
SciPy example:
import pandas as pd
from scipy.optimize import differential_evolution
df = pd.DataFrame({'g': [1, 1, 1, 2, 2], 'x': [1, 2, 3, 1, 2], 'y': [1, 2, 3, 2, 1]})
g = df.groupby('g')
def linear(pars, fit=False):
a, b = pars
df['y_pred'] = a + b * df['x']
if fit:
sample_errors = sum((df['y'] - df['y_pred']) ** 2)
group_errors = sum((g['y'].sum() - g['y_pred'].sum()) ** 2)
total_error = sum(df['y'] - df['y_pred']) ** 2
return sample_errors + group_errors + total_error
else:
return df['y_pred']
pars = differential_evolution(linear, [[0, 10]] * 2, args=[('fit', True)])['x']
print('SAMPLES:\n', df, '\nGROUPS:\n', g.sum(), '\nTOTALS:\n', df.sum())
Output:
SAMPLES:
g x y y_pred
0 1 1 1 1.232
1 1 2 2 1.947
2 1 3 3 2.662
3 2 1 2 1.232
4 2 2 1 1.947
GROUPS:
x y y_pred
g
1 6 6 5.841
2 3 3 3.179
TOTALS:
g 7.000
x 9.000
y 9.000
y_pred 9.020
For grouping, as long as you keep the same groups throughout training, your loss function will not have problems about being not differentiable.
As a naive form of grouping, you can simply separate the batches.
I suggest a generator for that.
#suppose you have these three numpy arrays:
gTrain
xTrain
yTrain
#create this generator
def grouper(g,x,y):
while True:
for gr in range(1,g.max()+1):
indices = g == gr
yield (x[indices],y[indices])
For the loss function, you can make your own:
import keras.backend as K
def customLoss(yTrue,yPred):
return K.sum(K.square(yTrue-yPred)) + K.sum(K.sum(yTrue) - K.sum(yPred))
model.compile(loss=customLoss, ....)
Just be careful with the second term if you have negative values.
Now you train using the method fit_generator:
model.fit_generator(grouper(gTrain,xTrain, yTrain), steps_per_epoch=gTrain.max(), epochs=...)
firstly, apologize for little cryptic title to my question. Let me try to explain my need:-
I am reading two features namely X1, X2 from a CSV file. I have a training set of data in a csv file containing 1000 records with each line corresponding to the value of X1, X2. To make my training set fit better to my machine learning code, I want to do feature mapping that would take X1, X2 and create polynomial terms to the power of 4. for example if X1 =a, X2=b, I want to add newer features a^2, a*b, b^2, a^3,a^2*b,a*b^2,a^4...and so on.
Now if I read them as a numpy matrix , I want to see the data like this:
[ [ 1 a b a^2 a*b, b^2 a^3 a^2*b......]
[.... ............ ............ ]
[ ..
..] ]
Note that the number of rows are fixed , but the number of columns are determined by the degree selected. Also first three columns need to be
[[1 a b ..]
[1 c d ..]
..
..]
The pseudo code I am thinking of is as follows:-
def poly(X): # where X is a numpy matrix with X1, X2 columns,
degree = 4;
r= X.shape[0]
c=1 # number of columns
val_matrix= np.ones(shape=(r,c)) # creating a (r,1) matrix init with 1s
# *start of psuedo code*
while i<=degree:
while j <=i:
val_matrix[:, c+1] = (X1.^(i-j)).*(X2.^j)
I am not sure how to get this working in python?. would appreciate some suggestion. Note that ^ refers to the power of.
Starting with two vectors X1 and X2 you could create the monomials:
X1p = X1[:, None]**np.arange(max_deg + 1)
X2p = X2[:, None]**np.arange(max_deg + 1)
and then combine them using mgrid
i, j = np.mgrid[:max_deg + 1,:max_deg + 1]
m = i+j <= max_deg
result = X1p[:, i[m]]*X2p[:, j[m]]
Alternatively you could apply the indices directly to X1 and X2:
result = X1[:, None]**i[m] * X2[:, None]**j[m]
This requires fewer lines of code but uses more multiplications.
If the number of multiplications is a concern, X1p and X2p could also be computed cheaper; X1p:
X1p = np.empty((len(X1), max_deg + 1), X1.dtype)
X1p[:, 0] = 1
X1p[:, 1:] = X1[:, None]
np.multiply.accumulate(X1p[:,1:], axis=-1, out=X1p[:, 1:])
and similar for X2p