Use a Kalman filter to estimate position

I am trying to use a Kalman filter to estimate position. The input to the system is the velocity, which is also what I measure. The velocity is not constant; the motion of the system is roughly cosine-shaped. So the equation is:
xnew = Ax + Bu + w, where:
x= [x y]'
A = [1 0; 0 1]
B= [dt 0; 0 dt]
u=[ux uy]
w noise
As I mentioned, what I measure is the velocity. My question is what the matrix C would look like in the equation:
y= Cx + v
Should I include the velocity in the estimated states (matrix A)? Or should I change the equations to also include the acceleration? I can't measure the acceleration.

One way would be to drop the velocities as inputs and put them in your state. This way, your state is both the position and velocity and your filter uses as observation both the measured speed of your vehicle and a noisy estimate of your position.
With this system your problem becomes:
x = [x_e y_e vx_e vy_e]'
A = [1 0 dt 0; 0 1 0 dt; 0 0 1 0; 0 0 0 1]
w noise
with x_e, y_e, vx_e, and vy_e the estimated values of the state
B is removed because u is 0. And then you have
y = Cx + v
with C = [1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1]
with y = [x + dt*vx ; y + dt*vy ; vx ; vy], where vx and vy are the measured velocities and x and y are the positions calculated from the measured velocities.
It is very similar to the example you will find here on Wikipedia
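A minimal Python sketch of the augmented-state filter described above, assuming NumPy is available. The time step, the noise covariances Q and R, the simulated cosine-like velocity, and the helper name kalman_step are illustrative assumptions, not values taken from the question.

```python
import numpy as np

def kalman_step(x_est, P, z, A, C, Q, R):
    """One predict/update cycle of a linear Kalman filter (hypothetical helper)."""
    # Predict the state and its covariance one step ahead.
    x_pred = A @ x_est
    P_pred = A @ P @ A.T + Q
    # Update with the measurement z.
    S = C @ P_pred @ C.T + R
    K = P_pred @ C.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - C @ x_pred)
    P_new = (np.eye(len(x_est)) - K @ C) @ P_pred
    return x_new, P_new

dt = 0.1  # assumed sample time
A = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)
C = np.eye(4)                          # every state component is "observed"
Q = np.diag([1e-4, 1e-4, 1e-1, 1e-1])  # assumed process noise
R = np.diag([1e-1, 1e-1, 1e-2, 1e-2])  # assumed measurement noise

rng = np.random.default_rng(0)
x_est = np.zeros(4)   # state estimate [x, y, vx, vy]
P = np.eye(4)
true_pos = np.zeros(2)

for k in range(200):
    t = k * dt
    true_vel = np.array([np.cos(t), np.sin(t)])  # assumed cosine-like motion
    true_pos = true_pos + dt * true_vel
    v_meas = true_vel + rng.normal(0.0, 0.1, size=2)
    # Position "measurement": previous estimate advanced by the measured velocity.
    pos_meas = x_est[:2] + dt * v_meas
    z = np.concatenate([pos_meas, v_meas])
    x_est, P = kalman_step(x_est, P, z, A, C, Q, R)

print("estimated position:", x_est[:2])
print("true position:     ", true_pos)
```

Because the position pseudo-measurement is derived from the same velocity reading, its error is correlated with the velocity error and accumulates over time; without an independent position sensor the position estimate is essentially filtered dead reckoning.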

Related

Get a certain combination of numbers in Python

Is there an efficient and convenient solution in Python to do something like -
Find largest combination of two numbers x and y, with the following conditions -
0 < x < 1000
0 < y < 2000
x/y = 0.75
x & y are integers
It's easy to do using a simple graphing calculator, but I am trying to find the best way to do it in Python.
import pulp
My_optimization_prob = pulp.LpProblem('My_Optimization_Problem', pulp.LpMaximize)
# Creating the variables
x = pulp.LpVariable("x", lowBound = 1, cat='Integer')
y = pulp.LpVariable("y", lowBound = 1, cat='Integer')
# Objective: maximize x + y
My_optimization_prob += x + y
# Adding the constraints
My_optimization_prob += x <= 999 # x < 1000
My_optimization_prob += y <= 1999 # y < 2000
My_optimization_prob += x - 0.75*y == 0 # x/y = 0.75
#Printing the Problem and Constraints
print(My_optimization_prob)
My_optimization_prob.solve()
#printing X Y
print('x = ',pulp.value(x))
print('y = ',pulp.value(y))
Probably just -
z = [(x, y) for x in range(1, 1000) for y in range(1, 2000) if x/y==0.75]
z.sort(key=lambda x: sum(x), reverse=True)
z[0]
#Returns (999, 1332)
This is convenient, not sure if this is the most efficient way.
Another possible relatively efficient solution is -
x_upper_limit = 1000
y_upper_limit = 2000
x = 0
y = 0
temp_variable = 0
ratio = 0.75
# Start at x_upper_limit - 1 because x must be strictly less than the limit
for i in range(x_upper_limit - 1, 0, -1):
    temp_variable = i/ratio
    if temp_variable.is_integer() and temp_variable < y_upper_limit:
        x = i
        y = int(temp_variable)
        break
print(x,y)
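Since x/y = 0.75 = 3/4 forces x = 3k and y = 4k for some positive integer k, the search can also be replaced by a closed-form calculation. This is a different technique from the answers above, sketched here with the standard-library Fraction type; the function name largest_pair is hypothetical.

```python
from fractions import Fraction

def largest_pair(ratio, x_limit, y_limit):
    """Largest integers (x, y) with x/y == ratio, 0 < x < x_limit, 0 < y < y_limit."""
    # In lowest terms ratio = p/q, so every solution is x = k*p, y = k*q.
    p, q = ratio.numerator, ratio.denominator
    k = min((x_limit - 1) // p, (y_limit - 1) // q)
    if k < 1:
        return None  # no pair satisfies the bounds
    return k * p, k * q

print(largest_pair(Fraction(3, 4), 1000, 2000))  # (999, 1332)
```

Using an exact Fraction avoids any reliance on floating-point equality, which only happens to be safe for 0.75 because it is exactly representable in binary.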

Complex indexing without loop

I have an array of shape Nx5. The last 3 columns are the x, y, and z of different volumes, which are grouped by the values in the first two columns. For example:
[[0 0 x0 y0 z0],
[0 0 x1 y1 z1],
[0 0 x2 y2 z2],
[0 1 x3 y3 z3],
[0 1 x4 y4 z4],
[1 0 x5 y5 z5],
[1 0 x6 y6 z6],
[1 1 x7 y7 z7],
[1 1 x8 y8 z8],
[2 0 x9 y9 z9],
[2 0 x10 y10 z10],
[2 0 x11 y11 z11],
[2 1 x12 y12 z12]]
The number of volumes for each of the first two dimensions varies every time. I want to calculate the mean of x, y, and z for each volume in each dimension. This should result in something like this:
[[0 0 xmean0 ymean0 zmean0],
[0 1 xmean1 ymean1 zmean1],
[1 0 xmean2 ymean2 zmean2],
[1 1 xmean3 ymean3 zmean3],
[2 0 xmean4 ymean4 zmean4],
[2 1 xmean5 ymean5 zmean5]]
In other words, it should have the mean for each combination of the first two elements. I cannot use loops for this, only numpy and/or tensorflow.
We will assume the input array is a.
Approach #1 : With bincount -
unq_comb,ids, w = np.unique(a[:,:2], axis=0, return_inverse=1, return_counts=1)
out = np.empty((len(unq_comb),5))
out[:,:2] = unq_comb
for i in [2,3,4]:
    out[:,i] = np.bincount(ids, a[:,i])/w
Approach #2 : With sorting -
sidx = np.lexsort(a[:,:2].T)
b = a[sidx]
idx = np.flatnonzero(np.r_[True,(b[:-1,:2] != b[1:,:2]).any(1),True])
w = np.diff(idx)[:,None].astype(float)
out = np.empty((len(idx)-1,5))
out[:,:2] = b[idx[:-1],:2]
out[:,2:] = np.add.reduceat(b[:,2:], idx[:-1], axis=0)/w
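If even the short loop over the three value columns has to go, one possible variant accumulates all columns at once with np.add.at. This is a sketch with made-up sample data; the variable names are illustrative, and the ravel() call is a defensive assumption to keep the inverse index one-dimensional across NumPy versions.

```python
import numpy as np

# Made-up sample: two key columns followed by x, y, z.
a = np.array([[0, 0, 1.0, 2.0, 3.0],
              [0, 0, 3.0, 4.0, 5.0],
              [0, 1, 1.0, 1.0, 1.0],
              [1, 0, 2.0, 4.0, 6.0],
              [1, 0, 4.0, 8.0, 0.0]])

# Unique key pairs, the group index of every row, and the group sizes.
keys, inverse, counts = np.unique(a[:, :2], axis=0,
                                  return_inverse=True, return_counts=True)
inverse = inverse.ravel()

# Unbuffered scatter-add of the x, y, z columns into one row per group.
sums = np.zeros((len(keys), 3))
np.add.at(sums, inverse, a[:, 2:])

out = np.hstack([keys, sums / counts[:, None]])
print(out)
```

np.add.at is typically slower than bincount for large inputs, so this trades some speed for a fully loop-free expression.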

Finding relative position in plus shape

I am making a dithering library. To find the relative position of an absolute point a in a 2-dimensional plane tiled with 4-unit squares, I use rel.x = abs.x % 4; rel.y = abs.y % 4. This works and produces the expected results. But what if I am tiling the plane with plus shapes, which are 3 units across? How do I find the relative position? The tile shape is shown here; 1's are parts of the shape, and 0's are empty areas.
0 1 0
1 1 1
0 1 0
For example, if point a rests on x = 1, y = 1, then the relative position should be x = 1, y = 1. But if it is on, say, x = 4, y = 1, then the relative position should be x = 1, y = 2. You see, there would be another plus whose bottom is on the point x = 1, y = 2. How is this accomplished mathematically? Any language is fine, and pseudocode is great too. :)
There is periodicity along the X and Y axes with period 5. So a long switch expression might look like:
case y % 5 of:
  0: case x % 5 of
       0: cx = x - 1; cy = y;
       1: cx = x; cy = y + 1;
       2: cx = x; cy = y - 1;
       3: cx = x + 1; cy = y;
       4: cx = x; cy = y;
  1:...
Or we can create a constant 5x5 array and fill it with the shifts -1, 0, 1.
dx: [[-1,0,0,1,0],[1,0,-1,0,0],[0,0,1,0,-1],[0,-1,0,0,1],[0,1,0,-1,0]]
dy: [[0,1,-1,0,0],[0,0,0,1,-1],[1,-1,0,0,0],[0,0,1,-1,0],[-1,0,0,0,1]]
I feel that some simple formula might exist.
Edit: simpler version:
const dx0: [-1,0,0,1,0]
const dy0: [0,1,-1,0,0]
ixy = (x - 2 * y + 10) % 5;
dx = dx0[ixy];
dy = dy0[ixy];
And finally crazy one-liners without constant arrays
dx = ((((11 + x - 2 * (y%5)) % 5) ^ 1) - 2) / 2 // ^ is xor; / is integer division truncating toward zero
dy = ((13 + x - 2 * (y%5)) % 5 - 2) / 2
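The lookup-table version translates to Python roughly as follows. This is a sketch: the function names are hypothetical, and the relative position is assumed to be measured inside the 3x3 box of the plus, with the center at (1, 1), which matches the example in the question. Python's % returns a non-negative result for a positive modulus, so the + 10 offset is not needed and negative coordinates work too.

```python
# Shift from a cell to the center of the plus that contains it,
# indexed by (x - 2*y) mod 5.
DX0 = (-1, 0, 0, 1, 0)
DY0 = (0, 1, -1, 0, 0)

def plus_center(x, y):
    """Center of the plus tile covering the cell (x, y)."""
    i = (x - 2 * y) % 5
    return x + DX0[i], y + DY0[i]

def relative_position(x, y):
    """Position of (x, y) inside its plus, with the center at (1, 1)."""
    cx, cy = plus_center(x, y)
    return x - cx + 1, y - cy + 1

print(relative_position(1, 1))  # (1, 1)
print(relative_position(4, 1))  # (1, 2)
```

Every center returned by plus_center satisfies (cx - 2*cy) % 5 == 4, which is a quick way to check that the tiling is consistent.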

How term frequency is calculated in TfidfVectorizer?

I searched a lot to understand this but have not been able to. I understand that by default TfidfVectorizer applies l2 normalization to the term frequency. This article explains the equation for it. I am using TfidfVectorizer on text written in Gujarati. The details of the output follow:
My two documents are:
ખુબ વખાણ કરે છે
ખુબ વધારે છે
The code I am using is:
vectorizer = TfidfVectorizer(tokenizer=tokenize_words, sublinear_tf=True, use_idf=True, smooth_idf=False)
Here, tokenize_words is my function for tokenizing words.
The list of TF-IDF of my data is:
[[ 0.6088451 0.35959372 0.35959372 0.6088451 0. ]
[ 0. 0.45329466 0.45329466 0. 0.76749457]]
The list of features:
['કરે', 'ખુબ', 'છે.', 'વખાણ', 'વધારે']
The value of idf:
{'વખાણ': 1.6931471805599454, 'છે.': 1.0, 'કરે': 1.6931471805599454, 'વધારે': 1.6931471805599454, 'ખુબ': 1.0}
Please explain, for this example, what the term frequency of each term in both of my documents is.
OK, now let's go through the documentation I linked in the comments step by step:
Documents:
ખુબ વખાણ કરે છે
ખુબ વધારે છે
Get all unique terms (features): ['કરે', 'ખુબ', 'છે.', 'વખાણ', 'વધારે']
Calculate frequency of each term in documents:-
a. Each term in document1 [ખુબ વખાણ કરે છે] is present once, and વધારે is not present.
b. So the term frequency vector (sorted according to features): [1 1 1 1 0]
c. Applying steps a and b on document2, we get [0 1 1 0 1]
d. So our final term-frequency vector is [[1 1 1 1 0], [0 1 1 0 1]]
Note: This is the term frequency you want
Now find IDF (This is based on features, not on document basis):
idf(term) = log(number of documents/number of documents with this term) + 1
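The 1 added outside the logarithm ensures that terms occurring in every document (for which the log term is 0) are not ignored entirely. It is always added. The "smooth_idf" parameter, which is True by default, instead adds 1 to the numerator and denominator inside the logarithm to prevent zero divisions; you set it to False, so the formula above applies.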
idf('કરે') = log(2/1)+1 = 0.69314.. + 1 = 1.69314..
idf('ખુબ') = log(2/2)+1 = 0 + 1 = 1
idf('છે.') = log(2/2)+1 = 0 + 1 = 1
idf('વખાણ') = log(2/1)+1 = 0.69314.. + 1 = 1.69314..
idf('વધારે') = log(2/1)+1 = 0.69314.. + 1 = 1.69314..
Note: This corresponds to the data you showed in question.
Now calculate TF-IDF (This again is calculated document-wise, calculated according to sorting of features):
a. For document1:
For 'કરે', tf-idf = tf(કરે) x idf(કરે) = 1 x 1.69314 = 1.69314
For 'ખુબ', tf-idf = tf(ખુબ) x idf(ખુબ) = 1 x 1 = 1
For 'છે.', tf-idf = tf(છે.) x idf(છે.) = 1 x 1 = 1
For 'વખાણ', tf-idf = tf(વખાણ) x idf(વખાણ) = 1 x 1.69314 = 1.69314
For 'વધારે', tf-idf = tf(વધારે) x idf(વધારે) = 0 x 1.69314 = 0
So for document1, the final tf-idf vector is [1.69314 1 1 1.69314 0]
b. Now normalization is done (l2 Euclidean):
divisor = sqrt(sqr(1.69314)+sqr(1)+sqr(1)+sqr(1.69314)+sqr(0))
= sqrt(2.8667230596 + 1 + 1 + 2.8667230596 + 0)
= sqrt(7.7334461192)
= 2.7809074272977876...
Dividing each element of the tf-idf array by the divisor, we get:
[0.6088445 0.3595948 0.3595948 0.6088445 0]
Note: This is the tf-idf of the first document you posted in the question (up to rounding of the idf values).
c. Now do the same steps a and b for document 2, we get:
[ 0. 0.453294 0.453294 0. 0.767494]
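The steps above can be reproduced by hand with NumPy. This is a sketch that starts from the term-frequency matrix derived earlier and uses the unsmoothed idf formula log(n/df) + 1; it does not call scikit-learn, and the variable names are illustrative.

```python
import numpy as np

# Term-frequency matrix; columns follow the sorted feature list.
tf = np.array([[1, 1, 1, 1, 0],
               [0, 1, 1, 0, 1]], dtype=float)

n_docs = tf.shape[0]
df = np.count_nonzero(tf, axis=0)   # number of documents containing each term
idf = np.log(n_docs / df) + 1.0     # unsmoothed idf (smooth_idf=False)

tfidf = tf * idf
# l2-normalize each document (row) vector.
tfidf = tfidf / np.linalg.norm(tfidf, axis=1, keepdims=True)

print(np.round(idf, 6))
print(np.round(tfidf, 6))
```

With full-precision idf values the rows agree with the matrix posted in the question to the printed precision.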
Update: About sublinear_tf = True OR False
Your original term frequency vector is [[1 1 1 1 0], [0 1 1 0 1]] and you are correct in your understanding that using sublinear_tf = True will change the term frequency vector.
new_tf = 1 + log(tf)
Now the above line only applies to the non-zero elements of the term-frequency vector, because log(0) is undefined.
All your non-zero entries are 1, and 1 + log(1) = 1 + 0 = 1.
So the values remain unchanged for elements with value 1, and new_tf = [[1 1 1 1 0], [0 1 1 0 1]] = tf(original).
In other words, sublinear_tf is applied to your term frequencies, but in this example it leaves them unchanged.
And hence all below calculations will be same and output is same if you use sublinear_tf=True OR sublinear_tf=False.
If you change your documents so that the term-frequency vector contains elements other than 1 and 0, you will see differences when using sublinear_tf.
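To illustrate with counts other than 0 and 1, here is a small NumPy sketch of the 1 + log(tf) scaling applied only to the non-zero entries. The counts are made up for illustration.

```python
import numpy as np

tf = np.array([[1.0, 2.0, 0.0, 4.0]])  # made-up raw counts

sublinear = tf.copy()
nonzero = tf > 0
# Zero counts stay zero; non-zero counts become 1 + ln(count).
sublinear[nonzero] = 1.0 + np.log(tf[nonzero])

print(np.round(sublinear, 4))  # [[1.     1.6931 0.     2.3863]]
```

A count of 1 maps to 1, while larger counts grow only logarithmically, which is why the example in the question is unaffected.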
Hope your doubts are cleared now.

How to draw normal vectors to an ellipse

How do I draw an ellipse with lines of the same length coming out of it?
It's easy to do with a circle, I can just write something like
for (u = 0 ; u < 2*pi ; u += 0.001*pi) {
    drawdot (cos(u), sin(u)) ;
    drawline (cos(u), sin(u), 2*cos(u), 2*sin(u)) ;
}
But if I did that for an ellipse, like below, the lines are different lengths.
for (u = 0 ; u < 2*pi ; u += 0.001*pi) {
    drawdot (2*cos(u), sin(u)) ;
    drawline (2*cos(u), sin(u), 4*cos(u), 2*sin(u)) ;
}
How do I figure out how to make them the same length?
There are a few ways of thinking about this.
You can think of an ellipse as a circle that's been stretched in some direction. In this case, you've taken the circle x^2 + y^2 = 1 and applied the transformation to all points on that curve:
x' = 2x
y' = y
You can think of this as multiplying by the matrix:
[ 2 0 ]
[ 0 1 ]
To transform normals, you need to apply the inverse transpose of this matrix (i.e. the inverse of the transpose, or transpose of the inverse; it's the same thing):
[ 1/2 0 ]
[ 0 1 ]
(This, by the way, is known as the dual of the previous transformation. This is a very important operation in modern geometry.)
A normal to the circle at the point (x,y) goes in the direction (x,y). So a normal to the ellipse at the point (2x,y) goes in the direction (0.5*x,y). This suggests:
for (u = 0 ; u < 2*pi ; u += 0.001*pi) {
    x = cos(u); y = sin(u);
    drawdot (2*x, y) ;
    drawline (2*x, y, 2*x + 0.5*x, y+y);
}
Or if you need a unit normal:
for (u = 0 ; u < 2*pi ; u += 0.001*pi) {
    x = cos(u); y = sin(u);
    drawdot (2*x, y) ;
    dx = 0.5*x;
    dy = y;
    invm = 1 / sqrt(dx*dx + dy*dy);
    drawline (2*x, y, 2*x + dx * invm, y + dy * invm);
}
Another way to think about it is in terms of an implicit contour. If you define the curve by a function:
f(x,y) = 0
then the normal vector points in the direction:
(df/dx, df/dy)
where the derivatives are partial derivatives. In your case:
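f(x,y) = (x/2)^2 + y^2 - 1 = 0
df/dx = x/2
df/dy = 2y
This direction is proportional to (x/4, y). At the ellipse point (2*cos(u), sin(u)) it becomes (0.5*cos(u), sin(u)), which, you will note, is the same direction as the one given by the dual transformation.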
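A small Python sketch of the unit-normal construction, generalized to semi-axes a and b. It returns line segments instead of drawing them, since the drawing calls in the question are pseudocode; the function name and parameters are hypothetical.

```python
import math

def ellipse_normal_segments(a=2.0, b=1.0, length=1.0, steps=200):
    """Segments (x0, y0, x1, y1) of equal length, normal to the ellipse
    (a*cos(u), b*sin(u))."""
    segments = []
    for i in range(steps):
        u = 2 * math.pi * i / steps
        px, py = a * math.cos(u), b * math.sin(u)
        # Inverse-transpose of diag(a, b) applied to the circle normal.
        nx, ny = math.cos(u) / a, math.sin(u) / b
        scale = length / math.hypot(nx, ny)
        segments.append((px, py, px + nx * scale, py + ny * scale))
    return segments

segments = ellipse_normal_segments()
print(segments[0])  # (2.0, 0.0, 3.0, 0.0)
```

Each segment starts on the ellipse, has exactly the requested length, and is perpendicular to the tangent direction (-a*sin(u), b*cos(u)) at its starting point.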

Resources