I'm looking to piggyback a float value on a 4x4 matrix I'm transferring. The matrix in question is used for various XYZ vector transforms. As far as I understand, the first 3x3 of the matrix handles the rotation and scale transforms, the 4th element of each of the first 3 rows offsets the pivot of the transform, and the first 3 elements of the bottom row do the positional offset. But what does the last element do? As far as I've seen it is always 1 and does absolutely nothing, so is there any harm in me putting it to good use?
If (as your question suggests) the convention is that points are column vectors, and the last elements of the first 3 rows determine the translation, then the first 3 elements of the bottom row are not a positional offset: the 4th row of a transformation matrix is used for perspective projection, which is how the camera maps 3D points onto the 2D viewport. A sketch of how the 4x4 matrix is used to map points:
result = 4x4 matrix * point

[ x' ]   [ Rxx Rxy Rxz Tx ]   [ x ]
[ y' ] = [ Ryx Ryy Ryz Ty ] * [ y ]    R** is the rotation/scaling matrix
[ z' ]   [ Rzx Rzy Rzz Tz ]   [ z ]    T*  is the translation vector
[ w' ]   [ Px  Py  Pz  Pw ]   [ w ]    P*  fixes the camera projection plane

-> x' = dot_product3([Rxx,Rxy,Rxz], [x,y,z]) + Tx * w
-> y' = dot_product3([Ryx,Ryy,Ryz], [x,y,z]) + Ty * w
-> z' = dot_product3([Rzx,Rzy,Rzz], [x,y,z]) + Tz * w
-> w' = dot_product4([Px,Py,Pz,Pw], [x,y,z,w])
The final step in the 3D point -> 2D screen mapping is to divide by the 4th (w') coordinate -- so, for standard geometric transformations, the last row should generally be [0,0,0,1], which leaves the w coordinate unchanged.
If the first 3 elements of the bottom row are not 0, the matrix will typically generate a weird, nonuniform distortion of the 3D space. And, although it is possible to use the last element as a uniform inverse scaling factor, it is probably better to do your scaling with the upper left 3x3 submatrix, and leave the bottom row exclusively for the camera.
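As a quick illustration of the last row's role, here is a minimal NumPy sketch (names are mine, not from any particular library) that applies a 4x4 homogeneous transform, including the divide by w':

```python
import numpy as np

def transform_point(m, p):
    """Apply 4x4 homogeneous matrix m to 3D point p, including the w' divide."""
    x, y, z, w = m @ np.array([p[0], p[1], p[2], 1.0])
    return np.array([x, y, z]) / w

M = np.eye(4)
M[:3, 3] = [10, 20, 30]               # pure translation; bottom row is [0,0,0,1]
print(transform_point(M, [1, 2, 3]))  # w' = 1, so the divide is a no-op

M[3, 3] = 2.0                         # a "piggybacked" value in the corner...
print(transform_point(M, [1, 2, 3]))  # ...uniformly halves every coordinate
```

With the standard [0,0,0,1] bottom row the divide is a no-op, which is why the last element appears to "do nothing"; change it and every transformed point gets uniformly scaled, so smuggling data there will corrupt results for any consumer that performs the homogeneous divide.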
I have two 4x4 affine matrices, A and B. They represent the poses of two objects in the world coordinate system.
How can I calculate their relative pose via matrix multiplication? (Actually, I want to know the position (x_A, y_A) in the coordinate system of object B.)
I've tried with relative pose = A * B^-1
relative_pose = torch.multiply(A, torch.inverse(B)).
However, the relative translation is way too big. (A and B are pretty close to each other, while both are far away from the origin of the world coordinate system.)
test data for pytorch:
import torch
A = torch.tensor([[-9.3793e-01, -3.4481e-01, -3.7340e-02, -4.6983e+03],
[ 3.4241e-01, -9.3773e-01, 5.8526e-02, 1.0980e+04],
[-5.5195e-02, 4.2108e-02, 9.9759e-01, -2.3445e+01],
[ 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00]])
B = torch.tensor([[-9.7592e-01, -2.1022e-01, -5.8136e-02, -4.6956e+03],
[ 2.0836e-01, -9.7737e-01, 3.6429e-02, 1.0979e+04],
[-6.4478e-02, 2.3438e-02, 9.9764e-01, -2.3251e+01],
[ 0.0000e+00, 0.0000e+00, 0.0000e+00, 1.0000e+00]])
So I assume you are using rigid transformation matrices M in homogeneous coordinates, in other words 4x4 matrices containing a 3x3 rotation matrix R, a 3x1 translation vector T, and a [0,0,0,1] homogeneous "padding" row vector; written in blocks, that would be something like (R | T \\ 0 | 1). And you want to find the transformation to go from one pose to the other.
Then I think your formula is wrong: if Y_1 = M_1 X and Y_2 = M_2 X, then Y_2 = M_2 M_1^-1 Y_1, and your relative pose matrix is M_rel = M_2 M_1^-1.
So you need to invert your rigid transformation matrix M_1 = (R_1 | T_1 \\ 0 | 1).
If you write out the equations, and if we note P = R_1^-1, then you'll find that M_1^-1 = (P | -P T_1 \\ 0 | 1).
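Under these assumptions (R orthonormal, so P = R^-1 = R^T), the inverse can be written out directly. A small NumPy sketch (the helper name is mine):

```python
import numpy as np

# Closed-form inverse of a rigid transform M = (R | T \\ 0 | 1):
# with P = R^-1 = R^T, the inverse is (P | -P T \\ 0 | 1).
def rigid_inverse(m):
    r, t = m[:3, :3], m[:3, 3]
    inv = np.eye(4)
    inv[:3, :3] = r.T        # P = R^T
    inv[:3, 3] = -r.T @ t    # -P T
    return inv
```

This avoids a general 4x4 inversion and keeps the result exactly rigid.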
The math in #trialNerror's solution is totally right. Here is a well-structured answer about calculating the inverse of an affine matrix.
I made a mistake in PyTorch: torch.multiply performs element-wise multiplication. For multiplying matrices, one should use torch.mm().
In my case, with batch as an extra dimension, the code should look like this:
relative_pose = torch.inverse(A).bmm(B)
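The fix can also be sanity-checked against the test data above without PyTorch. A NumPy sketch (assuming M_rel = A^-1 B, i.e. the pose of B expressed in A's frame):

```python
import numpy as np

# The posted test poses, as plain NumPy arrays.
A = np.array([[-9.3793e-01, -3.4481e-01, -3.7340e-02, -4.6983e+03],
              [ 3.4241e-01, -9.3773e-01,  5.8526e-02,  1.0980e+04],
              [-5.5195e-02,  4.2108e-02,  9.9759e-01, -2.3445e+01],
              [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  1.0000e+00]])
B = np.array([[-9.7592e-01, -2.1022e-01, -5.8136e-02, -4.6956e+03],
              [ 2.0836e-01, -9.7737e-01,  3.6429e-02,  1.0979e+04],
              [-6.4478e-02,  2.3438e-02,  9.9764e-01, -2.3251e+01],
              [ 0.0000e+00,  0.0000e+00,  0.0000e+00,  1.0000e+00]])

# A real matrix product, not element-wise multiplication:
relative_pose = np.linalg.inv(A) @ B
print(np.linalg.norm(relative_pose[:3, 3]))  # a few units, as expected for nearby poses
```

With the element-wise torch.multiply, the translation column keeps its world-scale magnitude (thousands of units), which matches the "way too big" symptom described above.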
I am using turfjs and leaflet to plot a grid and label each square in this fashion:
[A0,A1,...,A23]
[B0,B1,...,B23]
[C0,C1,...,C23]
End goal:
To know the coordinates of the corner points of each cell. I mean, I want to know the coordinates of the 4 corners of A0 (and the other cells). This will then be fed to a JSON file with something like this:
[
{"A0": [
["x","y"],
["x","y"],
["x","y"],
["x","y"]
]},
{"A1": [
["x","y"],
["x","y"],
["x","y"],
["x","y"]
]}
]
Then, my app will ask the device for its GPS position and learn which "square" I'm in.
I have managed to plot the squares (fiddle), but could not label them or even summon a click to console to find out the corner coordinates. I have consoled out the layers, but I'm not sure whether the geoJson layer is plotted from left to right. I have concluded that each layer spits out 5 coordinates, which I suspect is the information I require, but a 5th coordinate does not make sense for a square grid cell, unless the 3rd coordinate is the center...
I was able to figure out the mystery of the GeoJson layer in leaflet.
The coordinates are returned like this:
[ 0, 3 , 6 ]
[ 1, 4 , 7 ]
[ 2, 5 , 8 ]
//will label this way:
A0 = 0 ( coordinate sets at 0 )
A1 = 1 ( coordinate sets at 1 )
A2 = 2
B0 = 3
B1 = 4
B2 = 5
...
I still don't know why there is a 5th coordinate in each layer plotted by leaflet, but this is good enough for me. I can now label them as I want.
Thank you for the help.
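For later readers: the 5th coordinate is almost certainly the GeoJSON ring-closure convention, where a polygon's linear ring repeats its first position as its last. A small sketch (the ring values are hypothetical) of extracting the 4 corners for the JSON file described above:

```python
import json

# GeoJSON closes each polygon ring by repeating the first position as the
# last one -- which explains the "mystery" 5th coordinate per square.
# Hypothetical ring for one grid cell:
ring = [[0.0, 0.0], [0.1, 0.0], [0.1, 0.1], [0.0, 0.1], [0.0, 0.0]]

corners = ring[:-1]      # drop the duplicated closing point: 4 corners remain
cell = {"A0": corners}   # matches the desired {"A0": [[x, y], ...]} shape
print(json.dumps(cell))
```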
I start with one coordinate system and a point along the Z axis given by P1 = [0 0 h]
and a point in the XY plane given by P2 = [h*tan(A), h*tan(B), 0]
I then solve for the equation of a plane perpendicular to a vector which points from P2 to P1 so, Vector = P1 - P2. The Plane equation I get is the following:
X*h*tan(A)-Y*h*tan(B)+Z*h = 0.
Now I am given four points relative to this plane; the origin of this new plane is the same as P2 in the original system.
The four points make a rectangle and are:
[L*tan(C), L*tan(D), 0]
[L*tan(C), -L*tan(D), 0]
[-L*tan(C), -L*tan(D), 0]
[-L*tan(C), L*tan(D), 0]
How exactly would I go about converting these points into the original coordinate system?
I know that there's a translation and a rotation involved, and when I googled my problem I could only find cases treating translation and rotation separately, and none which were a combination of the two.
How exactly does the rotational transformation work? What if I already know it rotates about the X axis by A degrees and about the Y axis by B degrees? Can I then just do simple trig to back out the values? Or is it not that simple, and do I have to rotate it about some axis to match it back to the original coordinate system?
Is there a function in MATLAB into which I can insert, say, 3 points defining one coordinate system and another 3 points for a second coordinate system, and which would then give me the transformation matrices?
Please let me know if my wording is unclear; this 3-D problem is so hard to visualize that I can't figure out how to write all the trig for it and wanted to try it mathematically... but if you happen to know a simpler, more straightforward solution, please suggest it!
Answering the pared down version of the question (see the comments):
A point expressed in a coordinate system given by an origin point O1, with axis vectors X1, Y1 and Z1 has coordinates P=(x1, y1, z1) in that coordinate system. Similarly, in a second coordinate with origin O2 and axis vectors X2, Y2 and Z2, the same point is expressed P=(x2, y2, z2). (Note the lower case for coordinates, upper case for points and vectors).
What this actually means is:
P = O1 + x1 X1 + y1 Y1 + z1 Z1 and
P = O2 + x2 X2 + y2 Y2 + z2 Z2
Setting these equal to each other, and writing them in matrix form:
[ O11 ]   [ X11 Y11 Z11 ][ x1 ]   [ O21 ]   [ X21 Y21 Z21 ][ x2 ]
[ O12 ] + [ X12 Y12 Z12 ][ y1 ] = [ O22 ] + [ X22 Y22 Z22 ][ y2 ]
[ O13 ]   [ X13 Y13 Z13 ][ z1 ]   [ O23 ]   [ X23 Y23 Z23 ][ z2 ]
(the axis vectors are the columns of each matrix, so the first product is x1 X1 + y1 Y1 + z1 Z1)
Let's call the matrices on each side M1 and M2 respectively, use the origin points as column vectors, and call the column point vectors p1 and p2. Then we can write the previous equation as:
O1 + M1 p1 = O2 + M2 p2
If your coordinate axes for each system are linearly independent, then M1 and M2 are invertible. If in addition they are orthonormal, then the inverse of each is just its transpose! So we get:
p1 = Transpose[M1] (O2 - O1 + M2 p2) and similarly going the other way
p2 = Transpose[M2] (O1 - O2 + M1 p1)
You can read a more general treatment of change of basis here, but I think my stripped-down treatment will get you writing code faster.
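The two formulas above translate almost directly into code. A NumPy sketch (function names are mine), with the axis vectors stored as the columns of M1 and M2 and assumed orthonormal:

```python
import numpy as np

def to_system2(p1, O1, M1, O2, M2):
    """p2 = Transpose[M2] (O1 - O2 + M1 p1), for orthonormal axis columns."""
    return M2.T @ (O1 - O2 + M1 @ p1)

def to_system1(p2, O1, M1, O2, M2):
    """p1 = Transpose[M1] (O2 - O1 + M2 p2), for orthonormal axis columns."""
    return M1.T @ (O2 - O1 + M2 @ p2)
```

A useful invariant to check: O1 + M1 p1 and O2 + M2 p2 must land on the same world point, and converting there and back must return the original coordinates.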
Having dealt with converting the Bezier Patches into triangles, I need to do a Binary Space Partition in order to draw the projected triangles using the Painter's Algorithm.
I've implemented the algorithm from Wikipedia with much help with the math.
But it's making a Charlie Brown tree! That is, most of the nodes have one branch completely empty. The whole strategy is wrong: since the teapot is essentially spherical, the entire shape is on only one "side" of any particular component triangle.
So I'm thinking I need partitioning planes arranged more like an apple-corer: all passing through the line of the y-axis. But I'm kind of going off book, you know? What's the best way to partition the teapot?
Here's my bsp-tree generator. It uses other functions posted in the linked question.
Edit: Extra juggling to avoid dictstackoverflow. Complete program available here (requires mat.ps and teapot). The numerical output shows the depth of the tree node under construction.
% helper functions to insert and remove triangles in lists
/unshift { % [ e1 .. eN ] e0 . [ e0 e1 .. eN ]
exch aload length 1 add array astore
} def
/shift { % [ e0 e1 .. eN ] . [ e1 .. eN ] e0
aload length 1 sub array astore exch
} def
/makebsp { % [ triangles ] . bsptree
count =
%5 dict %This is the tree node data structure
<</P[]/PM[]/plane[]/front[]/behind[]/F<<>>/B<<>>>>
begin
dup length 1 le{ % If 0 or 1 triangles
dup length 0 eq { % If 0 triangles
pop % discard
}{ % If 1 triangle
aload pop /P exch def % put triangle in tree node
}ifelse
}{ % length>1
shift /P exch def % P: Partitioning Polygon (triangle)
P transpose aload pop
[1 1 1] 4 array astore % make column vectors of homogeneous coords
/PM exch def
[ % Compute equation of the plane defined by P
PM 0 3 getinterval det
[ PM 0 get PM 2 get PM 3 get ] det
[ PM 0 get PM 1 get PM 3 get ] det
PM 1 3 getinterval det 3 mul
] /plane exch def
% iterate through remaining triangles, testing against plane, adding to lists
/front [] def
/behind [] def
{ %forall [P4 P5 P6] = [[x4 y4 z4][x5 y5 z5][x6 y6 z6]]
/T exch def
T transpose % [[x4 x5 x6][y4 y5 y6][z4 z5 z6]]
{aload pop add add} forall % (x4+x5+x6) (y4+y5+y6) (z4+z5+z6)
plane 2 get mul 3 1 roll % z|C| (x) (y)
plane 1 get mul 3 1 roll % y|B| z|C| (x)
plane 0 get mul % y|B| z|C| x|A|
plane 3 get add add add % Ax+By+Cz+D
0 le { /front front
}{ /behind behind
} ifelse
T unshift def
} forall
%front == ()= behind == flush (%lineedit)(r)file pop
% recursively build F and B nodes from front and behind lists
%/F front makebsp def
front currentdict end exch
makebsp
exch begin /F exch def
%/B behind makebsp def
behind currentdict end exch
makebsp
exch begin /B exch def
/front [] def
/behind [] def
} ifelse
currentdict end
} def
Output: (rendered image omitted)
BSP was invented for geometries like levels in Quake-like games, and it may be hard to use for some specific geometry sets. BSP uses one of the existing triangles to split your level, so just imagine how it would behave if you used it on a sphere...
For the teapot you could get better results using an octree, which doesn't need to split your geometry along existing triangles. But I'm not sure how well it works with the Painter's Algorithm.
If you really need to use a BSP tree here, then you should pick your triangles carefully. I don't understand all of your code, but I don't see this part in it. Just iterate over all triangles in your current tree branch and, for each of them, compute the number of triangles in front of it and behind it. The one with a similar number of front-triangles and back-triangles is usually the best one to use as a split plane.
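That heuristic can be sketched as follows (function names are mine; triangles are classified by the sign of Ax+By+Cz+D at their centroid, in the same spirit as the PostScript code above, though written in Python for brevity):

```python
# Score each candidate triangle's plane by how evenly it splits the others,
# and pick the most balanced one as the partitioning polygon.
def side(plane, tri):
    """Sign of Ax+By+Cz+D evaluated at the triangle's centroid."""
    a, b, c, d = plane
    cx = sum(v[0] for v in tri) / 3.0
    cy = sum(v[1] for v in tri) / 3.0
    cz = sum(v[2] for v in tri) / 3.0
    return a * cx + b * cy + c * cz + d

def best_splitter(triangles, plane_of):
    """plane_of(tri) -> (A, B, C, D); returns the index of the best choice."""
    best, best_score = 0, float("inf")
    for i, t in enumerate(triangles):
        plane = plane_of(t)
        front = sum(1 for j, u in enumerate(triangles)
                    if j != i and side(plane, u) <= 0)
        back = len(triangles) - 1 - front
        if abs(front - back) < best_score:
            best, best_score = i, abs(front - back)
    return best
```

A real implementation would also have to handle triangles straddling the plane (splitting them into fragments), which this sketch ignores.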
I didn't quite do an octree, but I modified the bsp-tree builder to use an explicit list of planes, which I filled with axis-aligned planes slicing the space -4 < x,y,z < 4.
/planelist [
0 .2 4 { /x exch def
[ 1 0 0 x ]
[ 1 0 0 x neg ]
[ 0 1 0 x ]
[ 0 1 0 x neg ]
[ 0 0 1 x ]
[ 0 0 1 x neg ]
} for
] def
Postscript program available here (requires mat.ps).
The lighter green artifact is the result of a "preview" shown during construction of the bsp. Once built, subsequent pages (images) are drawn quickly and with no artifact as the camera revolves around the teapot.
The join of the body with the spout and handles (not shown from this angle) still needs work.
With the bsp better behaved, backface culling isn't strictly necessary. But it makes the preview nicer.
Another way to improve the BSP for this image is to use a hierarchical decomposition. The teapot isn't just a bunch of bezier surfaces, it has some surfaces that describe the body, some others that describe the handle, the spout, the lid (, the bottom? ).
So the first few levels of the tree ought to be the top-level pieces. Is the handle in front of or behind the body? Is the spout in front of the body? Answers to these questions would be a useful guide for the painter's algorithm.
I'm trying to reconstruct 3D points from 2D image correspondences. My camera is calibrated. The test images are of a checkered cube and the correspondences are hand-picked. Radial distortion is removed. After triangulation, however, the reconstruction seems to be wrong. The X and Y values seem correct, but the Z values are all about the same and do not vary along the cube. The 3D points look as if they were flattened along the Z-axis.
What is going wrong in the Z values? Do the points need to be normalized or converted from image coordinates at any point, say before the fundamental matrix is computed? (If this is too vague I can explain my general process or elaborate on parts.)
Update
Given:
x1 = P1 * X and x2 = P2 * X
x1, x2 being the first and second image points and X being the 3d point.
However, I have found that x1 is not close to the actual hand picked value but x2 is in fact close.
How I compute projection matrices:
P1 = [eye(3), zeros(3,1)];
P2 = K * [R, t];
Update II
Calibration results after optimization (with uncertainties)
% Focal Length: fc = [ 699.13458 701.11196 ] ± [ 1.05092 1.08272 ]
% Principal point: cc = [ 393.51797 304.05914 ] ± [ 1.61832 1.27604 ]
% Skew: alpha_c = [ 0.00180 ] ± [ 0.00042 ] => angle of pixel axes = 89.89661 ± 0.02379 degrees
% Distortion: kc = [ 0.05867 -0.28214 0.00131 0.00244 0.35651 ] ± [ 0.01228 0.09805 0.00060 0.00083 0.22340 ]
% Pixel error: err = [ 0.19975 0.23023 ]
%
% Note: The numerical errors are approximately three times the standard
% deviations (for reference).
K =
699.1346 1.2584 393.5180
0 701.1120 304.0591
0 0 1.0000
E =
0.3692 -0.8351 -4.0017
0.3881 -1.6743 -6.5774
4.5508 6.3663 0.2764
R =
-0.9852 0.0712 -0.1561
-0.0967 -0.9820 0.1624
0.1417 -0.1751 -0.9743
t =
0.7942
-0.5761
0.1935
P1 =
1 0 0 0
0 1 0 0
0 0 1 0
P2 =
-633.1409 -20.3941 -492.3047 630.6410
-24.6964 -741.7198 -182.3506 -345.0670
0.1417 -0.1751 -0.9743 0.1935
C1 =
0
0
0
1
C2 =
0.6993
-0.5883
0.4060
1.0000
% new points using cpselect
%x1
input_points =
422.7500 260.2500
384.2500 238.7500
339.7500 211.7500
298.7500 186.7500
452.7500 236.2500
412.2500 214.2500
368.7500 191.2500
329.7500 165.2500
482.7500 210.2500
443.2500 189.2500
402.2500 166.2500
362.7500 143.2500
510.7500 186.7500
466.7500 165.7500
425.7500 144.2500
392.2500 125.7500
403.2500 369.7500
367.7500 345.2500
330.2500 319.7500
296.2500 297.7500
406.7500 341.2500
365.7500 316.2500
331.2500 293.2500
295.2500 270.2500
414.2500 306.7500
370.2500 281.2500
333.2500 257.7500
296.7500 232.7500
434.7500 341.2500
441.7500 312.7500
446.2500 282.2500
462.7500 311.2500
466.7500 286.2500
475.2500 252.2500
481.7500 292.7500
490.2500 262.7500
498.2500 232.7500
%x2
base_points =
393.2500 311.7500
358.7500 282.7500
319.7500 249.2500
284.2500 216.2500
431.7500 285.2500
395.7500 256.2500
356.7500 223.7500
320.2500 194.2500
474.7500 254.7500
437.7500 226.2500
398.7500 197.2500
362.7500 168.7500
511.2500 227.7500
471.2500 196.7500
432.7500 169.7500
400.2500 145.7500
388.2500 404.2500
357.2500 373.2500
326.7500 343.2500
297.2500 318.7500
387.7500 381.7500
356.2500 351.7500
323.2500 321.7500
291.7500 292.7500
390.7500 352.7500
357.2500 323.2500
320.2500 291.2500
287.2500 258.7500
427.7500 376.7500
429.7500 351.7500
431.7500 324.2500
462.7500 345.7500
463.7500 325.2500
470.7500 295.2500
491.7500 325.2500
497.7500 298.2500
504.7500 270.2500
Update III
See the answer for corrections; the results computed above used the wrong variables/values.
Note: all references are to Multiple View Geometry in Computer Vision by Hartley and Zisserman.
OK, so there were a couple bugs:
When computing the essential matrix (pp. 257-259), the author mentions that the correct R,t pair out of the set of four (Result 9.19) is the one for which the 3D points lie in front of both cameras (Fig. 9.12a), but doesn't mention how to compute this. By chance I was re-reading chapter 6 and discovered that section 6.2.3 (p. 162) discusses the depth of points, and Result 6.1 is the equation that needs to be applied to get the correct R and t.
In my implementation of the optimal triangulation method (Algorithm 12.1, p. 318), in step 2 I had T2^-1' * F * T1^-1 where I needed (T2^-1)' * F * T1^-1. The former applies the transpose to the exponent, so it merely inverts T2, while the latter, which is what I wanted, transposes the inverted T2 matrix (foiled again by MATLAB!).
Finally, I wasn't computing P1 correctly; it should have been P1 = K * [eye(3), zeros(3,1)];. I forgot to multiply by the calibration matrix K.
Hope this helps future passersby!
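The corrected setup can be sketched end to end with synthetic data (NumPy, names are mine; this uses plain linear DLT triangulation rather than the optimal method, just to show the P1 = K*[I|0], P2 = K*[R|t] convention):

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one correspondence; returns the 3D point."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                     # null vector of A, in homogeneous coords
    return X[:3] / X[3]

K = np.array([[700.0, 0.0, 400.0], [0.0, 700.0, 300.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-1.0, 0.0, 0.0])                       # camera 2 shifted along x
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])    # K * [I | 0]  (the fix!)
P2 = K @ np.hstack([R, t.reshape(3, 1)])             # K * [R | t]

X = np.array([0.2, -0.1, 5.0])                       # ground-truth 3D point
x1 = P1 @ np.append(X, 1); x1 = x1[:2] / x1[2]       # project into image 1
x2 = P2 @ np.append(X, 1); x2 = x2[:2] / x2[2]       # project into image 2
print(triangulate(P1, P2, x1, x2))                   # recovers X
```

With noise-free synthetic points the reconstruction is exact; omitting K from P1, as in the original bug, breaks this round trip.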
It may be that your points are in a degenerate configuration. Try to add a couple of points from the scene that don't belong to the cube and see how it goes.
More information required:
What is t? The baseline might be too small for parallax.
What is the disparity between x1 and x2?
Are you confident about the accuracy of the calibration (I'm assuming you used the Stereo part of the Bouguet Toolbox)?
When you say the correspondences are hand-picked, do you mean you selected the corresponding points on the images, or did you use an interest point detector on the two images and then set the correspondences?
I'm sure we can resolve this problem :)