Get camera matrix from object transformation matrix - graphics

I have a transformation matrix of a 3D object in world space with a fixed camera position. I would like to derive the camera matrix (position, look-up vector, right vector) for the case where the object isn't transformed and the camera is transformed instead. How could I compute that? I hope my question makes sense.

Premise. Let's represent coordinate transformations with 4x4 matrices. Specifically, the 4x4 matrix Qab representing the coordinate transform of frame a from frame b is such that:
Its 3x3 upper-left submatrix is the rotation matrix Rab, i.e. the 3x3 orthonormal matrix whose columns are, in order, the components of the x_b, y_b, z_b unit vectors of frame b, decomposed in frame a.
Its 3x1 upper-right submatrix is the translation vector t_ab from the origin of frame a to the origin of frame b, decomposed in frame a.
Its 4th row is [0, 0, 0, 1].
If p is a point whose coordinates in frame b are p_b= [px_b, py_b, pz_b, 1], then the coordinates p_a= [px_a, py_a, pz_a, 1] of the same point in frame a are given by p_a.T=Qab * p_b.T, where x.T means the transpose of vector x. Note that we append a 1 as a dummy fourth coordinate in order to be able to multiply 3D points by 4x4 matrices.
Now, to your question. Let Qcw be the 4x4 matrix representing the rotation and translation of the camera from the world reference frame, and Qow the analogous transformation of the object from the world.
Then your answer is the camera-from-object transform Qco. We can compute it by noting that we can go from frame o to frame c by first going from o to w, and then from w to c. Therefore it is Qco=Qcw * Qwo, where Qwo=inv(Qow) is the inverse of Qow, and represents the world frame as seen from the object.
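The composition above can be sketched in plain Python. This is only an illustration under assumptions: the matrices are rigid (rotation plus translation, as in the premise), the helper names mat_mul and rigid_inverse are made up for this example, and the numeric values of Qcw and Qow are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rigid_inverse(q):
    """Invert a rigid transform [R t; 0 1] as [R^T, -R^T t; 0 1].

    Only valid when the upper-left 3x3 block is orthonormal (no scale).
    """
    rt = [[q[j][i] for j in range(3)] for i in range(3)]  # R transposed
    t = [q[i][3] for i in range(3)]
    ti = [-sum(rt[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [rt[0] + [ti[0]],
            rt[1] + [ti[1]],
            rt[2] + [ti[2]],
            [0.0, 0.0, 0.0, 1.0]]


# Hypothetical values: the camera frame coincides with the world frame,
# and the object-from-world transform is a 90 degree rotation about z
# plus a translation.
Qcw = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]
Qow = [[0.0, -1.0, 0.0, 2.0],
       [1.0, 0.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 5.0],
       [0.0, 0.0, 0.0, 1.0]]

Qwo = rigid_inverse(Qow)   # world as seen from the object
Qco = mat_mul(Qcw, Qwo)    # camera-from-object transform
Qoc = rigid_inverse(Qco)   # object-from-camera: camera pose in the object frame
```

Following the premise, the first three columns of Qoc hold the camera's x, y, z axes decomposed in the object frame, and its last column holds the camera origin in the object frame. Which of those axes is "right" or "up" depends on the camera convention in use.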

Related

Scale and translate path from one coordinate system to another

I have a path output as shown in the image below, in a coordinate system 1 wherein the start point and the end point are (40,40) and (10,20) respectively.
I want to scale this path to a new coordinate system (coordinate system 2) with a known start and end point, the path has to scale and adjust between the new points.
I believe Affine transforms might help / linear algebra.
How do I achieve this ? and will this be accurate or will it distort ?
To find an appropriate affine transformation (there are many ways to map two points onto two other ones, but we choose the simplest), you can apply these elementary steps:
Shift coordinates by (-startx, -starty)
Scale along X-axis with coefficient (newendx-newstartx)/(endx-startx) (here -80/3)
Scale along Y-axis with coefficient (newendy-newstarty)/(endy-starty) (here -35)
Shift coordinates by (newstartx, newstarty)
The resulting affine transformation is the product of these four matrices.
Using Wolfram Alpha to get the matrix:
M == {{c, 0, 0},
{0, d, 0},
{a*c + e, b*d + f, 1}}
where a,b,c,d,e,f are the values from the description above (a = -startx and so on)
Now transform the coordinates by multiplying the point coordinates by matrix M
(x, y, 1) * M = (newx, newy, 1)
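As a rough sketch, the four steps and the matrix M can be written in Python as follows. The function names two_point_affine and apply are invented for this example, and the target points (0, 0) and (60, 10) are hypothetical, since the question only shows them in an image.

```python
def two_point_affine(start, end, new_start, new_end):
    """Build the row-vector matrix M with (x, y, 1) * M = (newx, newy, 1)."""
    a, b = -start[0], -start[1]                             # first shift
    c = (new_end[0] - new_start[0]) / (end[0] - start[0])   # X scale
    d = (new_end[1] - new_start[1]) / (end[1] - start[1])   # Y scale
    e, f = new_start                                        # final shift
    return [[c, 0.0, 0.0],
            [0.0, d, 0.0],
            [a * c + e, b * d + f, 1.0]]


def apply(point, m):
    """Multiply the row vector (x, y, 1) by the 3x3 matrix m."""
    x, y = point
    return (x * m[0][0] + y * m[1][0] + m[2][0],
            x * m[0][1] + y * m[1][1] + m[2][1])


# Start and end points from the question; target points are hypothetical.
M = two_point_affine((40, 40), (10, 20), (0, 0), (60, 10))
new_start = apply((40, 40), M)
new_end = apply((10, 20), M)
```

Regarding distortion: the endpoints map exactly, but because X and Y are scaled independently the path keeps its shape only when both scale coefficients are equal. The construction also breaks down when the start and end points share an x or y coordinate, since one of the divisions is then by zero.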

What's the difference between using modelViewmatrix directly and using normalMatrix instead? [duplicate]

I am working on some shaders, and I need to transform normals.
I read in a few tutorials that the way you transform normals is to multiply them by the transpose of the inverse of the modelview matrix. But I can't find an explanation of why that is so. What is the logic behind it?
It flows from the definition of a normal.
Suppose you have the normal, N, and a vector, V, a tangent vector at the same position on the object as the normal. Then by definition N·V = 0.
Tangent vectors run in the same direction as the surface of an object. So if your surface is planar then the tangent is the difference between two identifiable points on the object. So if V = Q - R where Q and R are points on the surface then if you transform the object by B:
V' = BQ - BR
= B(Q - R)
= BV
The same logic applies for non-planar surfaces by considering limits.
In this case suppose you intend to transform the model by the matrix B. So B will be applied to the geometry. Then to figure out what to do to the normals you need to solve for the matrix, A so that:
(AN)·(BV) = 0
Turning that into a row versus column thing to eliminate the explicit dot product:
[transpose(AN)](BV) = 0
Pull the transpose outside, eliminate the brackets:
transpose(N)*transpose(A)*B*V = 0
So that's "the transpose of the normal" [product with] "the transpose of the transformation we're solving for" [product with] "the known transformation matrix" [product with] "the vector on the surface of the model" = 0
But we started by stating that transpose(N)*V = 0, since that's the same as saying that N·V = 0. So to satisfy our constraints we need the middle part of the expression — transpose(A)*B — to go away.
Hence we can conclude that:
transpose(A)*B = identity
=> transpose(A) = identity*inverse(B)
=> transpose(A) = inverse(B)
=> A = transpose(inverse(B))
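A small numeric check of this result, as a sketch only: the helper names are invented for the example, and B, N and V are hypothetical values chosen so that the arithmetic is easy to follow (a scale by 2 along x, and a normal and tangent of a 45 degree surface).

```python
def inverse3(m):
    """Inverse of a 3x3 matrix via the cofactor (adjugate) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def transpose3(m):
    return [list(col) for col in zip(*m)]


def mat_vec(m, v):
    return [sum(m[r][k] * v[k] for k in range(3)) for r in range(3)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


B = [[2.0, 0.0, 0.0],      # non-uniform scale: stretch x by 2
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
N = [1.0, 1.0, 0.0]        # normal (not normalized, which is fine here)
V = [1.0, -1.0, 0.0]       # tangent, N.V = 0

A = transpose3(inverse3(B))

naive = dot(mat_vec(B, N), mat_vec(B, V))    # normal transformed by B itself
correct = dot(mat_vec(A, N), mat_vec(B, V))  # normal transformed by A
```

With these values the naive product is (2, 1, 0)·(2, -1, 0) = 3, so the transformed "normal" is no longer perpendicular to the surface, while A keeps the dot product at zero.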
My favorite proof is below where N is the normal and V is a tangent vector. Since they are perpendicular their dot product is zero. M is any 3x3 invertible transformation (M^-1 * M = I). N' and V' are the vectors transformed by M.
To get some intuition, consider the shear transformation below.
Note that this does not apply to tangent vectors.
Take a look at this tutorial:
https://paroj.github.io/gltut/Illumination/Tut09%20Normal%20Transformation.html
You can imagine that when the surface of a sphere stretches (so the sphere is scaled along one axis or something similar) the normals of that surface will all 'bend' towards each other. It turns out you need to invert the scale applied to the normals to achieve this. This is the same as transforming with the Inverse Transpose Matrix. The link above shows how to derive the inverse transpose matrix from this.
Also note that when the scale is uniform, you can simply pass the original matrix as normal matrix. Imagine the same sphere being scaled uniformly along all axes, the surface will not stretch or bend, nor will the normals.
If the model matrix is made of translation, rotation and scale, you don't need to do inverse transpose to calculate normal matrix. Simply divide the normal by squared scale and multiply by model matrix and we are done. You can extend that to any matrix with perpendicular axes, just calculate squared scale for each axes of the matrix you are using instead.
I wrote the details in my blog: https://lxjk.github.io/2017/10/01/Stop-Using-Normal-Matrix.html
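A sketch of the squared-scale idea, assuming a 3x3 matrix whose columns are mutually perpendicular (rotation times per-axis scale). The function name normal_by_squared_scale and the matrix values are made up for this illustration and are not taken from the linked post.

```python
def normal_by_squared_scale(m3, n):
    """Transform normal n by a 3x3 matrix m3 with perpendicular columns."""
    # The squared length of each column is the squared scale of that axis.
    s2 = [sum(m3[r][c] ** 2 for r in range(3)) for c in range(3)]
    scaled = [n[c] / s2[c] for c in range(3)]
    # Ordinary matrix * vector product with the pre-divided normal.
    return [sum(m3[r][c] * scaled[c] for c in range(3)) for r in range(3)]


# Hypothetical model matrix: 90 degree rotation about z times scale (2, 3, 1).
M3 = [[0.0, -3.0, 0.0],
      [2.0, 0.0, 0.0],
      [0.0, 0.0, 1.0]]

n_out = normal_by_squared_scale(M3, [1.0, 1.0, 0.0])
```

For this matrix the result equals transpose(inverse(M3)) applied to the same normal, namely (-1/3, 0.5, 0). The output is not unit length, so it would still need normalizing before lighting calculations.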
I don't understand why you don't just zero out the 4th element of the direction vector before multiplying by the model matrix. No inverse or transpose is needed. Think of the direction vector as the difference between two points. Move the two points with the rest of the model: they are still in the same relative position to the model. Take the difference between the two points to get the new direction, and the 4th element cancels out to zero. It is a lot cheaper. Note that this argument describes a tangent-like direction; it matches the normal only while the matrix is limited to rotation, translation and uniform scale.

Converting a series of depth maps and x, y, z, theta values into a 3D model

I have a quadrotor which flies around and knows its x, y, z positions and angular displacement along the x, y, z axis. It captures a constant stream of images which are converted into depth maps (we can estimate the distance between each pixel and the camera).
How can one program an algorithm which converts this information into a 3D model of the environment? That is, how can we generate a virtual 3D map from this information?
Example: below is a picture that illustrates what the quadrotor captures (top) and what the image is converted into to feed into a 3D mapping algorithm (bottom)
Let's suppose this image was taken from a camera with x, y, z coordinates (10, 5, 1) in some units and angular displacement of 90, 0, 0 degrees about the x, y, z axes. What I want to do is take a bunch of these photo-coordinate tuples and convert them into a single 3D map of the area.
Edit 1 on 7/30: One obvious solution is to use the angle of the quadrotor wrt to x, y, and z axes with the distance map to figure out the Cartesian coordinates of any obstructions with trig. I figure I could probably write an algorithm which uses this approach with a probabilistic method to make a crude 3D map, possibly vectorizing it to make it faster.
However, I would like to know if there is any fundamentally different and hopefully faster approach to solving this?
Simply convert your data to Cartesian and store the result ... As you have known topology (spatial relation between data points) of the input data then this can be done to map directly to mesh/surface instead of to PCL (which would require triangulation or convex hull etc ...).
Your images suggest you have known topology (neighboring pixels are neighboring also in 3D ...) so you can construct mesh 3D surface directly:
align both RGB and Depth 2D maps.
In case this is not already done see:
Align already captured rgb and depth images
convert to Cartesian coordinate system.
First we compute the position of each pixel in camera local space:
so for each pixel (x,y) in the RGB map we look up the Depth distance to the camera focal point and compute the 3D position relative to the camera focal point. For that we can use triangle similarity so:
x = camera_focus.x + (pixel.x-camera_focus.x)*depth(pixel.x,pixel.y)/focal_length
y = camera_focus.y + (pixel.y-camera_focus.y)*depth(pixel.x,pixel.y)/focal_length
z = camera_focus.z + depth(pixel.x,pixel.y)
where pixel is the pixel 2D position, depth(x,y) is the corresponding depth, and focal_length=znear is the fixed camera parameter (determining FOV). The camera_focus is the camera focal point position. It is usual that the camera focal point is in the middle of the camera image and at distance znear from the image (projection plane).
As this is taken from a moving device you need to convert this into some global coordinate system (using your camera position and orientation in space). The best tool for that is:
Understanding 4x4 homogenous transform matrices
construct mesh
as your input data are already spatially sorted we can construct a QUAD grid directly. Simply for each pixel take its neighbors and form QUADS. So if a 2D position in your data (x,y) is converted into 3D (x,y,z) with the approach described in the previous bullet, we can write it in the form of a function that returns the 3D position:
(x,y,z) = 3D(x,y)
Then I can form QUADS like this:
QUAD( 3D(x,y),3D(x+1,y),3D(x+1,y+1),3D(x,y+1) )
we can use for loops:
for (x=0;x<xs-1;x++)
 for (y=0;y<ys-1;y++)
  QUAD( 3D(x,y),3D(x+1,y),3D(x+1,y+1),3D(x,y+1) )
where xs,ys is the resolution of your maps.
In case you do not know the camera properties you can set the focal_length to any reasonable constant (resulting in fish eye effects and/or scaled output) or infer it from the input data like:
Transformation of 3D objects related to vanishing points and horizon line
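The steps above (back-projection, conversion to a global frame, QUAD construction) could be sketched in Python like this. Everything named here is an assumption for illustration: the camera focal point is taken as the origin of camera space, (cx, cy) is the pixel at the image centre, cam_to_world is a 4x4 camera-to-world matrix you would build from the quadrotor pose, and the tiny constant depth map is made-up data.

```python
def pixel_to_camera(px, py, depth, cx, cy, focal_length):
    """Back-project one pixel into camera space using triangle similarity."""
    x = (px - cx) * depth / focal_length
    y = (py - cy) * depth / focal_length
    return (x, y, depth)


def camera_to_world(p, m):
    """Apply a 4x4 camera-to-world matrix (nested lists) to a 3D point."""
    x, y, z = p
    return tuple(m[i][0] * x + m[i][1] * y + m[i][2] * z + m[i][3]
                 for i in range(3))


def depth_map_to_quads(depth, cx, cy, focal_length, cam_to_world):
    """Turn a depth map (rows of depths) into a list of world-space quads."""
    ys, xs = len(depth), len(depth[0])
    pts = [[camera_to_world(
                pixel_to_camera(x, y, depth[y][x], cx, cy, focal_length),
                cam_to_world)
            for x in range(xs)] for y in range(ys)]
    quads = []
    for x in range(xs - 1):
        for y in range(ys - 1):
            quads.append((pts[y][x], pts[y][x + 1],
                          pts[y + 1][x + 1], pts[y + 1][x]))
    return quads


# Made-up 3x3 depth map, camera at (10, 5, 1) with no rotation.
depth = [[2.0, 2.0, 2.0],
         [2.0, 2.0, 2.0],
         [2.0, 2.0, 2.0]]
cam_to_world = [[1.0, 0.0, 0.0, 10.0],
                [0.0, 1.0, 0.0, 5.0],
                [0.0, 0.0, 1.0, 1.0],
                [0.0, 0.0, 0.0, 1.0]]
quads = depth_map_to_quads(depth, 1.0, 1.0, 1.0, cam_to_world)
```

A real pipeline would also need to skip quads that span a depth discontinuity (where neighboring pixels belong to different objects) and merge the meshes from successive frames, which this sketch does not attempt.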

Transforming a 3D plane using a 4x4 matrix

I have a shape made out of several triangles which is positioned somewhere in world space with scale, rotate, translate. I also have a plane on which I would like to project (orthogonal) the shape.
I could multiply every vertex of every triangle in the shape with the objects transformation matrix to find out where it is located in world coordinates, and then project this point onto the plane.
But I don't need to draw the projection, and instead I would like to transform the plane with the inverse transformation matrix of the shape, and then project all the vertices onto the (inverse transformed) plane. Since it only requires me to transform the plane once and not every vertex.
My plane has a normal (xyz) and a distance (d). How do I multiply it with a 4x4 transformation matrix so that it turns out ok?
Can you create a vec4 as xyzd and multiply that? Or maybe create a vector xyz1 and then what to do with d?
You need to convert your plane to a different representation. One where N is the normal, and O is any point on the plane. The normal you already know, it's your (xyz). A point on the plane is also easy, it's your normal N times your distance d.
Transform O by the 4x4 matrix in the normal way, this becomes your new O. You will need a Vector4 to multiply with a 4x4 matrix, set the W component to 1 (x, y, z, 1).
Also transform N by the 4x4 matrix, but set the W component to 0 (x, y, z, 0). Setting the W component to 0 means that your normals won't get translated. If your matrix is composed of more than just translating and rotating, then this step isn't so simple. Instead of multiplying by your transformation matrix, you have to multiply by the transpose of the inverse of the matrix i.e. Matrix4.Transpose(Matrix4.Invert(Transform)), there's a good explanation on why here. In that case the result is generally no longer unit length, so normalize it afterwards.
You now have a new normal vector N and a new position vector O. However I suppose you want it in xyzd form again? No problem. As before, xyz is your (unit-length) normal N; all that's left is to calculate d. d is the distance of the plane from the origin, along the normal vector. Hence, it is simply the dot product of O and N.
There you have it! If you tell me what language you're doing this in, I'd happily type it up in code as well.
EDIT, In pseudocode:
The plane is vector3 xyz and number d, the matrix is a matrix4x4 M
vector4 O = (xyz * d, 1)
vector4 N = (xyz, 0)
O = M * O
N = transpose(invert(M)) * N
xyz = normalize(N.xyz)
d = dot(O.xyz, xyz)
xyz and d represent the new plane
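A runnable Python sketch of this point-plus-normal approach, under a few assumptions: M is affine (last row 0, 0, 0, 1), the plane is stored as dot(normal, x) = d with a unit normal, and only the upper-left 3x3 block is used for the normal because W = 0 removes the translation anyway. The helper names and the example matrix are hypothetical.

```python
import math


def inverse3(m):
    """Inverse of a 3x3 matrix via the cofactor (adjugate) formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]


def transform_plane(normal, d, m):
    """Transform the plane dot(normal, x) = d by the affine 4x4 matrix m."""
    a3 = [row[:3] for row in m[:3]]
    t = [m[i][3] for i in range(3)]
    # A point on the plane, transformed as a position (W = 1).
    o = [c * d for c in normal]
    o = [sum(a3[i][k] * o[k] for k in range(3)) + t[i] for i in range(3)]
    # transpose(inverse(A)) * normal: index the inverse by column.
    inv = inverse3(a3)
    n = [sum(inv[k][i] * normal[k] for k in range(3)) for i in range(3)]
    length = math.sqrt(sum(c * c for c in n))
    n = [c / length for c in n]
    return n, sum(o[i] * n[i] for i in range(3))


# Hypothetical example: plane y = 1, matrix scales y by 2 and shifts it by 3.
M = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 3.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
new_normal, new_d = transform_plane([0.0, 1.0, 0.0], 1.0, M)
```

In this example the plane y = 1 ends up at y = 2*1 + 3 = 5, so the normal stays (0, 1, 0) and the distance becomes 5.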
This question is a bit old but I would like to correct the accepted answer.
You do not need to convert your plane representation.
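Any point X = (x, y, z, 1) lies on the plane p = (a, b, c, d) if
a*x + b*y + c*z + d = 0
It can be written as a dot product:
transpose(p)*X = 0
You are looking for the plane p' transformed by your 4x4 matrix M.
For the same reason, you must have
transpose(p')*M*X = 0
for every such X. So transpose(p')*M = transpose(p), and with some rearrangement
p' = transpose(inverse(M))*p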
TLDR : if p=(a,b,c,d), p' = transpose(inverse(M))*p
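A minimal Python sketch of the TLDR, assuming the plane is stored as p = (a, b, c, d) with a*x + b*y + c*z + d = 0 (so d has the opposite sign to the "distance along the normal" convention). The Gauss-Jordan inverse and the function names are written for this example only, and the matrix is a hypothetical one.

```python
def inverse(m):
    """Gauss-Jordan inverse of a square matrix given as nested lists."""
    n = len(m)
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def transform_plane(p, m):
    """Return p' = transpose(inverse(m)) * p for a plane 4-vector p."""
    inv = inverse(m)
    # Multiplying by the transpose means indexing the inverse by column.
    return [sum(inv[k][i] * p[k] for k in range(4)) for i in range(4)]


# Hypothetical example: plane y = 1, matrix scales y by 2 and shifts it by 3.
M = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 2.0, 0.0, 3.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
p_new = transform_plane([0.0, 1.0, 0.0, -1.0], M)
```

Here p_new comes out as (0, 0.5, 0, -2.5), which is the plane y = 5 up to a scale factor. Dividing all four components by the length of the first three gives back a unit normal if one is needed.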
Notation:
n is a normal represented as a (1x3) row-vector
n' is the transformed normal of n according to transform matrix T
(n|d) is a plane represented as a (1x4) row-vector (with n the plane's normal and d the plane's distance to the origin)
(n'|d') is the transformed plane of (n|d) according to transform matrix T
T is a (4x4) (affine) column-major transformation matrix (i.e. transforming a column-vector t is defined as t' = T t).
Transforming a normal n:
n' = n adj(T)
Transforming a plane (n|d):
(n'|d') = (n|d) adj(T)
Here, adj is the adjugate of a matrix which is defined as follows in terms of the inverse and determinant of a matrix:
T^-1 = adj(T)/det(T)
Note:
The adjugate is generally not equal to the inverse of a transformation matrix T; the two differ by the factor det(T). If T includes a reflection, det(T) is negative (-1 for a pure reflection), reversing the winding order!
Re-normalizing n' is mathematically not required (but maybe numerically depending on the implementation) since scaling is taken care of by the determinant. Thanks to Adrian Leonhard.
You can directly transform the plane without first decomposing and recomposing a plane (normal and point).

Convert 3D(x,y,z) to 2D(x,y) (orthogonal) along its direction

I have gone through all available study resources in the internet as much as possible, which are in form of simple equations, vectors or trigonometric equations.
I couldn't find the way of doing following thing:
Assuming Y is up in a 3D world.
I need to draw two 2D trajectories orthogonally (not the projections) for a 3D trajectory, say XY-plane for side view of the trajectory w.r.t. the trajectory itself and XZ-plane for top view for the same.
I have all the 3D points of the 3D trajectory, initial velocity, both the angles can be calculated by vector mathematics.
How should I proceed further?
refer:
Below is a curve at different angles, which can lose its significance if projected onto the XY-plane. All I want is to convert the red curve along itself, the green curve along the green curve, and so on, and further, how would I map the side view to a plane? The top view is comparatively easy and is done just by taking the X and Z coordinates of each point.
I mean this is the requirement. :)
I don't think I understand the question, but I'll answer my interpretation anyway.
You have a 3D trajectory described by a sequence of points p0, ..., pN. We are given an angle v for a plane P parallel to the Y-axis, and wish to compute the 2D coordinates (di, hi) of the points pi projected onto that plane, where hi is the height coordinate in the direction Y and di is the distance coordinate in the direction v. Assume p0 = (0, 0, 0) or else subtract p0 from all vectors.
Let pi = (xi, yi, zi). The height coordinate is hi = yi. Assume the angle v is given relative to the Z-axis. The vector for the direction v is then r = (sin(v), 0, cos(v)), and the distance coordinates becomes di = dot(pi, r).
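The projection described above might look like this in Python. The function name side_view and the sample trajectory are hypothetical, and deriving v from the last point with atan2 is an extra assumption of this sketch, not something the answer states.

```python
import math


def side_view(points, v):
    """Return (d_i, h_i) per point; v is in radians, measured from the Z axis."""
    x0, y0, z0 = points[0]
    r = (math.sin(v), 0.0, math.cos(v))   # horizontal direction of the plane
    out = []
    for x, y, z in points:
        # Subtract p0 so the trajectory starts at the origin.
        px, py, pz = x - x0, y - y0, z - z0
        d = px * r[0] + py * r[1] + pz * r[2]   # distance coordinate
        out.append((d, py))                     # height coordinate is y
    return out


# Made-up trajectory heading diagonally between the X and Z axes.
trajectory = [(0.0, 0.0, 0.0), (1.0, 2.0, 1.0), (2.0, 3.0, 2.0)]

# Assumption: take the heading from the horizontal offset of the last point.
last = trajectory[-1]
v = math.atan2(last[0] - trajectory[0][0], last[2] - trajectory[0][2])

flat = side_view(trajectory, v)
```

For this trajectory v is 45 degrees, so the second point lands at distance sqrt(2) and height 2 in the side view.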
