Vulkan right-handed coordinate system becomes left-handed - graphics

Problem:
Vulkan's right-handed coordinate system becomes a left-handed coordinate system after applying the projection matrix. How can I make it consistent with the Vulkan coordinate system?
Details:
I know that Vulkan uses a right-handed coordinate system where
X+ points to the right
Y+ points down
Z+ points into the screen
I have this line in the vertex shader: https://github.com/AndreaCatania/HelloVulkan/blob/master/shaders/shader.vert#L23
gl_Position = scene.cameraProjection * scene.cameraView * meshUBO.model * vec4(vertexPosition, 1.0);
At this point: https://github.com/AndreaCatania/HelloVulkan/blob/master/main.cpp#L62-L68 I'm defining the position of the camera at the center of the scene and the position of the box at (4, 4, -10) in world space.
The result is this:
As you can see in the picture above, I'm getting Z- pointing inside the screen, but it should be positive.
Is this expected and do I need to add something more, or did I do something wrong?
Useful part of code:
Projection calculation: https://github.com/AndreaCatania/HelloVulkan/blob/master/VisualServer.cpp#L88-L98
void Camera::reloadProjection() {
    projection = glm::perspectiveRH_ZO(FOV, aspect, near, far);
    isProjectionDirty = false;
}
Camera UBO fill: https://github.com/AndreaCatania/HelloVulkan/blob/master/VisualServer.cpp#L403-L414
SceneUniformBufferObject sceneUBO = {};
sceneUBO.cameraView = camera.transform;
sceneUBO.cameraProjection = camera.getProjection();

I do not use or know Vulkan, but the perspective projection matrix (at least in OpenGL) is looking in the Z- direction, which inverts one axis of your coordinate system. That inverts the winding rule of the coordinate system.
If you want to preserve the original winding, then just invert the Z axis vector in the matrix. For more info see:
Understanding 4x4 homogenous transform matrices
So just scale the Z axis by -1, either by some analogy to glScale(1.0,1.0,-1.0); or by direct matrix cell access.
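As a sketch of that idea using GLM (the helper name flipZ is illustrative, not from the original answer), the extra scale can be post-multiplied onto the projection:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Post-multiply the projection by a (1, 1, -1) scale, the GLM analogue of
// glScale(1.0, 1.0, -1.0): the Z axis is mirrored before the projection acts.
glm::mat4 flipZ(const glm::mat4& projection)
{
    return glm::scale(projection, glm::vec3(1.0f, 1.0f, -1.0f));
}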

All of the OpenGL left-handed vs Vulkan right-handed coordinate-system difference happens in NDC space, after the vertex shader; it means your view matrix doesn't care.
If you are using GLM, everything you do in world space or view space is done in a right-handed coordinate system.
GLM, a very popular math library that every beginner uses, uses a right-handed coordinate system by default.
Your view matrix must be set accordingly: the only way to get a right-handed system with x from left to right and y from bottom to top is to set your z looking direction toward the negative values. If you don't provide a right-handed system to your glm::lookAt call, GLM will convert it, with one of your axes getting flipped via a series of glm::cross calls; see the GLM source code.
the proper way:
glm::vec3 eye = glm::vec3(0, 0, 10);
glm::vec3 center = glm::vec3(0, 0, 0);
glm::vec3 up = glm::vec3(0, 1, 0);
// looking in the negative z direction
glm::mat4 viewMat = glm::lookAt(eye, center, up);
Personally I store all the information for the coordinate-system conversion in the projection matrix, because by default GLM does it for you for the z coordinate.
from songho: http://www.songho.ca/opengl/gl_projectionmatrix.html
Note that the eye coordinates are defined in the right-handed coordinate system, but NDC uses the left-handed coordinate system. That is, the camera at the origin is looking along -Z axis in eye space, but it is looking along +Z axis in NDC. Since glFrustum() accepts only positive values of near and far distances, we need to NEGATE them during the construction of GL_PROJECTION matrix.
Because we are looking in the negative z direction, GLM by default negates the sign.
It turns out that the y coordinate is flipped between Vulkan and OpenGL, so everything will get turned upside down. One way to resolve the problem is to negate the y values as well:
glm::mat4 projection = glm::perspective(glm::radians(verticalFov), screenDimension.x / screenDimension.y, near, far);
// Vulkan NDC space points downward by default everything will get flipped
projection[1][1] *= -1.0f;
If you follow the above steps you should end up with something very similar to old OpenGL applications, and with the up vector of your camera having the same sign as most 3D models.
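Putting the two snippets together, a minimal GLM camera setup for Vulkan could look like the following sketch (the function names and the fov/aspect parameters are placeholders, not from the original answers):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Right-handed view matrix, camera at (0,0,10) looking down the negative z axis.
glm::mat4 makeView()
{
    glm::vec3 eye    = glm::vec3(0, 0, 10);
    glm::vec3 center = glm::vec3(0, 0, 0);
    glm::vec3 up     = glm::vec3(0, 1, 0);
    return glm::lookAt(eye, center, up);
}

// Projection with the y flip applied for Vulkan's downward-pointing NDC y axis.
glm::mat4 makeProjection(float verticalFovDeg, float aspect, float zNear, float zFar)
{
    glm::mat4 projection = glm::perspective(glm::radians(verticalFovDeg), aspect, zNear, zFar);
    projection[1][1] *= -1.0f;
    return projection;
}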

Related

How do you determine if a list of points in 3D space is in clockwise order?

point[0] = (0,1,1)
point[1] = (1,1,1)
point[2] = (0,0,1)
point[3] = (1,0,1)
For examples below, each point above maps to an index in the visualization below.
0----------1
|          |
|          |
|          |
3----------2
You can't.
If the points are not coplanar, it is even impossible to define an orientation.
If the points are coplanar, you can look at their plane from both sides.
If you want this information with respect to an observer, project the vertices to the viewing plane (to reduce to 2D) and compute the algebraic area by the shoelace formula. The sign tells you the orientation.
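As an illustration of that last step (not part of the original answer), a shoelace-formula signed area over already-projected 2D points could be computed like this:

#include <cstddef>
#include <vector>

struct Point2 { double x, y; };   // a point already projected onto the viewing plane

// Shoelace formula: signed (algebraic) area of a simple polygon.
// With x pointing right and y pointing up, a positive result means the points
// are in counter-clockwise order as seen by the observer, negative means clockwise.
double signedArea(const std::vector<Point2>& poly)
{
    double area = 0.0;
    const std::size_t n = poly.size();
    for (std::size_t i = 0; i < n; ++i)
    {
        const Point2& a = poly[i];
        const Point2& b = poly[(i + 1) % n];
        area += a.x * b.y - b.x * a.y;
    }
    return 0.5 * area;
}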
You can, but only with respect to some direction ...
Taking your example: if you are looking at it as is, it's CW; however, if you look at it from behind, it's CCW ... if you look from the sides (perpendicularly, so the face is projected to a line) we cannot tell.
So the usual approach is to do a cross product of the edge vectors. This will give you the normal vector of the face, but its direction is determined by the CW/CCW order. Now compare the result to a reference direction with a dot product. So:
vec3 p0,p1,p2;  // 3 vertices of your face, not on a single line
vec3 dir;       // reference direction
float winding = dot( cross( p1-p0 , p2-p1 ) , dir );
Now the sign of winding tells you if the face is CW or CCW with respect to dir. Which one it is depends on your notation. However, this works only for convex polygons (or in the convex part of concave ones)!
In computer graphics the reference direction is usually the camera view direction. So once in the camera's local coordinate system the direction is the z axis, and inspecting the z coordinate of the cross product is enough. This is known as face culling (skipping polygons with the wrong winding; in GL it is set by GL_CULL_FACE)...
You can look at the reference dir as an axis of rotation around which you are determining if the points are CW or CCW ...
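For example (a small GLM-based sketch, not part of the original answer), the test applied to three of the question's vertices against an assumed view direction along +z:

#include <cstdio>
#include <glm/glm.hpp>

int main()
{
    glm::vec3 p0(0, 1, 1), p1(1, 1, 1), p2(0, 0, 1); // three vertices of the question's quad
    glm::vec3 dir(0, 0, 1);                          // assumed reference (view) direction

    float winding = glm::dot(glm::cross(p1 - p0, p2 - p1), dir);

    // Which sign means CW depends on your notation, as noted above.
    std::printf("winding = %f\n", winding);
    return 0;
}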

Homogeneous coordinates and perspective-correctness?

Does the technique that Vulkan uses (and I assume other graphics libraries too) to interpolate vertex attributes in a perspective-correct manner require that the vertex shader normalize the homogeneous camera-space vertex position (i.e. divide through by the w-coordinate such that the w-coordinate is 1.0) prior to multiplication by a typical projection matrix of the form...
g/s  0  0        0
0    g  0        n
0    0  f/(f-n)  -nf/(f-n)
0    0  1        0
...in order for perspective-correctness to work properly?
Or, will perspective-correctness continue to work on any homogeneous vertex position in camera-space (with a w-coordinate other than 1.0)?
(I didn't completely follow the perspective-correctness math, so it is unclear to me which is the case.)
Update:
In order to clarify terminology:
vec4 modelCoordinates = vec4(x_in, y_in, z_in, 1);
mat4 modelToWorld = ...;
vec4 worldCoordinates = modelToWorld * modelCoordinates;
mat4 worldToCamera = ...;
vec4 cameraCoordinates = worldToCamera * worldCoordinates;
mat4 cameraToProjection = ...;
vec4 clipCoordinates = cameraToProjection * cameraCoordinates;
output(clipCoordinates);
cameraToProjection is a matrix like the one shown in the question
The question is: does cameraCoordinates.w have to be 1.0?
And consequently, does the last row of both the modelToWorld and worldToCamera matrices have to be 0 0 0 1?
You have this exactly backwards. Doing the perspective divide in the shader is what prevents perspective-correct interpolation. The rasterizer needs the perspective information provided by the W component to do its job. With a W of 1, the interpolation is done in window space, without any regard to perspective.
Provide a clip-space coordinate to the output of your vertex processing stage, and let the system do what it exists to do.
the vertex shader must normalize the homogenous camera-space vertex position (ie: divide through by the w-coordinate such that the w-coordinate is 1.0) prior to multiplication by a typical projection matrix of the form...
If your camera-space vertex position does not have a W of 1.0, then one of two things has happened:
You are deliberately operating in a post-projection world space or some similar construct. This is a perfectly valid thing to do, and the math for a camera space can be perfectly reasonable.
Your code is broken somewhere. That is, you intend for your world and camera space to be a normal, Euclidean, non-homogeneous space, but somehow the math didn't work out. Obviously, this is not a perfectly valid thing to do.
In both cases, dividing by W is the wrong thing to do. If your world space that you're placing a camera into is post-projection (such as in this example), dividing by W will break your perspective-correct interpolation, as outlined above. If your code is broken, dividing by W will merely mask the actual problem; better to fix your code than to hide the bug, as it may crop up elsewhere.
To see whether or not the camera coordinates need to be in normal form, let's represent the camera coordinates as multiples of w, so they are (wx,wy,wz,w).
Multiplying through by the given projection matrix, we get the clip coordinates (wxg/s, wyg, fwz/(f-n) - nfw/(f-n), wz).
Calculating the x-y framebuffer coordinates as per the fixed Vulkan formula, we get (P_x * gx/(sz) + O_x, P_y * gy/z + O_y). Notice this does not depend on w, so the position in the framebuffer of a polygon's vertices doesn't require the camera coordinates to be in normal form.
Likewise, the calculation of the barycentric coordinates of fragments within a polygon only depends on x, y in framebuffer coordinates, and so is also independent of w.
However, perspective-correct interpolation of fragment attributes does depend on the W_clip of the vertices, as this is used in the formula given in the Vulkan spec. As shown above, W_clip is wz, which does depend on w and scales with it, so we can conclude that the camera coordinates must be in normal form (their w must be 1.0).
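For reference, the perspective-correct interpolation formula mentioned above (as given in the Vulkan and OpenGL specifications) weights each vertex attribute by the reciprocal of its clip-space w:

a_{interp} = \frac{\lambda_a \, a_a / w_a + \lambda_b \, a_b / w_b + \lambda_c \, a_c / w_c}
                  {\lambda_a / w_a + \lambda_b / w_b + \lambda_c / w_c}

Here \lambda_a, \lambda_b, \lambda_c are the barycentric weights of the fragment and w_a, w_b, w_c are the clip-space w values of the triangle's vertices, which is why W_clip matters.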

Why does the projection of an image over 3d points show this distortion?

I have a question regarding the projection of an image over a set of 3D points. The image is given to me as a JPG, together with position and attitude information of the camera relative to a cartesian coordinate system (Xc,Yc,Zc and yaw, pitch, roll), as well as the horizontal and vertical field of view (in degrees).
Points are given using solely their 3d position in the same coordinate system (Xp,Yp,Zp).
In my coordinate system, Z is up. To project the image onto the points, I
compute the vector from camera to each point
Vector3 c2p = (Xp,Yp,Zp)-(Xc,Yc,Zc);
rotate c2p according to my camera's attitude (quaternion):
Vector3 c2pCamFrame = getCamQuaternion().conjugate().rotate(c2p);
compute azimuth and elevation from the camera's "center ray" to the point:
float azimuth = atan2(c2pCamFrame.x(), c2pCamFrame.y());
float elevation = atan2(c2pCamFrame.z(),sqrt(pow(c2pCamFrame.x(),2)+pow(c2pCamFrame.y(),2)));
if azimuth and elevation are within the field of view, I assign the color of the corresponding pixel to the point.
This works almost perfectly, and the "almost" motivates my question. Let me show you:
I cannot figure out why the elevation of the projection is distorted. In the bottom right of the image, you can see that points outside the frustum (exceeding the elevation) actually become colored. This distortion is zero at an azimuth of 0 degrees and peaks at the left and right edges of the image, creating the pillow (pincushion) distortion.
Why does this distortion appear? I'd love to understand this problem both in geometrical as well as mathematical terms. Thank you!
The field of view angles are only valid on the principal axes. But you can do it the other way around, i.e. calculate the x/y bounds from the angles (with the angles in radians):
maxX = tan(horizontal_fov / 2)
maxY = tan(vertical_fov / 2)
And check
if(abs(c2pCamFrame.x() / c2pCamFrame.z()) <= maxX
&& abs(c2pCamFrame.y() / c2pCamFrame.z()) <= maxY)
Additionally, you might want to check if the points are in front of the camera:
... && c2pCamFrame.z() > 0
This assumes a left-handed coordinate system.
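A self-contained sketch of that test (the Vec3 type and the function name are placeholders; the answer's convention of z pointing forward is assumed, and the full field-of-view angles are taken in radians):

#include <cmath>

struct Vec3 { double x, y, z; };   // placeholder vector type for the camera-frame point

// True if a camera-frame point lies inside the frustum defined by the two FOV angles.
bool insideFrustum(const Vec3& p, double horizontalFov, double verticalFov)
{
    if (p.z <= 0.0) return false;                       // behind the camera
    const double maxX = std::tan(horizontalFov / 2.0);
    const double maxY = std::tan(verticalFov / 2.0);
    return std::fabs(p.x / p.z) <= maxX
        && std::fabs(p.y / p.z) <= maxY;
}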

Graphics: Creating a 3D cylinder

I have a problem with creating 3D cylinders (without OpenGL). I understand that a mesh is used to create the cylinder surface and triangle fans are used to create the top and bottom caps. I have already implemented the mesh but not the planar triangle fans, so currently my 3D object looks like a cylinder without the bottom and top cap.
I believe this is what I need to do in order to create the bottom and top caps. First, find the center point of the cylinder mesh. Second, find the vertices of the mesh. Third, using the center point and the 2 vertex points, create the triangle. Fourth, repeat the steps until a planar circle is created.
Are the above steps a sufficient way of creating the caps or is there a better way? And how do I find the vertices of the mesh so I can create the triangle fans?
First some notes:
you did not specify your platform
gfx interface
language
not enough info about your cylinder either
is it axis aligned?
what coordinate system (Cartesian/orthogonal/orthonormal)?
need additional dimensions like color or texture coordinates?
So I can provide just generic info then
Axis aligned cylinder
choose the granularity N
number of points along your cap's circle
usually 20-36 is OK but if you need higher precision then sometimes you need even 1000 points or more
all depends on the purpose, zoom, angle and distance of view ...
and performance issues
for now let N=32
you need BR (boundary representation)
you did not specify gfx interface but your text implies BR model (surface polygons)
also no pivot point position so I will choose middle point of cylinder to be (0,0,0)
z axis will be the height of cylinder
and the caps will be coplanar with xy plane
so for a cylinder a set of 2 rings (caps) is enough
so the points can be defined in C++ like this:
const int N=32;           // mesh complexity
double p0[N][3],p1[N][3]; // rings
double a,da,c,s,r,h2;     // some temp variables
int i;
r =50.0;                  // cylinder radius
h2=100.0*0.5;             // half height of cylinder
da=2.0*M_PI/double(N-1);  // angle step so the ring closes on itself
for (a=0.0,i=0;i<N;i++,a+=da)
    {
    c=r*cos(a);
    s=r*sin(a);
    p0[i][0]=c;
    p0[i][1]=s;
    p0[i][2]=+h2;
    p1[i][0]=c;
    p1[i][1]=s;
    p1[i][2]=-h2;
    }
the ring points form a closed loop (p0[0]==p0[N-1])
so you do not need any additional code to handle it...
now how to draw
can't write the code for an unknown API, but
'mesh' is something like QUAD_STRIP I assume
so just add points to it in this order:
QUAD_STRIP = { p0[0],p1[0],p0[1],p1[1],...p0[N-1],p1[N-1] };
if you have an inverted-normal problem then swap p0/p1
now for the fans
you do not need the middle point (unless you have interpolation aliasing issues)
so similar:
TRIANGLE_FAN0 = { p0[0],p0[1],...p0[N-1] };
TRIANGLE_FAN1 = { p1[0],p1[1],...p1[N-1] };
if you still want the middle point then:
TRIANGLE_FAN0 = { (0.0,0.0,+h2),p0[0],p0[1],...p0[N-1] };
TRIANGLE_FAN1 = { (0.0,0.0,-h2),p1[0],p1[1],...p1[N-1] };
if you have an inverted-normal problem then reverse the point order (the middle point stays where it is)
Not axis aligned cylinder?
just use a transform matrix on your p0[],p1[] point lists to translate/rotate them to the desired position
the rest stays the same
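A rough sketch of that last step (using GLM here just for the matrix type; the example transform values are arbitrary, not from the original answer):

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

// Apply one 4x4 transform to both ring point lists built above.
void transformRings(const glm::mat4& m, int n, double p0[][3], double p1[][3])
{
    for (int i = 0; i < n; ++i)
    {
        glm::vec4 a = m * glm::vec4(p0[i][0], p0[i][1], p0[i][2], 1.0);
        glm::vec4 b = m * glm::vec4(p1[i][0], p1[i][1], p1[i][2], 1.0);
        p0[i][0] = a.x; p0[i][1] = a.y; p0[i][2] = a.z;
        p1[i][0] = b.x; p1[i][1] = b.y; p1[i][2] = b.z;
    }
}

// Example: tilt the cylinder 45 degrees around x and move it to (10, 0, 0).
// glm::mat4 m = glm::translate(glm::mat4(1.0f), glm::vec3(10, 0, 0))
//             * glm::rotate(glm::mat4(1.0f), glm::radians(45.0f), glm::vec3(1, 0, 0));
// transformRings(m, N, p0, p1);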

3D Graphics Algorithms (Hardware)

I am trying to design an ASIC graphics processor. I have done extensive research on the topic but I am still kind of fuzzy on how to translate and rotate points. I am using orthographic projection to rasterize the transformed points.
I have been using the following lecture regarding the matrix multiplication (homogenous coordinates)
http://www.cs.kent.edu/~zhao/gpu/lectures/Transformation.pdf
Could someone please explain this a little more in depth to me? I am still somewhat shaky on the algorithm. I am passing a camera (x,y,z) and a camera vector (x,y,z) representing the camera angle, along with a point (x,y,z). What should go where within the matrices to transform the point to the new appropriate location?
Here's the complete transformation algorithm in pseudocode:
void project(Vec3d objPos, Matrix4d modelViewMatrix,
             Matrix4d projMatrix, Rect viewport, Vec3d& winCoords)
{
    Vec4d in(objPos.x, objPos.y, objPos.z, 1.0);
    in = projMatrix * modelViewMatrix * in;
    in /= in.w; // perspective division
    // "in" is now in normalized device coordinates, which are in the range [-1, 1].
    // Map coordinates to range [0, 1]
    in.x = in.x / 2 + 0.5;
    in.y = in.y / 2 + 0.5;
    in.z = in.z / 2 + 0.5;
    // Map to viewport
    winCoords.x = in.x * viewport.w + viewport.x;
    winCoords.y = in.y * viewport.h + viewport.y;
    winCoords.z = in.z;
}
Then rasterize using winCoords.x and winCoords.y.
For an explanation of the stages of this algorithm, see question 9.011 from the OpenGL FAQ.
For the first few years they were for sale, mass-market graphics processors for PC didn't translate or rotate points at all. Are you required to implement this feature? If not, you may wish to let software do it. Depending on your circumstances, software may be the more sensible route.
If you are required to implement the feature, I'll tell you how they did it in the early days.
The hardware has sixteen floating point registers that represent a 4x4 matrix. The application developer loads these registers with the ModelViewProjection matrix just before rendering a mesh of triangles. The ModelViewProjection matrix is:
Model * View * Projection
Where "Model" is a matrix that brings vertices from "model" coordinates into "world" coordinates, "View" is a matrix that brings vertices from "world" coordinates into "camera" coordinates, and "Projection" is a matrix that brings vertices from "camera" coordinates to "screen" coordinates. Together they bring vertices from "model" coordinates - coordinates relative to the 3D model they belong to - into "screen" coordinates, where you intend to rasterize them as triangles.
Those are three different matrices, but they're multiplied together and the 4x4 result is written to hardware registers.
When a buffer of vertices is to be rendered as triangles, the hardware reads in vertices as [x,y,z] vectors from memory, and treats them as if they were [x,y,z,w] where w is always 1. It then multiplies each vector by the 4x4 ModelViewProjection matrix to get [x',y',z',w']. If there is perspective (you said there wasn't) then we divide by w' to get perspective [x'/w',y'/w',z'/w',w'/w'].
Then triangles are rasterized with the newly computed vertices. This enables a model's vertices to be in read-only memory if desired, though the model and camera may be in motion.
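As an illustrative sketch (using GLM with the column-vector convention, so the multiplication order is reversed relative to the row-vector "Model * View * Projection" written above), the combined matrix could be built from the question's camera position and direction like this; the orthographic bounds and up vector are arbitrary placeholders:

#include <glm/glm.hpp>
#include <glm/gtc/matrix_transform.hpp>

glm::mat4 buildModelViewProjection(const glm::vec3& cameraPos,
                                   const glm::vec3& cameraDir,
                                   const glm::mat4& model)
{
    // View: camera at cameraPos looking along cameraDir, with an assumed up vector.
    glm::mat4 view = glm::lookAt(cameraPos, cameraPos + cameraDir, glm::vec3(0, 1, 0));
    // Orthographic projection, matching the question's orthographic rasterization;
    // the bounds are placeholders.
    glm::mat4 projection = glm::ortho(-10.0f, 10.0f, -10.0f, 10.0f, 0.1f, 100.0f);
    return projection * view * model;
}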
