Physically Based Shading, IBL, Half Vector, and NDotR vs NDotV - graphics

I'm trying to figure out some simple concepts about image based lighting for PBR. In many documents and code, I've seen the light direction (FragToLightDir) being set to the reflection vector (reflect(EyeToFragDir,Normal)). Then they set the half vector to the mid-way point between the light and view direction: HalfVecDir = normalize(FragToLightDir+FragToEyeDir); But doesn't this just result in the half vector being identical to the surface normal? If so, this would mean that terms like NDotH are always 1.0. Is this correct?
Here is another source of confusion for me. I'm trying to implement specular cube maps from the app Lys, using their algorithm for generating the correct roughness value to use for mip-level sampling based on roughness (here: https://docs.knaldtech.com/doku.php?id=specular_lys#pre-convolved_cube_maps_vs_path_tracers in the section Pre-convolved Cube Maps vs Path Tracers). In this document, they ask us to use NDotR as a scalar. But what is this NDotR in respect to IBL? If it means dot(Normal,ReflectDir), then isn't that exactly equivalent to dot(Normal,FragToEyeDir)? If I use either of these dot product results, the final result is too glossy at grazing angles (when compared to their more simplistic conversion using BurleyToMipSimple()), which makes me think I'm misunderstanding something about this process. I've tested the algorithm using NDotH, and it looks correct, but isn't this simply canceling out the rest of the math, since NDotH==1.0? Here is my very simple function to extract the mip level using their suggested logic:
float computeSpecularCubeMipTest(float perc_ruf)
{
//float n_dot_r = dot( Normal, Reflect );
float specular_power = ( 2.0 / max( EPSILON, perc_ruf*perc_ruf*perc_ruf*perc_ruf ) ) - 2.0;
specular_power /= ( 4.0 * max( NDotR, EPSILON ) );
return sqrt( max( EPSILON, sqrt(2.0/( specular_power + 2.0 )))) * MipScaler;
}
I realize this is an esoteric subject. Since everyone is using popular game engines these days, no one is forced to understand this madness! But I appreciate any advice on how to go about this.
Edit: Just to make sure I'm clear, I'm referring to pure image based lighting, with no directional lights, no spot lights, etc. Just a cube map that lights the whole scene, similar to the lighting in apps like Substance Painter and Blender's Viewport shading mode.

I'm not familiar with this particular app, but it looks like you're on the right track here. Part of the advantage of pre-convoluting the cube maps is to customize each pixel to be the light source for a particular reflection vector, so indeed NdotV is identical to NdotR as you've noticed. The R still needs to be calculated, for the texture lookup, so it doesn't matter much which one you use for the dot. There is no such thing as H or NdotH used for IBL lookups; those are only for point lights.
If the grazing angles look wrong, perhaps there's a Fresnel term missing somewhere? Reflections start to work differently at those angles.
For what it's worth, the glTF Sample Viewer is using NdotV for its specular IBL lookup.

Related

Most efficient and effective way to create a surface from 3d points

Say I had a point cloud with n number of points in 3d space(relatively densely packed together). What is the most efficient way to create a surface that goes contains every single point in it and lets me calculate values such as the normal and curvature at some point on the surface that was created? I also need to be able to create this surface as fast as possible(a few milliseconds hopefully working with python) and it can be assumed that n < 1000.
There is no "most efficient and effective" way (this is true of any problem in any domain).
In the first place, the surface you have in mind is not mathematically defined uniquely.
A possible approach is by means of the so-called Alpha-shapes, implemented either from a Delaunay tetrahedrization, or by the ball-pivoting method. For other methods, lookup "mesh reconstruction" or "surface reconstruction".
On another hand, normals and curvature can be computed locally, from neighbors configurations, without reconstructing a surface (though there is an ambiguity on the orientation of the normals).
I could suggest Nina Amenta's Power Crust algorithm (link to code), or also meshlab suite, which can compute the curvatures too.

How to create a first-person "space flight" camera

I'm currently attempting to create a first-person space flight camera.
First, allow me to define what I mean by that.
Notice that I am currently using Row-Major matrices in my math library (meaning, the basis vectors in my 4x4 matrices are laid out in rows, and the affine translation part is in the fourth row). Hopefully this helps clarify the order in which I multiply my matrices.
What I have so Far
So far, I have successfully implemented a simple first-person camera view. The code for this is as follows:
fn fps_camera(&mut self) -> beagle_math::Mat4 {
let pitch_matrix = beagle_math::Mat4::rotate_x(self.pitch_in_radians);
let yaw_matrix = beagle_math::Mat4::rotate_y(self.yaw_in_radians);
let view_matrix = yaw_matrix.get_transposed().mul(&pitch_matrix.get_transposed());
let translate_matrix = beagle_math::Mat4::translate(&self.position.mul(-1.0));
translate_matrix.mul(&view_matrix)
}
This works as expected. I am able to walk around and look around with the mouse.
What I am Attempting to do
However, an obvious limitation of this implementation is that since pitch and yaw is always defined relative to a global "up" direction, the moment I pitch more than 90 degrees, getting the world to essentially being upside-down, my yaw movement is inverted.
What I would like to attempt to implement is what could be seen more as a first-person "space flight" camera. That is, no matter what your current orientation is, pitching up and down with the mouse will always translate into up and down in the game, relative to your current orientation. And yawing left and right with your mouse will always translate into a left and right direction, relative to your current orientation.
Unfortunately, this problem has got me stuck for days now. Bear with me, as I am new to the field of linear algebra and matrix transformations. So I must be misunderstanding or overlooking something fundamental. What I've implemented so far might thus look... stupid and naive :) Probably because it is.
What I've Tried so far
The way that I always end up coming back to thinking about this problem is to basically redefine the world's orientation every frame. That is, in a frame, you translate, pitch, and yaw the world coordinate space using your view matrix. You then somehow redefine this orientation as being the new default or zero-rotation. By doing this, you can then, in your next frame apply new pitch and yaw rotations based on this new default orientation, which (by my thinking, anyways), would mean that mouse movement will always translate directly to up, down, left, and right, no matter how you are oriented, because you are basically always redefining the world coordinate space in terms relative to what your previous orientation was, as opposed to the simple first-person camera, which always starts from the same initial coordinate space.
The latest code I have which attempts to implement my idea is as follows:
fn space_camera(&mut self) -> beagle_math::Mat4 {
let previous_pitch_matrix = beagle_math::Mat4::rotate_x(self.previous_pitch);
let previous_yaw_matrix = beagle_math::Mat4::rotate_y(self.previous_yaw);
let previous_view_matrix = previous_yaw_matrix.get_transposed().mul(&previous_pitch_matrix.get_transposed());
let pitch_matrix = beagle_math::Mat4::rotate_x(self.pitch_in_radians);
let yaw_matrix = beagle_math::Mat4::rotate_y(self.yaw_in_radians);
let view_matrix = yaw_matrix.get_transposed().mul(&pitch_matrix.get_transposed());
let translate_matrix = beagle_math::Mat4::translate(&self.position.mul(-1.0));
// SAVES
self.previous_pitch += self.pitch_in_radians;
self.previous_yaw += self.yaw_in_radians;
// RESETS
self.pitch_in_radians = 0.0;
self.yaw_in_radians = 0.0;
translate_matrix.mul(&(previous_view_matrix.mul(&view_matrix)))
}
This, however, does nothing to solve the issue. It actually gives the exact same result and problem as the fps camera.
My thinking behind this code is basically: Always keep track of an accumulated pitch and yaw (in the code that is the previous_pitch and previous_yaw) based on deltas each frame. The deltas are pitch_in_radians and pitch_in_yaw, as they are always reset each frame.
I then start off by constructing a view matrix that would represent how the world was orientated previously, that is the previous_view_matrix. I then construct a new view matrix based on the deltas of this frame, that is the view_matrix.
I then attempt to do a view matrix that does this:
Translate the world in the opposite direction of what represents the camera's current position. Nothing is different here from the FPS camera.
Orient that world according to what my orientation has been so far (using the previous_view_matrix. What I would want this to represent is the default starting point for the deltas of my current frame's movement.
Apply the deltas of the current frame using the current view matrix, represented by view_matrix
My hope was that in step 3, the previous orientation would be seen as a starting point for a new rotation. That if the world was upside-down in the previous orientation, the view_matrix would apply a yaw in terms of the camera's "up", which would then avoid the problem of inverted controls.
I must surely be either attacking the problem from the wrong angle, or misunderstanding essential parts of matrix multiplication with rotations.
Can anyone help pin-point where I'm going wrong?
[EDIT] - Rolling even when you only pitch and yaw the camera
For anyone just stumbling upon this, I fixed it by a combination of the marked answer and Locke's answer (ultimately, in the example given in my question, I also messed up the matrix multiplication order).
Additionally, when you get your camera right, you may stumble upon the odd side-effect that holding the camera stationary, and just pitching and yawing it about (such as moving your mouse around in a circle), will result in your world slowly rolling as well.
This is not a mistake, this is how rotations work in 3D. Kevin added a comment in his answer that explains it, and additionally, I also found this GameDev Stack Exchange answer explaining it in further detail.
The problem is that two numbers, pitch and yaw, provide insufficient degrees of freedom to represent consistent free rotation behavior in space without any “horizon”. Two numbers can represent a look-direction vector but they cannot represent the third component of camera orientation, called roll (rotation about the “depth” axis of the screen). As a consequence, no matter how you implement the controls, you will find that in some orientations the camera rolls strangely, because the effect of trying to do the math with this information is that every frame the roll is picked/reconstructed based on the pitch and yaw.
The minimal solution to this is to add a roll component to your camera state. However, this approach (“Euler angles”) is both tricky to compute with and has numerical stability issues (“gimbal lock”).
Instead, you should represent your camera/player orientation as a quaternion, a mathematical structure that is good for representing arbitrary rotations. Quaternions are used somewhat like rotation matrices, but have fewer components; you'll multiply quaternions by quaternions to apply player input, and convert quaternions to matrices to render with.
It is very common for general purpose game engines to use quaternions for describing objects' rotations. I haven't personally written quaternion camera code (yet!) but I'm sure the internet contains many examples and longer explanations you can work from.
It looks like a lot of the difficulty you are having is due to trying to normalize the transformation to apply the new translation. It seems like this is probably a large part of what is tripping you up. I would suggest changing how you store your position and rotation. Instead, try letting your view matrix define your position.
/// Apply rotation based on the change in mouse position
pub fn on_mouse_move(&mut self, dx: f32, dy: f32) {
// I think this is correct, but it might need tweaking
let rotation_matrix = Mat4::rotate_xy(-y, x);
self.apply_movement(&rotation_matrix, &Vec3::zero())
}
/// Append axis-aligned movement relative to the camera and rotation
pub fn apply_movement(&mut self, rotation: &Mat4<f32>, translation: &Vec3<f32>) {
// Create transformation matrix for translation
let translation = Mat4::translate(translation);
// Append translation and rotation to existing view matrix
self.view_matrix = self.view_matrix * translation * rotation;
}
/// You can get the position from the last column [x, y, z, w] of your view matrix.
pub fn translation(&self) -> Vec3<f32> {
self.view_matrix.column(3).into()
}
I made a couple assumptions about the library:
Mat4 implements Mul<Self> so you do not need to call x.mul(y) explicitly and can instead use x * y. Same goes for Sub.
There exists a Mat4::rotate_xy function. If there isn't one, it would be equivalent to Mat4::rotate_xyz(delta_pitch, delta_yaw, 0.0) or Mat4::rotate_x(delta_pitch) * Mat4::rotate_y(delta_yaw).
I'm somewhat eyeballing the equations so hopefully this is correct. The main idea is to take the delta from the previous inputs and create matrices from that which can then be added on top of the previous view_matrix. If you attempt to take the difference after creating transformation matrices it will only be more work for you (and your processor).
As a side note I see you are using self.position.mul(-1.0). This tells me that your projection matrix is probably backwards. You likely want to adjust your projection matrix by scaling it by a factor of -1 in the z axis.

Computer graphics: polygon mesh

So a polygon mesh is defined as the following:
class Triangle{
int vertices[3]; //vertex indices
float nx, ny, nz; //face-plane normal
};
Is this a convenient way to represent a mesh used with flat shading? Explain
Suggest an object for which this is a good mesh format when used with Gouraud shading. Explain
Suggest an object for which this is a bad mesh format when used with Gouraud shading. Explain
So for 1, I said yes because the face plane normal can be easily converted to a point in the middle of the face. I read somewhere that normals don't have positions?
For 2 I said a ball; more gentle angles
And 3 a box; steeper angles.
I don't know, I don't think I really understand what the normal vector is.
mostly yes
from geometry computations is this OK however from rendering aspect having triangles in indices form only can be sometimes problematic (depends on the rendering engine, HW, etc). Usually is faster to have the triangle points directly in vector form instead of just indexes sometimes triangle contains both... However that is wasting space.
depends on how you classify what is OK and what not.
smooth objects like sphere will look like this
while flat side meshes like cube will be rendered without visible distortions in shape (but with flat shaded like colors only so lighting will be corrupted)
So answer to this is depend on what you want to achieve less lighting error, or better shape recognition or what. Basically using 1 normal for face will turn Gourard into flat shading.
Lighting can be improved by dividing big flat surfaces into more triangles
is unanswerable exactly for the same reasons as #2
So if you want to answer #2,#3 you need to clarify what it means good and bad ...

Determing the direction of face normals consistently?

I'm a newbie to computer graphics so I apologize if some of my language is inexact or the question misses something basic.
Is it possible to calculate face normals correctly, given a list of vertices, and a list of faces like this:
v1: x_1, y_1, z_1
v2: x_2, y_2, z_2
...
v_n: x_n, y_n, z_n
f1: v1,v2,v3
f2: v4,v2,v5
...
f_m: v_j, v_k, v_l
Each x_i, y_i , z_i specifies the vertices position in 3d space (but isn't neccesarily a vector)
Each f_i contains the indices of the three vertices specifying it.
I understand that you can use the cross product of two sides of a face to get a normal, but the direction of that normal depends on the order and choice of sides (from what I understand).
Given this is the only data I have is it possible to correctly determine the direction of the normals? or is it possible to determine them consistently atleast? (all normals may be pointing in the wrong direction?)
In general there is no way to assign normal "consistently" all over a set of 3d faces... consider as an example the famous Möbius strip...
You will notice that if you start walking on it after one loop you get to the same point but on the opposite side. In other words this strip doesn't have two faces, but only one. If you build such a shape with a strip of triangles of course there's no way to assign normals in a consistent way and you'll necessarily end up having two adjacent triangles with normals pointing in opposite directions.
That said, if your collection of triangles is indeed orientable (i.e. there actually exist a consistent normal assignment) a solution is to start from one triangle and then propagate to neighbors like in a flood-fill algorithm. For example in Python it would look something like:
active = [triangles[0]]
oriented = set([triangles[0]])
while active:
next_active = []
for tri in active:
for other in neighbors(tri):
if other not in oriented:
if not agree(tri, other):
flip(other)
oriented.add(other)
next_active.append(other)
active = next_active
In CG its done by polygon winding rule. That means all the faces are defined so the points are in CW (or CCW) order when looked on the face directly. Then using cross product will lead to consistent normals.
However many meshes out there does not comply the winding rule (some faces are CW others CCW not all the same) and for those its a problem. There are two approaches I know of:
for simple shapes (not too much concave)
the sign of dot product of your face_normal and face_center-cube_center will tell you if the normal points inside or outside of the object.
if ( dot( face_normal , face_center-cube_center ) >= 0.0 ) normal_points_out
You can even use any point of face instead of the face center too. Anyway for more complex concave shapes this will not work correctly.
test if point above face is inside or not
simply displace center of face by some small distance (not too big) in normal direction and then test if the point is inside polygonal mesh or not:
if ( !inside( face_center+0.001*face_normal ) ) normal_points_out
to check if point is inside or not you can use hit test.
However if the normal is used just for lighting computations then its usage is usually inside a dot product. So we can use its abs value instead and that will solve all lighting problems regardless of the normal side. For example:
output_color = face_color * abs(dot(face_normal,light_direction))
some gfx apis have implemented this already (look for double sided materials or normals, turning them on usually use the abs value ...) For example in OpenGL:
glLightModeli(GL_LIGHT_MODEL_TWO_SIDE, GL_TRUE);

3d Graphing Application Questions

For one of my classes, I made a 3D graphing application (using Visual Basic). It takes in a string (z=f(x,y)) as input, parses it into RPN notation, then evaluates and graphs the equation. While it did work, it took about 20 seconds to graph. I would have liked to add slide bars to rotate the graph vertically and horizontally, but it was definitely too slow to allow that.
Does anyone know what programming languages would be best for this type of thing? Ideally, I will be able to smoothly rotate the function once it is graphed.
Also, I’m trying to find a better way to rotate the function. Right now, I evaluate it at a bunch of points, and then plot the points to the screen. Every time it is rotated, it must be re-evaluated and plot all the new points. This takes just as long as the original graph process, as it basically treats it as a completely new function.
Lastly, I need a better way to display the graph. Currently (using VB with visual studio) I plot 200,000 points to a chart, but this does not look great by any means. Eventually, I would like to be able to change color based on height, and other graphics manipulation to make it look better.
To be clear, I am not asking for someone to do any of this for me, but rather the means to go about coding this in an efficient way. I will greatly appreciate any advice anyone can give to help with any of these three concerns.
So I will explain how I would go about it using C++ and OpenGL. This doesn't mean those are the tools that you must use, it's just those are standard graphics tools.
Your function's surface is essentially a 2D manifold, which has the nice property of having an intuitive mapping to a 2D space. What is commonly referred to as UV mapping.
What you should do is pick the ranges for the rectangle domain you want to display (minimum x, maximum x, minimum y, maximum y) And make 2 nested for loops of the form:
// Pseudocode
for (x=minimum; x<maximum; x++)
for (y=minimum; y=maximum; y++)
3D point = (x,y, f(x,y))
Store all of these points into a container (std vector for c++ works fine) and this will be your "mesh".
This is done once, prior to rendering. You then render those points using, for example GL_POINTS, and rotate your graph mesh using rotations on the GPU.
This will only show scattered points, not a surface.
If you also wish to show the surface of your function, and not just the points, you can triangulate that set of points fairly easily.
Group each 4 contiguous vertices (i.e the vertices at indices <x,y>, <x+1,y>, <x+1,y>, <x+1,y+1>) and create the 2 triangles:
(<x,y>, <x+1,y>, <x,y+1>), (<x+1,y>, <x+1,y+1>, <x,y+1>)
This will fill triangulate the surface of your mesh.
Essentially you only need to build your mesh once, and this way rendering should be 60 fps for something with 20 000 vertices, regardless of whether you only render points or triangles too.
Programming language is mostly not relevant, so VB itself is probably not the issue. You can have the same issues in Python, C#, C++, etc. Of course you must master the programming language you choose.
One key aspect is using the right algorithms and data-structures. Proper use of memory allocations and memory layout for maximizing CPU (and GPU) cache are also key. Then you must take advantage of the platform and hardware capabilities (GPU and Multithreading). For the last point you definetely need to use a graphics library such as OpenGL or Vulkan.

Resources