camera in ray tracing - graphics

I'm completely stuck on the camera in my ray tracer. Please take a look at my calculations and point out where the error is. I'm using a left-handed coordinate system.
x,y // range [0..S) x [0..S) //pixels coordinates
Now, let’s transform pixels coordinates to parametric coordinates of camera plane:
xp = x/S * 2 - 1;
yp = y/S * 2 - 1;
xp, yp // range [-1..1] x [-1..1]
calculation of camera basis:
//eye - camera position
//up - camera up vector
//look_at - camera target point
vec3 w = normalize(look_at-eye);
vec3 u = cross(up,w);
vec3 v = cross(w,u);
So the ray direction should have the following coordinates:
vec3 dir = look_at - eye + xp*u + yp*v;
ray3 ray = {eye, normalize(dir)};

I think the mistake is here:
vec3 dir = look_at - eye + xp*u + yp*v;
The image plane should have a normal vector w, and either be between the eye and the look at point (the more common way in ray tracers), or be behind the eye (more closely models an actual pinhole camera). So let's create a scalar zoom_factor. A positive number will put the plane in front of the eye, and a negative one will put it behind the eye (and flip the image).
The center of the image plane is thus:
eye + zoom_factor*w
A point (xp, yp) on the image plane is thus:
eye + zoom_factor*w + xp*u + yp*v
Now you want the direction to be from the eye to this point on this image plane:
vec3 dir = eye + zoom_factor*w + xp*u + yp*v - eye;
The eyes cancel, so it simplifies to:
vec3 dir = zoom_factor*w + xp*u + yp*v
This assumes xp and yp are each in a range like (-0.5, 0.5). Note that (0, 0) is the middle of the image plane with this arrangement.
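The construction above can be sketched in Python as follows. This is a minimal sketch, not the asker's actual code: the helper names (`sub`, `cross`, `normalize`, `camera_ray_dir`) are made up for illustration, and `u` is normalized here as an extra assumption so that the result does not depend on the length of `up`.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    n = math.sqrt(sum(c * c for c in a))
    return tuple(c / n for c in a)

def camera_ray_dir(eye, look_at, up, xp, yp, zoom_factor=1.0):
    """Direction from the eye through the image-plane point (xp, yp)."""
    w = normalize(sub(look_at, eye))   # viewing direction
    u = normalize(cross(up, w))        # right vector (normalized: an assumption)
    v = cross(w, u)                    # true up vector, already unit length
    # dir = zoom_factor*w + xp*u + yp*v (the eye terms have cancelled)
    d = tuple(zoom_factor * w[k] + xp * u[k] + yp * v[k] for k in range(3))
    return normalize(d)
```

Because `w` is normalized before it is scaled by `zoom_factor`, the field of view no longer depends on how far `look_at` is from `eye`, which is the problem with using `look_at - eye` directly.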

Related

Pixel space depth offset in vertex shader

I'm trying to draw simple scaled points in my custom graphics engine. The points are scaled in pixel space, and their radii are in pixels, but the positions fed to the draw function are in world coordinates.
So far, everything is working great, except for a depth clipping issue. The points are of constant size, regardless of how far away they are, which is done by offsetting the vertices in projected/clip space. However, when they are close to surfaces, they partially intersect them in the depth buffer.
Since these points represent world coordinates, I want them to use the depth buffer, and be hidden behind objects that are in front of them. However, when the point is close to a surface, I want to push it toward the camera, so it doesn't partially intersect it. I think it is easier to just always do this push, regardless of the point being close to a surface. What makes the most sense to me is to just push it by its radius, so that all of its vertices are exactly far enough away to avoid clipping into nearby surfaces.
The easiest way I've found to do this is to simply subtract from the Z value in the vertex shader, after transforming into view-projection space. However, I'm having some trouble converting my pixel radius into a depth offset. Regardless of the math I use, what works close up never seems to work far away. I'm thinking maybe this is due to how the z buffer is non-linear, but could be wrong.
Currently, the closest I've been to solving this is the following:
proj_vertex_pos.z -= point_pixel_radius / proj_vertex_pos.w * 100.0
I'm honestly not sure why 100.0 helps make this work yet. I added it simply because dividing the radius by w was too small of a value. Can anyone point me in the right direction? How do I convert my pixel distance into a depth distance? Especially if the depth distance changes scale depending on which depth you are at? Or am I just way off?
The solution was to convert my pixel-space radius into world-space units, since clip-space z and w are still expressed in world-space units after the view-projection transform (before the perspective divide). This can be done by converting pixels into a factor (factor = pixels / screen_size), then converting the factor into world-space units, which was a little more involved: I had to calculate the world-space size of the screen at a given distance, then multiply the factor by that to get world units. I can post the related code if anyone needs it. There's probably a simpler way to calculate it, but my brain always goes straight for factors.
The reason I was getting different results at different distances was mainly because I was only offsetting the z component of the clip position by the result. It's also necessary to offset the w component, to make the depth offset work at any distance (linear). However, in order to offset the w component, you first have to scale xy by w, modify w as needed, then divide xy by the new w. This resulted in making the math pretty involved, so I changed the strategy to offset the vertex before clip space, which requires calculating the distance to the camera in Z space manually, but it honestly ended up being about the same amount of math either way.
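As a rough illustration of the pixel-to-world conversion described above, here is a small Python sketch. It assumes a symmetric perspective projection with a known horizontal field of view; the function name and parameters are hypothetical, and the sketch does not claim to reproduce how `Scene.FieldFactor` is actually computed in the engine.

```python
import math

def pixel_radius_to_world(radius_px, screen_width_px, horizontal_fov_rad, view_depth):
    """World-space length covered by radius_px pixels at a given view depth."""
    # fraction of the screen width covered by the radius ("factor space")
    radius_fac = radius_px / screen_width_px
    # world-space width of the visible area at this depth
    # (assumes a symmetric perspective frustum)
    view_width_world = 2.0 * math.tan(horizontal_fov_rad / 2.0) * view_depth
    return radius_fac * view_width_world
```

The result grows linearly with depth, which is why a constant offset that works close up stops working far away.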
Here is the final vertex shader at the moment. Hopefully the global values make sense. I did not modify this to post it, so please forgive any silliness in my comments. EDIT: I had to make some edits to this, because I was accidentally moving the vertex along the camera-Z direction instead of directly toward the camera:
lerpPoint main(vinBake vin)
{
    // prepare output
    lerpPoint pin;
    // extract radius/size from input
    pin.InRadius = vin.TexCoord.y;
    // compute offset from vertex to camera
    float3 to_cam_offset = Scene.CamPos - vin.Position.xyz;
    // compute the Z distance of the camera from the vertex
    float cam_z_dist = -dot( Scene.CamZ, to_cam_offset );
    // compute the radius factor
    // + this describes what percentage of the screen is covered by our radius
    // + this removes it from pixel space into factor-space
    float radius_fac = Scene.InvScreenRes.x * pin.InRadius;
    // compute world-space radius by scaling with FieldFactor
    // + FieldFactor.x represents the world-space-width of the camera view at whatever distance we scale it by
    // + here, we scale FieldFactor.x by the camera z distance, which gives us the world radius, in world units
    // + we must multiply by 2 because FieldFactor.x only represents HALF of the screen
    float radius_world = radius_fac * Scene.FieldFactor.x * cam_z_dist * 2.0;
    // finally, push the vertex toward the camera by the world radius
    // + note: moving by radius will only work with surfaces facing the camera, since we are moving toward the camera, rather than away from the surface
    // + because of this, we also multiply by another 4, to compensate for nearby surface angles, but there is no scale that would work for every angle
    float3 offset = normalize(to_cam_offset) * (radius_world * -4.0);
    // generate projected position
    // + after this, x=-1 is left, x=+1 is right, y=-1 is bottom, and y=+1 is top of screen
    // + note that after this transform, w represents "distance from camera", and z represents "distance from near plane", both in world space
    pin.ClipPos = mul( Scene.ViewProj, float4( vin.Position.xyz + offset, 1.0) );
    // calculate radius of point, in clip space from our radius factor
    // + we scale by 2 to convert pixel radius into clip-radius
    float clip_radius = radius_fac * 2.0 * pin.ClipPos.w;
    // compute scaled clip-space offset and apply it to our clip-position
    // + vin.Prop.xy: -1,-1 = bottom-left, -1,1 = top left, 1,-1 = bottom right, 1,1 = top right (note: in clip-space, +1 = top, -1 = bottom)
    // + we scale by clipping depth (part of clip_radius) to retain constant scale, but this will give us a VERY LARGE result
    // + we scale by inverter resolution (clip_radius) to convert our input screen scale (eg, 1->1024) into a clip scale (eg, 0.001 to 1.0 )
    pin.ClipPos.x += vin.Prop.x * clip_radius;
    pin.ClipPos.y += vin.Prop.y * clip_radius * Scene.Aspect;
    // return result
    return pin;
}
Here is the other version that offsets z & w instead of changing things in world space. After edits above, this is probably the more optimal solution:
lerpPoint main(vinBake vin)
{
    // prepare output
    lerpPoint pin;
    // extract radius/size from input
    pin.InRadius = vin.TexCoord.y;
    // generate projected position
    // + after this, x=-1 is left, x=+1 is right, y=-1 is bottom, and y=+1 is top of screen
    // + note that after this transform, w represents "distance from camera", and z represents "distance from near plane", both in world space
    pin.ClipPos = mul( Scene.ViewProj, float4( vin.Position.xyz, 1.0) );
    // compute the radius factor
    // + this describes what percentage of the screen is covered by our radius
    // + this removes it from pixel space into factor-space
    float radius_fac = Scene.InvScreenRes.x * pin.InRadius;
    // compute world-space radius by scaling with FieldFactor
    // + FieldFactor.x represents the world-space-width of the camera view at whatever distance we scale it by
    // + here, we scale FieldFactor.x by the camera z distance, which gives us the world radius, in world units
    // + we must multiply by 2 because FieldFactor.x only represents HALF of the screen
    float radius_world = radius_fac * Scene.FieldFactor.x * pin.ClipPos.w * 2.0;
    // offset depth by our world radius
    // + we scale this extra to compensate for surfaces with high angles relative to the camera (since we are moving directly at it)
    // + notice we have to make the perspective divide before modifying w, then re-apply it after, or xy will be off
    pin.ClipPos.xy /= pin.ClipPos.w;
    pin.ClipPos.z -= radius_world * 10.0;
    pin.ClipPos.w -= radius_world * 10.0;
    pin.ClipPos.xy *= pin.ClipPos.w;
    // calculate radius of point, in clip space from our radius factor
    // + we scale by 2 to convert pixel radius into clip-radius
    float clip_radius = radius_fac * 2.0 * pin.ClipPos.w;
    // compute scaled clip-space offset and apply it to our clip-position
    // + vin.Prop.xy: -1,-1 = bottom-left, -1,1 = top left, 1,-1 = bottom right, 1,1 = top right (note: in clip-space, +1 = top, -1 = bottom)
    // + we scale by clipping depth (part of clip_radius) to retain constant scale, but this will give us a VERY LARGE result
    // + we scale by inverter resolution (clip_radius) to convert our input screen scale (eg, 1->1024) into a clip scale (eg, 0.001 to 1.0 )
    pin.ClipPos.x += vin.Prop.x * clip_radius;
    pin.ClipPos.y += vin.Prop.y * clip_radius * Scene.Aspect;
    // return result
    return pin;
}

What is the focal length and image plane distance from this raytracing formula

I have a 4x4 camera matrix comprised of right, up, forward and position vectors.
I raytrace the scene with the following code, which I found in a tutorial but don't entirely understand:
for (int i = 0; i < m_imageSize.width; ++i)
{
    for (int j = 0; j < m_imageSize.height; ++j)
    {
        u = (i + .5f) / (float)(m_imageSize.width - 1) - .5f;
        v = (m_imageSize.height - 1 - j + .5f) / (float)(m_imageSize.height - 1) - .5f;
        Ray ray(cameraPosition, normalize(u*cameraRight + v*cameraUp + 1 / tanf(m_verticalFovAngleRadian) *cameraForward));
I have a couple of questions:
How can I find the focal length of my raytracing camera?
Where is my image plane?
Why does cameraForward need to be multiplied by 1/tanf(m_verticalFovAngleRadian)?
Focal length is a property of lens systems. The camera model that this code uses, however, is a pinhole camera, which does not use lenses at all. So, strictly speaking, the camera does not really have a focal length. The corresponding optical properties are instead expressed as the field of view (the angle that the camera can observe; usually the vertical one). You could calculate the focal length of a camera that has an equivalent field of view with the following formula (see Wikipedia):
FOV = 2 * arctan(x / (2 * f))
FOV: diagonal field of view
x: diagonal of film; by convention 24x36 mm -> x=43.266 mm
f: focal length
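Solving the formula above for f gives f = x / (2 * tan(FOV / 2)). A small Python sketch of both directions, assuming the conventional 24x36 mm film format as the default (the function names are made up for illustration):

```python
import math

# diagonal of the conventional 24x36 mm film format, in mm
FILM_DIAGONAL_MM = math.hypot(24.0, 36.0)

def diagonal_fov_rad(focal_length_mm, film_diagonal_mm=FILM_DIAGONAL_MM):
    """FOV = 2 * arctan(x / (2 * f))"""
    return 2.0 * math.atan(film_diagonal_mm / (2.0 * focal_length_mm))

def equivalent_focal_length_mm(fov_rad, film_diagonal_mm=FILM_DIAGONAL_MM):
    """The same formula solved for f: f = x / (2 * tan(FOV / 2))"""
    return film_diagonal_mm / (2.0 * math.tan(fov_rad / 2.0))
```

Note that these functions work with the diagonal field of view, while the code in the question uses a vertical angle, so the film height (24 mm) would replace the diagonal in that case.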
There is no unique image plane. Any plane that is perpendicular to the view direction can be seen as the image plane. In fact, the projected images differ only in their scale.
For your last question, let's take a closer look at the code:
u = (i + .5f) / (float)(m_imageSize.width - 1) - .5f;
v = (m_imageSize.height - 1 - j + .5f) / (float)(m_imageSize.height - 1) - .5f;
These formulas calculate u/v coordinates between -0.5 and 0.5 for every pixel, assuming that the entire image fits in the box between -0.5 and 0.5.
u*cameraRight + v*cameraUp
... is just placing the x/y coordinates of the ray on the pixel.
... + 1 / tanf(m_verticalFovAngleRadian) *cameraForward
... is defining the depth component of the ray and ultimately the depth of the image plane you are using. Basically, this is making the ray steeper or shallower. Assume that you have a very small field of view, then 1/tan(fov) is a very large number. So, the image plane is very far away, which produces exactly this small field of view (when keeping the size of the image plane constant since you already set the x/y components). On the other hand, if the field of view is large, the image plane moves closer. Note that this notion of image plane is only conceptual. As I said, all other image planes are equally valid and would produce the same image. Another way (and maybe a more intuitive one) to specify the ray would be
u * tanf(m_verticalFovAngleRadian) * cameraRight
+ v * tanf(m_verticalFovAngleRadian) * cameraUp
+ 1 * cameraForward
As you see, this is exactly the same ray (just scaled). The idea here is to set the conceptual image plane to a depth of 1 and scale the x/y components to adapt the size of the image plane. tan(fov) (with fov being the half field of view) is exactly the size of the half image plane at a depth of 1. Just draw a triangle to verify that. Note that this code is only able to produce square image planes. If you want to allow rectangular ones, you need to take into account the ratio of the side lengths.
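The claim that both formulations describe the same ray can be checked numerically. The following Python sketch uses made-up helper names and plain tuples for vectors; it only illustrates that the two directions coincide after normalization.

```python
import math

def normalize(a):
    n = math.sqrt(sum(c * c for c in a))
    return tuple(c / n for c in a)

def ray_dir_scaled_forward(u, v, half_fov, right, up, forward):
    # u*right + v*up + (1 / tan(fov)) * forward, as in the question's code
    k = 1.0 / math.tan(half_fov)
    return normalize(tuple(u * r + v * p + k * f
                           for r, p, f in zip(right, up, forward)))

def ray_dir_unit_depth(u, v, half_fov, right, up, forward):
    # u*tan(fov)*right + v*tan(fov)*up + 1*forward: image plane at depth 1
    t = math.tan(half_fov)
    return normalize(tuple(u * t * r + v * t * p + f
                           for r, p, f in zip(right, up, forward)))
```

The second vector is the first one multiplied by tan(fov), so normalizing removes the difference.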

Equation for the ray with parallel projection

What will be the equation for the ray and ray origin when we are using parallel projection and how to derive that?
In traditional raytracing, you use a ray that starts at your eye point. For each pixel you calculate where it is on a virtual screen in front of the camera and shoot a ray through that pixel.
Let pO be the eye point, d be the direction of the camera, r be a vector pointing to the right and u be a vector pointing up. Let w be the number of pixels in the screen horizontally and h be the number of pixels vertically.
The parametric equation for a ray going through any pixel x, y is then:
ray = pO + t * normalize (d + (x - 0.5w)/0.5w * r + (y - 0.5h)/0.5h * u)
where t is the parameter.
For a parallel projection, move the virtual screen to the origin and calculate the x, y to be the origin of the ray then use the same direction d for each ray:
ray = (pO + (x - 0.5w)/0.5w * r + (y - 0.5h)/0.5h * u) + t*d
For a perspective projection, you have an eye origin, direction, right and up vectors. You then run a vector from the eye origin to each pixel in a virtual screen by scaling the right and up vectors.
In a parallel projection, you do the same calculation for the point on the screen, but your origin becomes that point and you use the same direction for each ray.
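The two ray equations can be sketched side by side in Python. This is only an illustration under the definitions above (pO, d, r, u, w, h); the function names and the tuple-based vector helpers are assumptions, not part of the original answer.

```python
import math

def normalize(a):
    n = math.sqrt(sum(c * c for c in a))
    return tuple(c / n for c in a)

def screen_point(base, r, u, x, y, w, h):
    # base + (x - 0.5w)/0.5w * r + (y - 0.5h)/0.5h * u
    sx = (x - 0.5 * w) / (0.5 * w)
    sy = (y - 0.5 * h) / (0.5 * h)
    return tuple(base[k] + sx * r[k] + sy * u[k] for k in range(3))

def perspective_ray(p0, d, r, u, x, y, w, h):
    # every ray starts at the eye; the direction varies per pixel
    return p0, normalize(screen_point(d, r, u, x, y, w, h))

def parallel_ray(p0, d, r, u, x, y, w, h):
    # the origin varies per pixel; every ray shares the direction d
    return screen_point(p0, r, u, x, y, w, h), normalize(d)
```

Both functions return an (origin, direction) pair, so the rest of a tracer can stay unchanged when switching projections.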

How to project a point on to a sphere

If I have a point (x,y,z), how do I project it onto the surface of a sphere (x0,y0,z0,radius)?
My input will be the coordinates of point and sphere.
The output should be the coordinates of the projected point on sphere.
Just convert from Cartesian to spherical coordinates?
For the simplest projection (along the line connecting the point to the center of the sphere):
Write the point in a coordinate system centered at the center of the sphere (x0,y0,z0):
P = (x',y',z') = (x - x0, y - y0, z - z0)
Compute the length of this vector:
|P| = sqrt(x'^2 + y'^2 + z'^2)
Scale the vector so that it has length equal to the radius of the sphere:
Q = (radius/|P|)*P
And change back to your original coordinate system to get the projection:
R = Q + (x0,y0,z0)
Basically you want to construct a line going through the sphere's centre and the point. Then you intersect this line with the sphere and you have your projection point.
In greater detail:
Let p be the point, s the sphere's centre and r the radius then x = s + r*(p-s)/(norm(p-s)) where x is the point you are looking for. The implementation is left to you.
I agree that the spherical coordinate approach will work as well but is computationally more demanding. In the above formula the only non-trivial operation is the square root for the norm.
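A minimal Python sketch of the formula x = s + r*(p-s)/norm(p-s) follows. The function name is made up, and the handling of a point that coincides with the centre (raising an error) is an assumption, since the answers above do not discuss that case.

```python
import math

def project_onto_sphere(point, center, radius):
    """Project point onto the sphere surface along the line through the centre."""
    # P = point - center
    px, py, pz = (point[k] - center[k] for k in range(3))
    # |P|
    length = math.sqrt(px * px + py * py + pz * pz)
    if length == 0.0:
        # assumption: the centre itself has no unique projection
        raise ValueError("point coincides with the sphere's centre")
    # Q = (radius / |P|) * P, then shift back by the centre
    s = radius / length
    return (center[0] + s * px, center[1] + s * py, center[2] + s * pz)
```

For example, projecting (10, 0, 0) onto a sphere of radius 2 centred at (1, 0, 0) gives a point on the positive x side of the sphere, two units from the centre.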
It works if you set the coordinates of the center of the sphere as origin of the system (x0, y0, z0). So you will have the coordinates of the point referred to that origin (Xp', Yp', Zp'), and converting the coordinates to polar, you discard the radius (distance between the center of the sphere and the point) and the angles will define the projection.

Ray Generation Inconsistency

I have written code that generates a ray from the "eye" of the camera to the viewing plane some distance away from the camera's eye:
R3Ray ConstructRayThroughPixel(...)
{
    R3Point p;
    double increments_x = (lr.X() - ul.X())/(double)width;
    double increments_y = (ul.Y() - lr.Y())/(double)height;
    p.SetX( ul.X() + ((double)i_pos+0.5)*increments_x );
    p.SetY( lr.Y() + ((double)j_pos+0.5)*increments_y );
    p.SetZ( lr.Z() );
    R3Vector v = p-camera_pos;
    R3Ray new_ray(camera_pos,v);
    return new_ray;
}
ul is the upper left corner of the viewing plane and lr is the lower right corner of the viewing plane. They are defined as follows:
R3Point org = scene->camera.eye + scene->camera.towards * radius;
R3Vector dx = scene->camera.right * radius * tan(scene->camera.xfov);
R3Vector dy = scene->camera.up * radius * tan(scene->camera.yfov);
R3Point lr = org + dx - dy;
R3Point ul = org - dx + dy;
Here, org is the center of the viewing plane with radius being the distance between the viewing plane and the camera eye, dx and dy are the displacements in the x and y directions from the center of the viewing plane.
The ConstructRayThroughPixel(...) function works perfectly for a camera whose eye is at (0,0,0). However, when the camera is at some different position, not all needed rays are produced for the image.
Any suggestions what could be going wrong? Maybe something wrong with my equations?
Thanks for the help.
Here's a quibble that may have nothing to do with your problem:
When you do this:
R3Vector dx = scene->camera.right * radius * tan(scene->camera.xfov);
R3Vector dy = scene->camera.up * radius * tan(scene->camera.yfov);
I assume that the right and up vectors are normalized, right? In that case you want sin not tan. Of course, if the fov angles are small it won't make much difference.
The reason why my code wasn't working was because I was treating x,y,z values separately. This is wrong, since the camera can be facing in any direction and thus if it was facing down the x-axis, the x coordinates would be the same, producing increments of 0 (which is incorrect). Instead, what should be done is an interpolation of corner points (where points have x,y,z coordinates). Please see answer in related post: 3D coordinate of 2D point given camera and view plane
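The corner-interpolation idea can be sketched in Python as below. This is an illustration under assumptions, not the code from the linked post: it takes three corners of the view plane (upper left, upper right, lower left) as full 3D points, and it counts the row index j from the top of the image, which differs from the original function.

```python
def point_on_view_plane(ul, ur, ll, i, j, width, height):
    """3D point at the centre of pixel (i, j), interpolated between corners."""
    s = (i + 0.5) / width    # 0..1 from the left edge to the right edge
    t = (j + 0.5) / height   # 0..1 from the top edge to the bottom edge
    # interpolate whole points, so all three coordinates change together
    return tuple(ul[k] + s * (ur[k] - ul[k]) + t * (ll[k] - ul[k])
                 for k in range(3))

def ray_direction(camera_pos, p):
    # unnormalized direction from the eye to the point on the view plane
    return tuple(p[k] - camera_pos[k] for k in range(3))
```

Because the interpolation uses vectors between corners rather than separate x and y increments, it still works when the camera looks down the x-axis and all corners share the same x coordinate.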
