Intro.
Previously, I asked a question about converting an RGB triple to a quaternion. After that question I managed to get unit quaternions, but I was left in doubt about their internal structure. There was no easy way to operate on them, or to separate luma and chroma, since they were quaternions of unit length. My intuition is that luminance should be encoded either in the real part or in the overall magnitude, and the color ("chroma") information should be encoded in the imaginary part.
Today I decided to improve things by taking another approach, different from the first one in the link above. I think it could succeed, since a quaternion can store not only rotation (a unit quaternion) but scale as well. First things first, I'll start by explaining my new idea. I will use GLSL shader syntax in the explanations that follow.
Approach description and the question body.
For some pixel of an image, let's conceive a 3D vector vec3 u within the unit cube, whose non-negative coordinates lie in the closed range [0.0, 1.0] and represent the full RGB colorspace. So u's coordinates u.x, u.y and u.z represent the red, green and blue values of that pixel, respectively. Then let's take a pure white vector const vec3 v = vec3(1.0, 1.0, 1.0);. And let's define some quaternion q, so that our vector u is "v, rotated and scaled by quaternion q". In simple words, q must answer the question "How do I transform v in order to get the initially conceived color u?". And let's introduce a function for that "rotate and scale" operation: vec3 q2c(in vec4 q, in vec3 v). I'll call it the "quaternion-to-color" converter.
Writing q2c(q, v) is pretty simple, just as defined: q2c(q, v) == (q*vec4(v, 0.0))*q'. Here the "*" operator denotes quaternion multiplication; let's make it a function vec4 qmul(in vec4 q1, in vec4 q2). And "q'" denotes q's conjugate; let's make it vec4 qconj(in vec4 q). Omitting their full implementations (which you may find in the full source; a minimal sketch is also given after the code below), we arrive at the classic code:
vec4 q2c(in vec4 q, in vec3 v) {
return qmul(qmul(q, vec4(v, 0.0)), qconj(q));
}
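The post omits qmul() and qconj() as trivial; for completeness, here is a minimal sketch of the conventional definitions (vector part in .xyz, scalar part in .w, matching the vec4(v, 0.0) encoding above). The full Shadertoy source has its own equivalents:
vec4 qmul(in vec4 q1, in vec4 q2) {
    // Hamilton product: (w1, v1)(w2, v2) = (w1*w2 - v1·v2, w1*v2 + w2*v1 + v1×v2)
    return vec4(q1.w * q2.xyz + q2.w * q1.xyz + cross(q1.xyz, q2.xyz),
                q1.w * q2.w - dot(q1.xyz, q2.xyz));
}
vec4 qconj(in vec4 q) {
    // Conjugate: negate the imaginary (vector) part
    return vec4(-q.xyz, q.w);
}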
So now we have the q2c(q, v) function, which converts a quaternion q to a color by rotating and scaling some chosen 3D vector v.
The question is: how to find that quaternion q?
From a programmer's perspective, the goal is to write the reverse function vec4 c2q(in vec3 u, in vec3 v), the corresponding "color to quaternion" converter.
Please note that you should not touch q2c() without a really good reason, e.g. a serious bug in its logic that provably makes the task impossible to solve.
How can you check whether your answer is correct?
The checking method follows from the fact that converting forth and back must reproduce the initial value. So the checking condition is: for any v of non-zero length, u must always be equal to q2c(c2q(u, v), v). v must have non-zero length, because one cannot "scale zero" to get "something".
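Expressed as code, the check is just this round-trip error (nothing new here, only the condition above restated; the Shadertoy program described next performs the same comparison visually):
float roundTripError(in vec3 u, in vec3 v)
{
    // Should be ~0 (up to floating-point precision) for every u in the unit cube
    // and every non-zero v, once c2q() is implemented correctly.
    return length(q2c(c2q(u, v), v) - u);
}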
To ease things, I've prepared a testing program using the shadertoy.com service.
You will need a computer with a working internet connection and a web browser with WebGL support (I'm using Chrome). The program should work on any GPU, even the ones embedded in Intel processors. It even worked on my lower-end smartphone!
To test your answer, put your proposed formula, written in GLSL syntax, inside the c2q() function. Then press the apply button, and your changes will take effect:
The left half of the image shows unchanged source pixels. The right half contains pixels transformed forth and back by q2c(c2q()). Obviously, the halves must be visually equal; you should not notice any vertical seam. A small (unnoticeable) mathematical error may still arise, but only due to the nature of floating point: finite precision and possible rounding errors.
Feel free to edit and experiment; changes are applied only locally, on your computer, and you cannot break anything. If the video is not playing on first open (a Shadertoy bug), try pausing and unpausing it. Enjoy!
Hall of c2q() Attempts
If everything is correct, the right (processed) side of the image should be equal to the left (original) side. Here I review the different results that were obtained by putting something in place of xxxxx in the c2q() implementation:
vec4 c2q(vec3 u, vec3 v) {
return xxxxx;
}
Let's proceed!
Initially I thought this must just work:
vec4(cross(u, v), dot(u, v)):
One of the SE answers:
vec4( cross(u, v), sqrt( dot(u, u) * dot(v, v) ) + dot(u, v) ):
And with his hint "Don't forget to normalize q":
normalize(vec4( cross(u, v), sqrt( dot(u, u) * dot(v, v) ) + dot(u, v) )):
#minorlogic's comment seems to be a step closer:
scale all q's components by sqrt( length(v)/length(u) ),
vec4(cross(u, v), dot(u, v)) * sqrt( length(u)/length(v) ):
With ratio swapped:
vec4(cross(u, v), dot(u, v)) * sqrt( length(v)/length(u) ):
My attempt:
vec4 c2q(vec3 u, vec3 v) {
float norm_q = sqrt(length(u) / length(v));
vec4 u4 = vec4(normalize(u), 0.0);
vec4 v4 = vec4(normalize(v), 0.0);
return norm_q * (qmul(u4, v4 + u4) / length(v4 + u4));
}
Related
Does the technique that Vulkan uses (and, I assume, other graphics libraries too) to interpolate vertex attributes in a perspective-correct manner require that the vertex shader normalize the homogeneous camera-space vertex position (i.e. divide through by the w-coordinate such that the w-coordinate is 1.0) prior to multiplication by a typical projection matrix of the form...
g/s 0 0 0
0 g 0 n
0 0 f/(f-n) -nf/(f-n)
0 0 1 0
...in order for perspective-correctness to work properly?
Or, will perspective-correctness continue to work on any homogeneous vertex position in camera-space (with a w-coordinate other than 1.0)?
(I didn't completely follow the perspective-correctness math, so it is unclear to me which is the case.)
Update:
In order to clarify terminology:
vec4 modelCoordinates = vec4(x_in, y_in, z_in, 1);
mat4 modelToWorld = ...;
vec4 worldCoordinates = modelToWorld * modelCoordinates;
mat4 worldToCamera = ...;
vec4 cameraCoordinates = worldToCamera * worldCoordinates;
mat4 cameraToProjection = ...;
vec4 clipCoordinates = cameraToProjection * cameraCoordinates;
output(clipCoordinates);
cameraToProjection is a matrix like the one shown in the question
The question is: does cameraCoordinates.w have to be 1.0?
And, consequently, do the last rows of both the modelToWorld and worldToCamera matrices have to be 0 0 0 1?
You have this exactly backwards. Doing the perspective divide in the shader is what prevents perspective-correct interpolation. The rasterizer needs the perspective information provided by the W component to do its job. With a W of 1, the interpolation is done in window space, without any regard to perspective.
Provide a clip-space coordinate to the output of your vertex processing stage, and let the system do what it exists to do.
the vertex shader must normalize the homogeneous camera-space vertex position (i.e. divide through by the w-coordinate such that the w-coordinate is 1.0) prior to multiplication by a typical projection matrix of the form...
If your camera-space vertex position does not have a W of 1.0, then one of two things has happened:
You are deliberately operating in a post-projection world space or some similar construct. This is a perfectly valid thing to do, and the math for a camera space can be perfectly reasonable.
Your code is broken somewhere. That is, you intend for your world and camera space to be a normal, Euclidean, non-homogeneous space, but somehow the math didn't work out. Obviously, this is not a perfectly valid thing to do.
In both cases, dividing by W is the wrong thing to do. If your world space that you're placing a camera into is post-projection (such as in this example), dividing by W will break your perspective-correct interpolation, as outlined above. If your code is broken, dividing by W will merely mask the actual problem; better to fix your code than to hide the bug, as it may crop up elsewhere.
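To make the advice above concrete, here is a minimal sketch of a Vulkan-style vertex shader (the binding layout is illustrative; the matrix names reuse the ones from the question) that simply outputs the raw clip-space position:
#version 450
layout(set = 0, binding = 0) uniform Matrices {
    mat4 modelToWorld;
    mat4 worldToCamera;
    mat4 cameraToProjection;
};
layout(location = 0) in vec4 modelCoordinates;
void main()
{
    vec4 clipCoordinates = cameraToProjection * worldToCamera * modelToWorld * modelCoordinates;
    gl_Position = clipCoordinates; // no division by clipCoordinates.w here: the rasterizer
                                   // needs that w for perspective-correct interpolation
}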
To see whether or not the camera coordinates need to be in normal form, let's represent the camera coordinates as multiples of w, so they are (wx,wy,wz,w).
Multiplying through by the given projection matrix, we get the clip coordinates (wxg/s, wyg, fwz/(f-n) - nfw/(f-n), wz).
Calculating the x-y framebuffer coordinates as per the fixed Vulkan formula we get (P_x * xg/(sz) + O_x, P_y * yg/z + O_y). Notice this does not depend on w, so the position of a polygon's vertices in the framebuffer does not require the camera coordinates to be in normal form.
Likewise, the calculation of the barycentric coordinates of fragments within a polygon depends only on x, y in framebuffer coordinates, and so is also independent of w.
However, perspective-correct interpolation of fragment attributes does depend on the W_clip of the vertices, as this is used in the formula given in the Vulkan spec. As shown above, W_clip is wz, which does depend on w and scales with it, so we can conclude that the camera coordinates must be in normal form (their w must be 1.0).
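For reference, the per-fragment formula in question (perspective-correct barycentric interpolation, as given in the Vulkan specification) has, up to notation, the form
a = (λ_0·a_0/w_0 + λ_1·a_1/w_1 + λ_2·a_2/w_2) / (λ_0/w_0 + λ_1/w_1 + λ_2/w_2)
where λ_i are the window-space barycentric weights, a_i the attribute values at the three vertices, and w_i each vertex's W_clip. The w_i enter per vertex, which is exactly the dependence described above.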
Question:
I need to calculate the intersection shape (purple) of a plane defined by Ax + By + Cz + D = 0 and a frustum defined by 4 rays emitted from the corners of a rectangle (red arrows). The result should be a quadrilateral (4 points), and an important requirement is that the resulting shape must be in the plane's local space. The plane is created with a transformation matrix T (the plane's normal is vec3(0, 0, 1) in T's space).
Explanation:
This is the perspective form of my rectangle projection into another space (transformation / matrix / node). I am able to calculate the intersection shape for any rectangle without perspective rays (all rays parallel) using a plane-line intersection algorithm (pseudocode):
Definitions:
// Plane defined by normal (A, B, C) and D
struct Plane { vec3 n; float d; };
// Line defined by 2 points
struct Line { vec3 a, b; };
Intersection:
vec3 PlaneLineIntersection(Plane plane, Line line) {
    vec3 ba = normalize(line.b - line.a);   // direction of the line
    float dotA = dot(plane.n, line.a);
    float dotBA = dot(plane.n, ba);
    float t = (plane.d - dotA) / dotBA;     // undefined when dotBA == 0 (line parallel to the plane)
    return line.a + ba * t;
}
The perspective form comes with some problems, because some of the rays can be parallel to the plane (the intersection point is at infinity), or the final shape is self-intersecting. It works in some cases, but it's not enough for an arbitrary transformation. How do I get the correct intersection part of the plane under perspective?
Simply put, I need to get the visible part of an arbitrary plane as seen by an arbitrary perspective "camera".
Thank you for any suggestions.
The intersection between a plane (one Ax + By + Cz + D = 0 equation) and a line (two plane equations) is a matter of solving a 3x3 linear system for x, y, z.
Doing all calculations in T-space (the origin is at the apex of the pyramid) is easier, as some of the A, B, C coefficients are 0.
What I don't know is whether you are aware that perspective is a kind of projection that distorts z ("depth", distance from the origin). So if the plane that contains the rectangle is not perpendicular to the axis of the frustum (the z-axis), then the projection onto the plane is not a rectangle but a trapezoid.
Anyhow, using the perspective projection matrix you can get projected coordinates for the four rectangle corners.
To tell whether a point is on one side of a plane or the other, just plug the point's coordinates into the plane equation and check the sign, as shown here.
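To illustrate the 3x3 solve suggested above: the line can be written as the intersection of two planes, so together with the target plane you have three equations of the form dot(n_i, p) = d_i. A minimal GLSL-style sketch using Cramer's rule (the function and variable names are mine, not from the question):
bool threePlaneIntersection(vec3 n0, float d0, vec3 n1, float d1, vec3 n2, float d2, out vec3 p)
{
    // Cramer's rule for the 3x3 system whose rows are n0, n1, n2 and right-hand side (d0, d1, d2).
    float det = dot(n0, cross(n1, n2));
    if (abs(det) < 1e-6) return false; // (near-)singular: e.g. the line is parallel to the plane
    p = (d0 * cross(n1, n2) + d1 * cross(n2, n0) + d2 * cross(n0, n1)) / det;
    return true;
}
The singular case is exactly the degenerate situation mentioned in the question (a ray parallel to the plane), so the boolean result lets the caller clip that corner instead of producing a point at infinity.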
Your question seems inherently mathematical, so excuse my mathematical solution on Stack Overflow. If your four arrows emit from a single point and the side planes they form share a common angle, then you are looking for a solution to the frustum projection problem. Your requirements simplify the problem quite a bit, because you define the plane with a normal rather than two bounding vectors, so if you agree to the definitions...
then I can provide you with the mathematical solution here (Internet Explorer .mht file, possibly requiring modern Windows OS). If you are thinking about an actual implementation then I can only direct you to a very similar frustum projection implementation that I have implemented/uploaded here (Lua): https://github.com/quiret/mta_lua_3d_math
The roadmap for the implementation could be as follows: creation of condition container classes for all sub-problems (0 < k1*a1 + k2, etc.) plus the and/or chains, writing algorithms for the comparisons across and-chains as well as normal-form creation, and optimization of object construction/memory allocation. Since each check for frustum intersection requires just a fixed number of algebraic objects, you can implement an efficient cache.
I am just starting to study OpenGL. I am going to post a segment of code and explain my understanding of it; could you follow my explanation and point out any problems?
glMatrixMode(GL_MODELVIEW);
glLoadIdentity(); //start with an identity matrix
glPushMatrix();
//push the current identity matrix onto the stack, and start with a new identity matrix as
//the transformation matrix
glPushMatrix();
//copy the current matrix which is the identity as the new transformation matrix and then push the current transformation matrix onto stack
glScalef(10, 10, 1.0);
**Question 1**
//I feels like the order which the image is built is kinda reversed
//It's like drawSquare happens first, then color, then scale
//can anyone clarify?
//Second, the drawsquare defines the 4 vertices around the origin(+/-0.5,+/-0.5)
//is the origin located at the center of the window by default?
//what happen when it is scaled? does point (0.5,0.5) scaled to (5,5)?
glColor3f(0.0, 1.0, 0.0);
drawSquare(1.0);
glPopMatrix();
//forget the current transformation matrix, pop out the one on top of the stack
//which is the identity matrix
//In the code below:
//my understanding is 4 vertices is defined around origin, but is this the same origin?
//then the unit square is moved just below the X-axis
//and the 4 vertices are scaled one by one?
//ex: (0.5,0) -> (1,0) (0.5,-1) -> (1,-2)
glScalef(2, 2, 1.0);
glTranslatef(0.0, -0.5, 0.0);
glColor3f(1.0, 0.0, 0.0);
drawSquare(1.0);
//last question, if I want to make the second square rotate at a point along z-axis
//do I have to specify it here?
//for example: add glRotatef(rotate_degree, 0.0, 0.0, 1.0); above the glScalef(2, 2, 1.0);
//such that later i can change the value in the rotate_degree?
glPopMatrix(); //forget about the current transformation matrix, pop out the top matrix on the stack.
That the order of operations seems inverted comes from the fact that matrices are non-commutative and right-associative when multiplied with column vectors. Say you have a position column vector ↑p in model space. To bring it into world space you multiply it with the matrix M, i.e.
↑p_world = M · ↑p
Note that you can not change the order of operations! Column vectors match like a key into matrices and the key fits into the matrix-lock from the right.
In the next step you want to transform into view space, using matrix V so you write
↑p_view = V · ↑p_world
but this you can substitute with
↑p_view = V · M · ↑p
But of course if you have a lot of ↑p-s you'd want to save on computations, so you contract those two matrices M and V into a single matrix you call modelview. And when you build modelview with OpenGL you build it like this:
MV = 1
MV = MV · V
MV = MV · M
Due to the right associativity of column order matrix multiplication the first transformation applied to a vector is the last one multiplied onto the stack.
Note that by using row-order matrix math, things become left-associative, i.e. things happen in the order you write them. But column-order right associativity is incredibly useful, as it makes building branching transformation hierarchies much, much easier.
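GLSL follows the same column-vector convention, so the point can be illustrated with a tiny GLSL 1.20-style vertex shader (the uniform names here are made up):
uniform mat4 P;    // projection
uniform mat4 V;    // view
uniform mat4 M;    // model
attribute vec4 p;  // model-space position
void main()
{
    mat4 MV = V * M;            // built exactly as described: MV = 1; MV = MV * V; MV = MV * M
    gl_Position = P * (MV * p); // MV * p == V * (M * p): M, the last matrix multiplied on, acts on p first
}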
I'm trying to find the best way to get the most distant point of a circle from a specified point in 2D space. What I have found so far is how to get the distance between the point and the circle position, but I'm not entirely sure how to expand this to find the most distant point of the circle.
The known variables are:
Point a
Point b (circle position)
Radius r (circle radius)
To find the distance between the point and the circle position, I have found this:
xd = x2 - x1
yd = y2 - y1
Distance = SquareRoot(xd * xd + yd * yd)
It seems to me, this is part of the solution. How would this be expanded to get the position of Point x in the below image?
As an additional but optional part of the question: I have read in some places that it is possible to get the distance portion without using the square root, which is performance-intensive and should be avoided when fast code is necessary. In my case, I would be doing this calculation quite often; any comments on this within the context of the main question would be welcome too.
What about this?
Calculate A-B.
We now have a vector pointing from the center of the circle towards A (if B is the origin, skip this and just consider point A a vector).
Normalize.
Now we have a well defined length (the length is 1)
If the circle is not of unit radius, multiply by radius. If it is unit radius, skip this.
Now we have the correct length.
Invert sign (can be done in one step with 3., just multiply with the negative radius)
Now our vector points in the correct direction.
Add B (if B is the origin, skip this).
Now our vector is offset correctly so its endpoint is the point we want.
(Alternatively, you could calculate B-A to save the negation, but then you have to do one more operation to offset the origin correctly.)
By the way, it works the same in 3D, except the circle would be a sphere and the vectors would have 3 components (or 4, if you use homogeneous coordinates; in this case remember, for correctness, to set w to 0 when "turning points into vectors" and to 1 at the end when making a point from the vector).
EDIT:
(in reply of pseudocode)
Assuming you have a vec2 class, which is a struct of two floats with operators for vector subtraction and scalar multiplication (pretty trivial, around a dozen lines of code), and a normalize function that needs to be no more than a shorthand for multiplying with inv_sqrt(x*x + y*y), the pseudocode (my pseudocode here is something like a C++/GLSL mix) could look something like this:
vec2 most_distant_on_circle(vec2 const& B, float r, vec2 const& A)
{
    vec2 P = normalize(A - B); // unit vector pointing from the circle's center B towards A
    return -r * P + B;         // step one radius away from A's direction, offset by the center
}
Most math libraries that you'd use should have all of these functions and types built in. HLSL and GLSL have them as first-class primitives and intrinsic functions. Some GPUs even have a dedicated normalize instruction.
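On the optional square-root question: you only get to skip the square root when you merely compare distances (compare squared lengths instead); producing the farthest point itself still needs one normalization, i.e. one sqrt or inverse sqrt. In GLSL-style notation (the function name is mine):
vec2 mostDistantOnCircle(vec2 B, float r, vec2 A)
{
    return B - r * normalize(A - B); // the single sqrt hides inside normalize()
}
// If you only need to compare distances, compare squared lengths and skip the sqrt entirely:
// float d2 = dot(A - B, A - B);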
I'm trying to render the "mount" scene from Eric Haines' Standard Procedural Database (SPD), but the refraction part just doesn't want to co-operate. I've tried everything I can think of to fix it.
This one is my render (with Watt's formula):
(source: philosoraptor.co.za)
This is my render using the "normal" formula:
(source: philosoraptor.co.za)
And this one is the correct render:
(source: philosoraptor.co.za)
As you can see, there are only a couple of errors, mostly around the poles of the spheres. This makes me think that refraction, or some precision error, is to blame.
Please note that there are actually 4 spheres in the scene, their NFF definitions (s x_coord y_coord z_coord radius) are:
s -0.8 0.8 1.20821 0.17
s -0.661196 0.661196 0.930598 0.17
s -0.749194 0.98961 0.930598 0.17
s -0.98961 0.749194 0.930598 0.17
That is, there is a fourth sphere behind the more obvious three in the foreground. It can be seen in the gap left between these three spheres.
Here is a picture of that fourth sphere alone:
(source: philosoraptor.co.za)
And here is a picture of the first sphere alone:
(source: philosoraptor.co.za)
You'll notice that many of the oddities present in both my version and the correct version are missing. We can conclude that these effects are the result of interactions between the spheres; the question is, which interactions?
What am I doing wrong? Below are some of the potential errors I've already considered:
Refraction vector formula.
As far as I can tell, this is correct. It's the same formula used by several websites and I verified the derivation personally. Here's how I calculate it:
double sinI2 = eta * eta * (1.0f - cosI * cosI);
Vector transmit = (v * eta) + (n * (eta * cosI - sqrt(1.0f - sinI2)));
transmit = transmit.normalise();
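For reference, written out with the same symbols as the code above (v the incident direction, n the normal, eta = eta1/eta2, and cosI presumably computed as -dot(n, v)), the standard transmission-vector formula these two lines implement is
transmit = eta·v + (eta·cosI - sqrt(1 - eta²·(1 - cosI²)))·n
with sinI2 = eta²·(1 - cosI²), i.e. the squared sine of the transmitted angle from Snell's law.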
I found an alternate formula in 3D Computer Graphics, 3rd Ed by Alan Watt. It gives a closer approximation to the correct image:
double etaSq = eta * eta;
double sinI2 = etaSq * (1.0f - cosI * cosI);
Vector transmit = (v * eta) + (n * (eta * cosI - (sqrt(1.0f - sinI2) / etaSq)));
transmit = transmit.normalise();
The only difference is that I'm dividing by eta^2 at the end.
Total internal reflection.
I tested for this, using the following conditional before the rest of my intersection code:
if (sinI2 <= 1)
Calculation of eta.
I use a stack-like approach for this problem:
/* Entering object. */
if (r.normal.dot(r.dir) < 0)
{
double eta1 = r.iorStack.back();
double eta2 = m.ior;
eta = eta1 / eta2;
r.iorStack.push_back(eta2);
}
/* Exiting object. */
else
{
double eta1 = r.iorStack.back();
r.iorStack.pop_back();
double eta2 = r.iorStack.back();
eta = eta1 / eta2;
}
As you can see, this stores the previous objects that contained this ray on a stack. When exiting, the code pops the current IOR off the stack and uses it, along with the IOR beneath it, to compute eta. As far as I know this is the most correct way to do it.
This works for nested transmitting objects. However, it breaks down for intersecting transmitting objects. The problem here is that you need to define the IOR for the intersection independently, which the NFF file format does not do. It's unclear, then, what the "correct" course of action is.
Moving the new ray's origin.
The new ray's origin has to be moved slightly along the transmitted path so that it doesn't intersect at the same point as the previous one.
p = r.intersection + transmit * 0.01f;
I've tried making this value smaller (0.001f) and (0.0001f) but that makes the spheres appear solid. I guess these values don't move the rays far enough away from the previous intersection point.
EDIT: The problem here was that the reflection code was doing the same thing. So when an object is reflective as well as refractive then the origin of the ray ends up in completely the wrong place.
Number of ray bounces.
I've artificially limited the number of ray bounces to 4. I tested raising this limit to 10, but that didn't fix the problem.
Normals.
I'm pretty sure I'm calculating the normals of the spheres correctly. I take the intersection point, subtract the centre of the sphere and divide by the radius.
Just a guess based on doing an image diff (and without reading the rest of your question). The problem looks to me like the refraction on the back side of the sphere. You might be:
doing it backwards: e.g. reversing (or not reversing) the indices of refraction.
missing it entirely?
One way to check for this would be to look at the mount through a cube that is almost facing the camera. If the refraction is correct, the picture should be offset slightly but otherwise unaltered. If it's not right, then the picture will seem slightly tilted.
So after more than a year, I finally figured out what was going on here. Clear minds and all that. I was completely off track with the formula. I'm now using a formula by Heckbert instead, which I am sure is correct because I proved it myself using geometry and discrete math.
Here's the correct vector calculation:
double c1 = v.dot(n) * -1;
double c1Sq = pow(c1, 2);
/* Heckbert's formula requires eta to be eta2 / eta1, so I have to flip it here. */
eta = 1 / eta;
double etaSq = pow(eta, 2);
if (etaSq + c1Sq >= 1)
{
Vector transmit = (v / eta) + (n / eta) * (c1 - sqrt(etaSq - 1 + c1Sq));
transmit = transmit.normalise();
...
}
else
{
/* Total internal reflection. */
}
In the code above, eta is eta1 (the IOR of the surface from which the ray is coming) over eta2 (the IOR of the destination surface), v is the incident ray and n is the normal.
There was another problem, which confused things some more: I had to flip the normal when exiting an object (which is obvious; I missed it because the other errors were obscuring it).
Lastly, my line of sight algorithm (to determine whether a surface is illuminated by a point light source) was not properly passing through transparent surfaces.
So now my images line up properly :)