HLSL mul and D3DXMATRIX order mismatch - direct3d

I'm trying to multiply the transformation matrix in the shader with vectors directly, without doing an unnecessary transposition. According to HLSL's mul documentation:
mul(x, y) Multiplies x and y using matrix math. The inner dimension x-columns and y-rows must be equal.
x [in] The x input value. If x is a vector, it is treated as a row vector.
y [in] The y input value. If y is a vector, it is treated as a column vector.
I have in the C++ code:
const D3DXMATRIX viewProjection = view * projection;
...
const D3DXMATRIX modelViewProjection = model * viewProjection;
where modelViewProjection is a row-major matrix that is copied into a constant buffer without being transposed. However, for this to work in HLSL I need to multiply the transformation matrix with the position vector as:
output.position = mul(transformation, position);
which is the opposite of what the mul documentation says.
Can someone explain where the mismatch is?

The deprecated D3DXMath library and the more modern DirectXMath both use row-major matrix order. HLSL defaults to column-major matrix order because it is slightly more efficient for multiplies. Therefore, most code that sets constant buffer constants transposes the matrix data. In almost all cases, any 'cost' of transposing the matrix here is completely hidden by all the other latencies in the system.
You can of course tell HLSL to use row-major matrix order instead, but then the HLSL mul needs an extra instruction on every vertex, which is why it's usually worth doing the transpose on the CPU once per update instead.
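For example, the common pattern is to do the transpose once on the CPU when filling the constant buffer. A minimal sketch using the D3DX types from the question (the constant-buffer struct and upload call are assumed):

// Build the combined matrix with D3DX's row-major, row-vector convention,
// then transpose it once so HLSL can keep its default column-major layout.
D3DXMATRIX modelViewProjection = model * view * projection;

D3DXMATRIX forShader;
D3DXMatrixTranspose(&forShader, &modelViewProjection);
// copy 'forShader' into the mapped constant buffer, e.g.
// memcpy(mappedConstants.modelViewProjection, &forShader, sizeof(forShader));

With the data transposed like this, the shader can use the documented argument order, mul(position, modelViewProjection), with position acting as a row vector; skip the transpose and you get the swapped-argument form from the question.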
See MSDN

Related

What's the difference between using modelViewmatrix directly and using normalMatrix instead? [duplicate]

I am working on some shaders, and I need to transform normals.
I read in a few tutorials that the way you transform normals is to multiply them by the transpose of the inverse of the modelview matrix. But I can't find an explanation of why that is so, and what the logic behind it is.
It flows from the definition of a normal.
Suppose you have the normal, N, and a vector, V, a tangent vector at the same position on the object as the normal. Then by definition N·V = 0.
Tangent vectors run in the same direction as the surface of an object. So if your surface is planar then the tangent is the difference between two identifiable points on the object. So if V = Q - R where Q and R are points on the surface then if you transform the object by B:
V' = BQ - BR
= B(Q - R)
= BV
The same logic applies for non-planar surfaces by considering limits.
In this case suppose you intend to transform the model by the matrix B. So B will be applied to the geometry. Then to figure out what to do to the normals you need to solve for the matrix A such that:
(AN)·(BV) = 0
Turning that into a row versus column thing to eliminate the explicit dot product:
[transpose(AN)](BV) = 0
Pull the transpose outside, eliminate the brackets:
transpose(N)*transpose(A)*B*V = 0
So that's "the transpose of the normal" [product with] "the transpose of the known transformation matrix" [product with] "the transformation we're solving for" [product with] "the vector on the surface of the model" = 0
But we started by stating that transpose(N)*V = 0, since that's the same as saying that N·V = 0. So to satisfy our constraints we need the middle part of the expression — transpose(A)*B — to go away.
Hence we can conclude that:
transpose(A)*B = identity
=> transpose(A) = identity*inverse(B)
=> transpose(A) = inverse(B)
=> A = transpose(inverse(B))
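In code the result of the derivation is only a couple of lines: take the upper-left 3x3 of the matrix applied to the geometry, invert it, transpose it, and apply that to the normal. A minimal sketch, assuming Eigen (any matrix library works the same way):

#include <Eigen/Dense>

// A = transpose(inverse(B)): the matrix that keeps normals perpendicular
// to the transformed surface, per the derivation above.
Eigen::Vector3d transformNormal(const Eigen::Matrix4d& modelview,
                                const Eigen::Vector3d& normal)
{
    const Eigen::Matrix3d B = modelview.topLeftCorner<3, 3>();
    const Eigen::Matrix3d A = B.inverse().transpose();
    return (A * normal).normalized();   // re-normalize after any scaling
}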
My favorite proof: let N be the normal and V a tangent vector, so since they are perpendicular their dot product is zero, N·V = 0. Let M be any 3x3 invertible transformation (M^-1 * M = I), and let N' and V' be the transformed vectors, with V' = MV. Inserting the identity: transpose(N)*V = transpose(N)*M^-1*M*V = transpose(transpose(inverse(M))*N)*(MV) = 0, so the vector that stays perpendicular to V' is N' = transpose(inverse(M))*N.
To get some intuition, consider a shear transformation: the tangent is sheared along with the surface, but the normal must tilt the opposite way to stay perpendicular, which is exactly what the inverse transpose does.
Note that this does not apply to tangent vectors.
Take a look at this tutorial:
https://paroj.github.io/gltut/Illumination/Tut09%20Normal%20Transformation.html
You can imagine that when the surface of a sphere stretches (so the sphere is scaled along one axis or something similar) the normals of that surface will all 'bend' towards each other. It turns out you need to invert the scale applied to the normals to achieve this. This is the same as transforming with the Inverse Transpose Matrix. The link above shows how to derive the inverse transpose matrix from this.
Also note that when the scale is uniform, you can simply pass the original matrix as the normal matrix. Imagine the same sphere being scaled uniformly along all axes: the surface will not stretch or bend, nor will the normals.
If the model matrix is made of translation, rotation and scale, you don't need the inverse transpose to calculate the normal matrix. Simply divide the normal by the squared per-axis scale, multiply by the model matrix, and you are done. You can extend that to any matrix with perpendicular axes; just calculate the squared scale for each axis of the matrix you are using instead.
I wrote the details in my blog: https://lxjk.github.io/2017/10/01/Stop-Using-Normal-Matrix.html
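A rough sketch of what the linked post describes, assuming Eigen and a model matrix of the form translation * rotation * scale (column-vector convention):

#include <Eigen/Dense>

// Normal transform without an inverse: divide the normal by the per-axis
// squared scale, then apply the model matrix's upper-left 3x3.
Eigen::Vector3d transformNormalNoInverse(const Eigen::Matrix4d& model,
                                         const Eigen::Vector3d& normal)
{
    const Eigen::Matrix3d M3 = model.topLeftCorner<3, 3>();

    // Squared scale of each axis = squared length of the corresponding column.
    const Eigen::Vector3d scaleSq(M3.col(0).squaredNorm(),
                                  M3.col(1).squaredNorm(),
                                  M3.col(2).squaredNorm());

    const Eigen::Vector3d n = normal.cwiseQuotient(scaleSq);
    return (M3 * n).normalized();
}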
I don't understand why you just don't zero out the 4th element of the direction vector before multiplying with the model matrix. No inverse or transpose needed. Think of the direction vector as the difference between two points. Move the two points with the rest of the model - they are still in the same relative position to the model. Take the difference between the two points to get the new direction, and the 4th element cancels out to zero. A lot cheaper.

How do you pack 3 floats (a space vector) into 4 bytes (a pixel)?

I've successfully packed floats with values in [0,1] without losing too much precision using:
byte packedVal = floatVal * 255.0f ; // [0,1] -> [0,255]
Then when I want to unpack the packedVal back into a float, I simply do
float unpacked = packedVal / 255.0f ; // [0,255] -> [0,1]
That works fine, as long as the floats are between 0 and 1.
Now here's the real deal. I'm trying to turn a 3d space vector (with 3 float components) into 4 bytes. The reason I'm doing this is because I am using a texture to store these vectors, with 1 pixel per vector. It should be something like a "normal map", (but not exactly this, you'll see why after the jump)
So there, each pixel represents a 3d space vector. Where the value is very red, the normal vector's direction is mostly +x (to the right).
So of course, normals are normalized, so they don't require a stored magnitude (scale factor). But I'm trying to store a vector with arbitrary magnitude, 1 vector per pixel.
Because textures have 4 components (rgba), I am thinking of storing the magnitude (scale factor) in the w component.
Any other suggestions for packing an arbitrary sized 3 space vector, (say with upper limit on magnitude of 200 or so on each of x,y,z), into a 4-byte pixel color value?
Storing the magnitude in the 4th component sounds very reasonable. As long as the magnitude is bounded to something reasonable and not completely arbitrary.
If you want a more flexible range of magnitudes you can pre-multiply the normalized direction vector by a value in (0.5, 1.0] when you store it, and when you unpack it multiply it by pow(2, w).
This method is used for storing high dynamic range images - RGBM encoding (M stands for magnitude). One of its drawbacks is that interpolation gives wrong results, so you can't use bilinear filtering for such a texture.
You can look at other options among HDR encodings: here is a small list of the most popular ones.
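A minimal sketch of the simple variant discussed above (direction in xyz, magnitude in w, assuming a known upper bound such as the 200 mentioned in the question):

#include <algorithm>
#include <cmath>
#include <cstdint>

// Pack a 3D vector of arbitrary length into 4 bytes: xyz hold the normalized
// direction remapped from [-1,1] to [0,255], w holds the magnitude remapped
// from [0, maxMagnitude] to [0,255].
struct PackedVec { uint8_t x, y, z, w; };

const float maxMagnitude = 200.0f;   // assumed upper bound on the length

PackedVec pack(float vx, float vy, float vz)
{
    const float len = std::sqrt(vx * vx + vy * vy + vz * vz);
    const float inv = (len > 0.0f) ? 1.0f / len : 0.0f;
    auto toByte = [](float f) {   // [-1,1] -> [0,255]
        return static_cast<uint8_t>(std::lround((f * 0.5f + 0.5f) * 255.0f));
    };
    const uint8_t m = static_cast<uint8_t>(
        std::lround(std::min(len / maxMagnitude, 1.0f) * 255.0f));
    return { toByte(vx * inv), toByte(vy * inv), toByte(vz * inv), m };
}

void unpack(const PackedVec& p, float& vx, float& vy, float& vz)
{
    auto toFloat = [](uint8_t b) { return b / 255.0f * 2.0f - 1.0f; };  // [0,255] -> [-1,1]
    const float mag = p.w / 255.0f * maxMagnitude;
    vx = toFloat(p.x) * mag;
    vy = toFloat(p.y) * mag;
    vz = toFloat(p.z) * mag;
}

Note the direction is quantized to 8 bits per axis, so the unpacked vector is only approximately the original; re-normalizing xyz before scaling by the magnitude reduces that error.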

Is bilinear filtering reversible?

When using a bilinear filter to magnify an image (by some non-integer factor), is that process lossless? That is, is there some way to calculate the original image, as long as the original resolution, the upscaled image and the exact algorithm used are known, and there is no loss in precision when upscaling (no rounding errors)?
My guess would be that it is, but that is based on some calculations on a napkin regarding the one-dimensional case only.
Taking the 1D case as a simplification. Each output point can be expressed as a linear combination of two of the input points, i.e.:
y_n = k_n * x_m + (1-k_n) * x_{m+1}
You have a whole set of these equations, which can be expressed in vector notation as:
Y = K * X
where X is a length-M vector of input points, Y is a length-N vector of output points, and K is a sparse matrix (size NxM) containing the (known) values of k.
For the interpolation to be reversible, K must be (left-)invertible, which means there must be at least M linearly-independent rows, i.e. K must have full column rank. This is true if and only if there is at least one output point in-between each pair of adjacent input points.
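As a small sanity check of that argument, here is a 1D sketch (assuming Eigen, with a hand-built interpolation matrix K) that recovers the input samples from the upscaled output with a least-squares solve:

#include <Eigen/Dense>
#include <iostream>

int main()
{
    // M = 3 input samples upscaled to N = 5 output samples.
    const Eigen::Vector3d x(1.0, 4.0, 2.0);

    // Each row is one output point as a blend of two neighbouring inputs,
    // i.e. y_n = k_n * x_m + (1 - k_n) * x_{m+1}.
    Eigen::Matrix<double, 5, 3> K;
    K << 1.0, 0.0, 0.0,
         0.5, 0.5, 0.0,
         0.0, 1.0, 0.0,
         0.0, 0.5, 0.5,
         0.0, 0.0, 1.0;

    const Eigen::Matrix<double, 5, 1> y = K * x;   // the "upscaled" signal

    // K has full column rank, so the original samples are recovered exactly
    // (up to floating-point error) via the normal equations.
    const Eigen::Vector3d recovered =
        (K.transpose() * K).ldlt().solve(K.transpose() * y);
    std::cout << recovered.transpose() << std::endl;   // prints ~ 1 4 2
    return 0;
}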

Transforming a 3D plane using a 4x4 matrix

I have a shape made out of several triangles which is positioned somewhere in world space with scale, rotate, translate. I also have a plane on which I would like to project (orthogonal) the shape.
I could multiply every vertex of every triangle in the shape with the objects transformation matrix to find out where it is located in world coordinates, and then project this point onto the plane.
But I don't need to draw the projection; instead I would like to transform the plane with the inverse transformation matrix of the shape, and then project all the vertices onto the (inverse-transformed) plane, since that only requires me to transform the plane once and not every vertex.
My plane has a normal (xyz) and a distance (d). How do I multiply it with a 4x4 transformation matrix so that it turns out ok?
Can you create a vec4 as xyzd and multiply that? Or maybe create a vector xyz1 and then what to do with d?
You need to convert your plane to a different representation. One where N is the normal, and O is any point on the plane. The normal you already know, it's your (xyz). A point on the plane is also easy, it's your normal N times your distance d.
Transform O by the 4x4 matrix in the normal way, this becomes your new O. You will need a Vector4 to multiply with a 4x4 matrix, set the W component to 1 (x, y, z, 1).
Also transform N by the 4x4 matrix, but set the W component to 0 (x, y, z, 0). Setting the W component to 0 means that your normals won't get translated. If your matrix is composed of more than just translating and rotating, then this step isn't so simple. Instead of multiplying by your transformation matrix, you have to multiply by the transpose of the inverse of the matrix, i.e. Matrix4.Transpose(Matrix4.Invert(Transform)); there's a good explanation of why here.
You now have a new normal vector N and a new position vector O. However I suppose you want it in xyzd form again? No problem. As before, xyz is your normal N all that's left is to calculate d. d is the distance of the plane from the origin, along the normal vector. Hence, it is simply the dot product of O and N.
There you have it! If you tell me what language you're doing this in, I'd happily type it up in code as well.
EDIT, In pseudocode:
The plane is vector3 xyz and number d, the matrix is a matrix4x4 M
vector4 O = (xyz * d, 1)
vector4 N = (xyz, 0)
O = M * O
N = transpose(invert(M)) * N
xyz = N.xyz
d = dot(O.xyz, N.xyz)
xyz and d represent the new plane
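The same pseudocode written out as a small sketch in C++, assuming Eigen (any 4x4 matrix type works the same way):

#include <Eigen/Dense>

// Transform a plane given as (normal xyz, distance d) by a 4x4 matrix M,
// following the pseudocode above: move a point on the plane with M, move
// the normal with the inverse transpose, then rebuild d.
Eigen::Vector4d transformPlane(const Eigen::Vector4d& plane, const Eigen::Matrix4d& M)
{
    const Eigen::Vector3d xyz = plane.head<3>();
    const double d = plane.w();

    Eigen::Vector4d O(xyz.x() * d, xyz.y() * d, xyz.z() * d, 1.0);  // point on plane, w = 1
    Eigen::Vector4d N(xyz.x(), xyz.y(), xyz.z(), 0.0);              // normal, w = 0

    O = M * O;
    N = M.inverse().transpose() * N;   // handles shear and non-uniform scale

    Eigen::Vector3d n = N.head<3>();   // if M contains scaling, normalize n here
    const double newD = n.dot(O.head<3>());
    return Eigen::Vector4d(n.x(), n.y(), n.z(), newD);
}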
This question is a bit old but I would like to correct the accepted answer.
You do not need to convert your plane representation.
Any point q = (x, y, z, 1) lies on the plane p = (a, b, c, d) if a*x + b*y + c*z + d = 0.
That condition can be written as a dot product: p·q = 0.
You are looking for the plane p' that is satisfied by the points transformed by your 4x4 matrix M, i.e. by q' = M*q.
For the same reason, you must have p'·q' = 0, i.e. transpose(p')*M*q = 0 for every q on the original plane.
But transpose(p)*q = 0 is exactly what defines the original plane, so with some rearrangement transpose(p')*M = transpose(p), hence p' = transpose(inverse(M))*p.
TLDR: if p = (a, b, c, d) is the plane, then p' = transpose(inverse(M))*p.
Notation:
n is a normal represented as a (1x3) row-vector
n' is the transformed normal of n according to transform matrix T
(n|d) is a plane represented as a (1x4) row-vector (with n the plane's normal and d the plane's distance to the origin)
(n'|d') is the transformed plane of (n|d) according to transform matrix T
T is a (4x4) (affine) column-major transformation matrix (i.e. transforming a column-vector t is defined as t' = T t).
Transforming a normal n:
n' = n adj(T)
Transforming a plane (n|d):
(n'|d') = (n|d) adj(T)
Here, adj is the adjugate of a matrix which is defined as follows in terms of the inverse and determinant of a matrix:
T^-1 = adj(T)/det(T)
Note:
The adjugate is generally not equal to the inverse of a transformation matrix T. If T includes a reflection, det(T) = -1, reversing the winding order!
Re-normalizing n' is mathematically not required (though it may be numerically, depending on the implementation) since scaling is taken care of by the determinant. Thanks to Adrian Leonhard.
You can directly transform the plane without first decomposing and recomposing a plane (normal and point).
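As a code sketch (Eigen assumed; note that Eigen's adjoint() is the conjugate transpose, not the adjugate, so the adjugate is built here from the inverse and the determinant; a dedicated implementation would compute the cofactors directly):

#include <Eigen/Dense>

// Transform a plane stored as the row vector (n | d) by the column-major
// affine matrix T, using (n'|d') = (n|d) adj(T) from the answer above.
Eigen::RowVector4d transformPlaneAdj(const Eigen::RowVector4d& plane,
                                     const Eigen::Matrix4d& T)
{
    // adj(T) = inverse(T) * det(T)
    const Eigen::Matrix4d adjT = T.inverse() * T.determinant();
    return plane * adjT;
}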

Can good type systems distinguish between matrices in different bases?

My program (Hartree-Fock/iterative SCF) has two matrices F and F' which are really the same matrix expressed in two different bases. I just lost three hours of debugging time because I accidentally used F' instead of F. In C++, the type-checker doesn't catch this kind of error because both variables are Eigen::Matrix<double, 2, 2> objects.
I was wondering, for the Haskell/ML/etc. people, whether if you were writing this program you would have constructed a type system where F and F' had different types? What would that look like? I'm basically trying to get an idea how I can outsource some logic errors onto the type checker.
Edit: The basis of a matrix is like the unit. You can say 1L or however many gallons, they both mean the same thing. Or, to give a vector example, you can say (0,1) in Cartesian coordinates or (1,pi/2) in polar. But even though the meaning is the same, the numerical values are different.
Edit: Maybe units was the wrong analogy. I'm not looking for some kind of record type where I can specify that the first field will be litres and the second gallons, but rather a way to say that this matrix as a whole, is defined in terms of some other matrix (the basis), where the basis could be any matrix of the same dimensions. E.g., the constructor would look something like mkMatrix [[1, 2], [3, 4]] [[5, 6], [7, 8]] and then adding that object to another matrix would type-check only if both objects had the same matrix as their second parameters. Does that make sense?
Edit: definition on Wikipedia, worked examples
This is entirely possible in Haskell.
Statically checked dimensions
Haskell has array libraries with statically checked dimensions, where the dimensions can be manipulated and checked at compile time, preventing indexing into the wrong dimension. Some examples:
This will only work on 2-D arrays:
multiplyMM :: Array DIM2 Double -> Array DIM2 Double -> Array DIM2 Double
An example from repa should give you a sense. Here, taking a diagonal requires a 2D array and returns a 1D array of the same element type.
diagonal :: Array DIM2 e -> Array DIM1 e
or, from Matt Sottile's repa tutorial, statically checked dimensions on a 3D matrix transform:
f :: Array DIM3 Double -> Array DIM2 Double
f u =
  let slabX = (Z:.All:.All:.(0::Int))
      slabY = (Z:.All:.All:.(1::Int))
      u'    = (slice u slabX) * (slice u slabX) +
              (slice u slabY) * (slice u slabY)
  in
    R.map sqrt u'
Statically checked units
Another example from outside of matrix programming: statically checked units of dimension, making it a type error to confuse e.g. feet and meters, without doing the conversion.
Prelude> 3 *~ foot + 1 *~ metre
1.9144 m
or for a whole suite of SI units and quantities.
E.g. can't add things of different dimension, such as volumes and lengths:
> 1 *~ centi litre + 2 *~ inch
Error:
Expected type: Unit DVolume a1
Actual type: Unit DLength a0
So, following the repa-style array dimension types, I'd suggest adding a Base phantom type parameter to your array type, and using that to distinguish between bases. In Haskell, the DIM index type argument gives the rank of the array (i.e. its shape), and you could do similarly for the basis.
Or, if by base you mean some dimension on the units, use dimensional types.
So, yep, this is almost a commodity technique in Haskell now, and there's some examples of designing with types like this to help you get started.
This is a very good question. I don't think you can encode the notion of a basis in most type systems, because essentially anything that the type checker does needs to be able to terminate, and making judgments about whether two real-valued vectors are equal is too difficult. You could have (2 v_1) + (2 v_2) or 2 (v_1 + v_2), for example. There are some languages which use dependent types [ wikipedia ], but these are relatively academic.
I think most of your debugging pain would be alleviated if you simply encoded the bases in which your matrix works along with the matrix. For example,
data Matrix = Matrix { transform :: [[Double]],
                       srcbasis :: [Double], dstbasis :: [Double] }
and then, when you multiply M (which goes from basis a to b) with N, check that N goes from b to c, and return a matrix with basis a to c.
NOTE -- it seems most people here have a programming rather than a math background, so I'll provide a short explanation here. Matrices are encodings of linear transformations between vector spaces. For example, if you're encoding a rotation by 45 degrees in R^2 (2-dimensional reals), then the standard way of encoding this in a matrix is saying that the standard basis vector e_1, written "[1, 0]", is sent to a combination of e_1 and e_2, namely [1/sqrt(2), 1/sqrt(2)]. The point is that you can encode the same rotation by saying where different vectors go; for example, you could say where you're sending [1,1] and [1,-1] instead of e_1=[1,0] and e_2=[0,1], and this would give a different matrix representation.
Edit 1
If you have a finite set of bases you are working with, you can do it...
{-# LANGUAGE EmptyDataDecls #-}
data BasisA
data BasisB
data BasisC
newtype Matrix a b = Matrix { coefficients :: [[Double]] }

multiply :: Matrix a b -> Matrix b c -> Matrix a c
multiply (Matrix a_coeff) (Matrix b_coeff) = Matrix multiplied
  where multiplied = undefined -- your algorithm here
Then, in ghci (the interactive Haskell interpreter),
*Matrix> let m = Matrix [[1, 2], [3, 4]] :: Matrix BasisA BasisB
*Matrix> m `multiply` m
<interactive>:1:13:
Couldn't match expected type `BasisB'
against inferred type `BasisA'
*Matrix> let m2 = Matrix [[1, 2], [3, 4]] :: Matrix BasisB BasisC
*Matrix> m `multiply` m2
-- works after you finish defining show and the multiplication algorithm
While I realize this does not strictly address the (clarified) question – my apologies – it seems relevant at least in relation to Don Stewart's popular answer...
I am the author of the Haskell dimensional library that Don referenced and provided examples from. I have also been writing – somewhat under the radar – an experimental rudimentary linear algebra library based on dimensional. This linear algebra library statically tracks the sizes of vectors and matrices as well as the physical dimensions ("units") of their elements on a per element basis.
This last point – tracking physical dimensions on a per element basis – is rather challenging and perhaps overkill for most uses, and one could even argue that it makes little mathematical sense to have quantities of different physical dimensions as elements in any given vector/matrix. However, some linear algebra applications of interest to me such as Kalman filtering and weighted least squares estimation typically use heterogeneous state vectors and covariance matrices.
Using a Kalman filter as an example, consider a state vector x = [d, v] which has physical dimensions [L, LT^-1]. The next (future) state vector is predicted by multiplication by the state transition matrix F, i.e.: x' = F x_. Clearly for this equation to make sense F cannot be arbitrary but must have size and physical dimensions [[1, T], [T^-1, 1]]. The predict_x' function below statically ensures that this relationship holds:
predict_x' :: (Num a, MatrixVector f x x) => Mat f a -> Vec x a -> Vec x a
predict_x' f x_ = f |*< x_
(The unsightly operator |*< denotes multiplication of a matrix on the left with a vector on the right.)
More generally, for an a priori state vector x_ of arbitrary size and with elements of arbitrary physical dimensions, passing a state transition matrix f with "incompatible" size and/or physical dimensions to predict_x' will cause a compile time error.
In F# (which originally evolved from OCaml), you can use units of measure. Andrew Kennedy, who designed the feature (and also created a very interesting theory behind it), has a great series of articles that demonstrate it.
This can quite likely be used in your scenario - although I don't fully understand the question. For example, you can declare two unit types like this:
[<Measure>] type litre
[<Measure>] type gallon
Adding litres and gallons gives you a compile time error:
1.0<litre> + 1.0<gallon> // Error!
F# doesn't automatically insert conversion between different units, but you can write a conversion function:
let toLitres gal = gal * 3.78541178<litre/gallon>
1.0<litre> + (toLitres 1.0<gallon>)
The beautiful thing about units of measure in F# is that they are automatically inferred and functions are generic. If you multiply 1.0<gallon> * 1.0<gallon>, the result is 1.0<gallon^2>.
People have used this feature for various things - ranging from converting virtual meters to screen pixels (in solar system simulations) to converting currencies (dollars in financial systems). Although I'm not an expert, it is quite likely that you could use it in some way for your problem domain too.
If it's expressed in a different basis, you can just add a template parameter to act as the basis. That will differentiate those types. A float is a float is a float: if you don't want two float values to be interchangeable just because they happen to hold the same number, then you need to tell the type system about it.
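In C++ that looks like a phantom template parameter. A minimal sketch (the tag types and the Transform wrapper below are hypothetical names; Eigen is assumed only for the stored matrix):

#include <Eigen/Dense>

// Empty tag types naming the basis a matrix is expressed in.
struct BasisA {};
struct BasisB {};
struct BasisC {};

// Thin wrapper: From/To never appear in the data; they only exist so the
// type checker can reject mixed-basis arithmetic.
template <typename From, typename To>
struct Transform {
    Eigen::Matrix2d m;
};

// Composition is only defined when the inner bases line up.
template <typename A, typename B, typename C>
Transform<A, C> operator*(const Transform<B, C>& lhs, const Transform<A, B>& rhs)
{
    return { lhs.m * rhs.m };
}

int main()
{
    Transform<BasisA, BasisB> f{ Eigen::Matrix2d::Identity() };
    Transform<BasisB, BasisC> g{ Eigen::Matrix2d::Identity() };

    auto h = g * f;      // fine: yields Transform<BasisA, BasisC>
    // auto bad = f * f; // compile-time error: the bases don't match
    (void)h;
    return 0;
}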
